Filtering for Tweets by User Bio

One of the requests we often hear from customers is that they’d like to be able to filter for Tweets from users who match a specific demographic.  I’m excited to announce the addition of a new operator to our PowerTrack suite that enables you to do exactly that.

The bio_contains operator enables you to filter for Tweets from users whose freeform Twitter bio contains a specific keyword, phrase or string.  The operator does a substring match against the user bio, much like our url_contains operator matches against the contents of the URL string.  To use the bio_contains operator, simply add a bio_contains:keyword clause to any rule.

Use Cases
One great use for this operator is to filter for Tweets based on target demographic.  For example, say you’re analyzing social media for Tide laundry detergent and want to see what moms are saying about the brand following a major marketing campaign.  Using the bio_contains operator, you could create a rule to receive Tweets from Twitter users who explicitly state in their bio that they are a mom and who mention Tide in their Tweet.

Example:
User’s Bio: “Loving Mom, Wife and Daughter”
Tweet: “I love the new Tide!”
Rule: Tide bio_contains:mom

Another use would be to see all Tweets from a competitor’s employees in hopes of gaining some competitive intelligence.  In this use case, I might want to receive ALL Tweets from users whose bio mentions ABC Corp.

Example:
User’s Bio: “Product Manager at ABC Corp”
Rule: bio_contains:"ABC Corp"
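
A phrase that contains a space, such as ABC Corp, has to be quoted in the rule.  The Python sketch below shows one way to build bio_contains clauses programmatically.  The helper name bio_contains_clause is hypothetical and not part of any Gnip library, and the sketch assumes that a value made only of letters and digits is safe to leave unquoted.

```python
def bio_contains_clause(value: str) -> str:
    """Build a bio_contains clause, quoting the value when needed.

    Hypothetical helper for illustration only. Assumption: a value made
    solely of letters and digits can stay unquoted; anything else
    (spaces, punctuation) is wrapped in straight double quotes.
    """
    if not value:
        raise ValueError("bio_contains needs a non-empty value")
    if '"' in value:
        # Escaping of embedded quotes is deliberately left out of this sketch.
        raise ValueError("values containing double quotes are not supported")
    if value.isalnum():
        return f"bio_contains:{value}"
    return f'bio_contains:"{value}"'


print("Tide " + bio_contains_clause("mom"))
print(bio_contains_clause("ABC Corp"))
```

The two print calls reproduce the rules from the examples above.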

These are only a few of the possible use cases and we’re sure our customers have many others that would put these to shame.  We’d love to hear about them!

Important Details
The operator has a few intricacies that are important to be aware of.

  • Unless the bio_contains operator is combined with additional clauses and operators in a rule, it will match EVERY Tweet from a user whose bio contains the keyword or phrase.  Depending on the keyword or phrase, this could result in receiving A LOT of Tweets.
  • All keywords or phrases containing spaces or punctuation should be surrounded by quotes.
  • The operator performs a substring match against a user’s bio and ignores word boundaries.  As a result, if your keyword or phrase is part of another word or phrase, it will be considered a match.  For example, a keyword of “pants” would match a bio containing a term like “#TeamSpongeBobSquarePants”.  Should this be an issue, we would recommend one of two solutions:
  1. Add a negation to exclude the matches you don’t want
    e.g. bio_contains:pants -bio_contains:"#TeamSpongeBobSquarePants"
  2. Quote common word boundaries in conjunction with the OR operator
    e.g. bio_contains:" pants " OR bio_contains:"pants/" OR bio_contains:" pants."
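
To see how the substring behavior and the two workarounds play out, here is a small local approximation in Python.  It is not Gnip’s implementation: the function bio_contains and the sample bios are made up for illustration, and the sketch assumes matching ignores case, as the “mom” and “pants” examples in this post suggest.

```python
def bio_contains(bio: str, needle: str) -> bool:
    """Local approximation of bio_contains: a plain substring test.

    Assumption: matching ignores case. Word boundaries are not
    considered at all, which is the behavior described above.
    """
    return needle.lower() in bio.lower()


# Made-up bios for illustration.
bios = [
    "Proud member of #TeamSpongeBobSquarePants",
    "I sell pants and shirts.",
    "Designer of yoga pants/leggings",
]

for bio in bios:
    plain = bio_contains(bio, "pants")
    # Workaround 1: negate the unwanted match.
    negated = plain and not bio_contains(bio, "#TeamSpongeBobSquarePants")
    # Workaround 2: OR together quoted word boundaries.
    bounded = any(
        bio_contains(bio, variant)
        for variant in (" pants ", "pants/", " pants.")
    )
    print(f"{bio!r}: plain={plain} negated={negated} bounded={bounded}")
```

In this sketch the plain keyword matches all three bios, while both workarounds drop the #TeamSpongeBobSquarePants bio and keep the other two.  Note that a bio ending in the word pants with nothing after it would match none of the three boundary variants, so the list of variants may need extending for your data.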

As with most of our work, this new operator started with customer requests.  Thanks for the product feedback and keep it coming.  Additional documentation of this new operator and others can be found in our online documentation. If you’re interested in learning more about how to filter Twitter by bio, please contact sales@gnip.com.

For the Times When Every Tweet is Too Many

Our customers tell us that getting every single Tweet that matters is one of the key reasons they work with Gnip. And sometimes getting every Tweet that matters means filtering out the Tweets you don’t want. With this in mind, I’m happy to announce the introduction of two new operators to our PowerTrack filtering suite.

Retweet Operator

The Retweet operator allows a customer to either receive only the Retweets that match a rule or exclude Retweets from the matches altogether.

To use the Retweet operator, simply add is:retweet or -is:retweet to any rule.

Examples Include:

  • Receive only Retweets mentioning Apple using a rule like: apple is:retweet as a way to measure engagement of the brand’s fan base

or

  • Get only Tweets with unique content about Apple using a rule like: apple -is:retweet to monitor conversation about the brand and ignore the tremendous volume of retweets generated by the brand
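
If you generate rules in code, the two variants above can be produced from a single flag.  The following Python sketch is illustrative only: retweet_filter is a hypothetical helper, not part of any Gnip library, and it is meant for simple rules such as a single keyword.

```python
def retweet_filter(rule: str, only_retweets: bool) -> str:
    """Append is:retweet or its negation to a simple rule.

    Hypothetical helper for illustration only.
    """
    clause = "is:retweet" if only_retweets else "-is:retweet"
    return f"{rule} {clause}"


print(retweet_filter("apple", only_retweets=True))
print(retweet_filter("apple", only_retweets=False))
```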

Sampling Operator

The Sampling operator allows a customer to receive a random sample of Tweets that match a rule rather than the entire set of Tweets.

There are several use cases where the Sampling operator is useful.  Say you want to stay within a budgeted number of Tweets each month, but you’re trending higher than that budget halfway through the month.  With the Sampling operator, you can scale back your consumption without fully eliminating rules.  In another use case you might want to monitor a very high-volume rule or user, but your internal systems can’t handle this volume.  Sampling makes this more manageable.  Finally, there are times when you simply need to know the directional volume of a topic, and don’t need every Tweet.

To use the Sampling operator, add sample:## to any rule, where ## is an integer value between 1 and 100. The Sampling operator applies to the entire rule and requires that any OR’d terms be grouped in parentheses.

Examples Include:

  • Receive a sampling of 10% of all Tweets that contain “apple” using a rule like:

apple sample:10

or

  • Receive a sampling of 50% of all Tweets that contain “iPad” or “iPhone” using a rule like:

(ipad OR iphone) sample:50
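
For rules built in code, the range check and the grouping requirement can both be enforced before a rule is submitted.  The Python sketch below is one possible approach; sampled_rule is a hypothetical helper and not part of any Gnip library.

```python
from typing import Sequence


def sampled_rule(terms: Sequence[str], percent: int) -> str:
    """Build a rule that samples `percent` percent of matching Tweets.

    Hypothetical helper for illustration only. Multiple terms are OR'd
    and grouped in parentheses so that sample: applies to the whole rule.
    """
    if not terms:
        raise ValueError("at least one term is required")
    is_int = isinstance(percent, int) and not isinstance(percent, bool)
    if not is_int or not 1 <= percent <= 100:
        raise ValueError("percent must be an integer between 1 and 100")
    if len(terms) == 1:
        body = terms[0]
    else:
        body = "(" + " OR ".join(terms) + ")"
    return f"{body} sample:{percent}"


print(sampled_rule(["apple"], 10))
print(sampled_rule(["ipad", "iphone"], 50))
```

The two print calls reproduce the example rules above; a value such as 0 or 150 raises a ValueError instead of producing an invalid rule.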

As always, thank you for the product feedback and keep it coming.  Additional documentation of these new operators and others can be found in our online documentation.