Profile Geo: When You Need More Geodata In Your Twitter Data

Sometimes in the world of social data it is hard to grasp the amazing possibilities when we use words to describe things. The old adage that a picture is worth a thousand words is true, so we wanted to show you what our new Profile Geo enrichment does.

First, here is what Profile Geo is:
Gnip’s Profile Geo enrichment significantly increases the amount of usable geodata for Twitter. It normalizes the unstructured location data in Twitter users’ bio locations and matches latitude/longitude coordinates to those normalized places. For example, everyone who mentions “NYC,” “New York City,” “Manhattan,” and even some odd instances like “NYC Baby✌” gets normalized to “New York City, New York, United States” so they’re easy to map.
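To make the idea concrete, here is a minimal sketch of that kind of normalization. It is not how the enrichment is implemented; the alias table, the `normalize_location` function and the rounded coordinates are all assumptions made up for illustration.

```python
import re

# Hypothetical alias table. The real enrichment's gazetteer and matching
# logic are not described in this post.
NYC = "New York City, New York, United States"
ALIASES = {
    "nyc": NYC,
    "new york city": NYC,
    "manhattan": NYC,
    "nyc baby": NYC,
}

# Approximate coordinates for each canonical place (illustrative only).
COORDINATES = {
    NYC: (40.71, -74.01),
}


def normalize_location(raw):
    """Return (canonical_name, (lat, lon)), or None if the text is unknown."""
    # Lowercase, then turn emoji and punctuation into spaces.
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", raw.lower())
    # Collapse repeated whitespace and trim the ends.
    cleaned = " ".join(cleaned.split())
    canonical = ALIASES.get(cleaned)
    if canonical is None:
        return None
    return canonical, COORDINATES[canonical]


for bio_location in ["NYC", "New York City", "Manhattan", "NYC Baby✌"]:
    print(bio_location, "->", normalize_location(bio_location))
```

Every input in the loop resolves to the same canonical place and coordinate pair, which is what makes the Tweets easy to put on a map.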

Now, here is what Profile Geo does in practice for users interested in Twitter geodata:
Football Geo

We think this is really powerful stuff. These maps were created using 2 sets of Tweets taken over 3 Sundays where we were looking for Tweets containing the term “football.” The map for Standard Geo is made up of Tweets where users specifically geotagged their Tweet with their latitude and longitude (natively in the Twitter payload). The map for Profile Geo is made up of the additional Tweets that Gnip was able to enrich and assign to a latitude and longitude.

As you can see, the amount of location data available through Profile Geo is significantly higher than through Standard Geo. To be specific, we did our “football” search using the Decahose, a random sampling of 10% of the full Twitter firehose. Standard Geo returned just under 3,000 Tweets, while the Profile Geo search returned more than 40,000 Tweets! (Multiply those by 10 to approximate firehose volumes.) With this additional geodata the possibilities are wide open. The NFL can better understand the demographics of its audience, football clubs in the UK can see how far their reach extends, and TV networks can use this data to tailor media, among countless other uses.
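The arithmetic behind those numbers is simple enough to sketch. The counts below are the rounded figures quoted above ("just under 3,000" and "more than 40,000"), so the results are approximations, and the function names are ours.

```python
# The Decahose is described above as a 10% sample, so each Decahose Tweet
# stands in for roughly 10 firehose Tweets.
DECAHOSE_SCALE = 10


def estimate_firehose_volume(decahose_count):
    """Scale a Decahose count up to an approximate firehose count."""
    return decahose_count * DECAHOSE_SCALE


def uplift(profile_geo_count, standard_geo_count):
    """How many times more geo-located Tweets Profile Geo returned."""
    return profile_geo_count / standard_geo_count


# Rounded figures from this post, used only for illustration.
standard_geo = 3_000
profile_geo = 40_000

print(estimate_firehose_volume(standard_geo))       # 30000
print(estimate_firehose_volume(profile_geo))        # 400000
print(round(uplift(profile_geo, standard_geo), 1))  # 13.3
```

For this “football” sample, that works out to roughly 13 times as many mappable Tweets.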

If you were to remove the search for “football” and use the entire firehose of Twitter data you’d find that you can receive roughly 15 times the amount of geo-relevant data by using Gnip’s Profile Geo enrichment instead of just the geodata in the standard stream. Anyone using geodata in their social data analyses should find great value in this dramatic increase in georelevant data.

If images are better than words, then interactive maps are better than images. Here are the maps so you can play around and see the difference yourself. Zooming in will depict just how much more data is available with Profile Geo in clear detail:

Filtering for Tweets by User Bio

One of the requests we often hear from customers is that they’d like to be able to filter for Tweets from users who match a specific demographic.  I’m excited to announce the addition of a new operator to our PowerTrack suite that enables you to do exactly that.

The bio_contains operator enables you to filter for Tweets from users whose freeform Twitter bio contains a specific keyword, phrase or string.  The operator does a substring match against the user bio, much like our url_contains operator matches against the contents of the URL string.  To use the bio_contains operator, simply add a bio_contains:keyword clause to any rule.

Use Cases
One great use for this operator is to filter for Tweets based on target demographic.  For example, say you’re analyzing social media for Tide laundry detergent and want to see what moms are saying about the brand following a major marketing campaign.  Using the bio_contains operator, you could create a rule to receive Tweets from Twitter users who explicitly state in their bio that they are a mom and mentioned Tide in their Tweet.

Example:
User’s Bio: “Loving Mom, Wife and Daughter”
Tweet: “I love the new Tide!”
Rule: Tide bio_contains:mom
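If it helps to see the matching logic spelled out, here is a rough local model of that rule. It is only a sketch: `bio_contains` and `matches_rule` are our own illustrative functions, the keyword match on the Tweet body is simplified to a whole-word check, and case-insensitive matching is an assumption based on the example above (where “mom” matches “Mom”).

```python
import re


def bio_contains(bio, needle):
    """Substring match against the bio, ignoring word boundaries."""
    # Case-insensitive comparison is an assumption drawn from the example.
    return needle.lower() in bio.lower()


def matches_rule(tweet_text, user_bio, keyword, bio_keyword):
    """Model a rule of the form `<keyword> bio_contains:<bio_keyword>`."""
    # Simplification: the Tweet keyword must appear as a whole word.
    words = re.findall(r"\w+", tweet_text.lower())
    return keyword.lower() in words and bio_contains(user_bio, bio_keyword)


print(matches_rule(
    tweet_text="I love the new Tide!",
    user_bio="Loving Mom, Wife and Daughter",
    keyword="Tide",
    bio_keyword="mom",
))  # True
```

Both conditions have to hold, so a Tweet about Tide from a user whose bio never mentions “mom” would not be delivered.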

Another use would be to see all Tweets from a competitor’s employees in hopes of gaining some competitive intelligence.  In this use case, I might want to receive ALL tweets from users whose bio mentions ABC Corp.

Example:
User’s Bio: “Product Manager at ABC Corp”
Rule: bio_contains:"ABC Corp"

These are only a few of the possible use cases and we’re sure our customers have many others that would put these to shame.  We’d love to hear about them!

Important Details
The operator has a few intricacies that are important to be aware of.

  • Unless the bio_contains operator is combined with additional clauses and operators in a rule, the bio_contains operator will match EVERY tweet from a user whose bio contains the keyword or phrase.  Depending on the keyword or phrase, this could result in receiving A LOT of Tweets.
  • All keywords or phrases containing spaces or punctuation should be surrounded by quotes.
  • The operator performs a substring match against a user’s bio and ignores word boundaries.  As a result, if your keyword or phrase is part of another word or phrase, it will be considered a match.  For example, a keyword of “pants” would match a bio containing a term like “#TeamSpongeBobSquarePants”.  Should this be an issue, we would recommend one of two solutions:
  1. Add a negation to exclude the matches you don’t want
    e.g. bio_contains:pants -bio_contains:"#TeamSpongeBobSquarePants"
  2. Quote common word boundaries in conjunction with the OR operator
    e.g. bio_contains:" pants " OR bio_contains:"pants/" OR bio_contains:" pants."
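Here is a small sketch of why the substring behavior matters and how the two workarounds change the outcome. The functions are a local model we wrote for illustration, not PowerTrack itself, and case-insensitive matching is an assumption.

```python
def bio_contains(bio, needle):
    """Substring match that ignores word boundaries (case-insensitive by assumption)."""
    return needle.lower() in bio.lower()


def matches_with_negation(bio):
    """Solution 1: match 'pants' but exclude the unwanted hashtag."""
    return bio_contains(bio, "pants") and not bio_contains(
        bio, "#TeamSpongeBobSquarePants"
    )


def matches_with_boundaries(bio):
    """Solution 2: OR together variants that quote common word boundaries."""
    variants = [" pants ", "pants/", " pants."]
    return any(bio_contains(bio, variant) for variant in variants)


hashtag_bio = "Proud member of #TeamSpongeBobSquarePants"
retail_bio = "I sell pants. And shirts."

print(bio_contains(hashtag_bio, "pants"))    # True: matched inside the hashtag
print(matches_with_negation(hashtag_bio))    # False
print(matches_with_boundaries(hashtag_bio))  # False
print(matches_with_boundaries(retail_bio))   # True
```

Keep in mind that the boundary approach only catches the variants you list. A bio that begins with “Pants” has no leading space, for instance, so it would slip past all three variants above.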

As with most of our work, this new operator started with customer requests.  Thanks for the product feedback and keep it coming.  Additional documentation of this new operator and others can be found in our online documentation. If you’re interested in learning more about how to filter Twitter by bio, please contact sales@gnip.com.

Tumblr Firehose Now Available Exclusively from Gnip

I’m thrilled to announce that the full firehose of public Tumblr posts is now available exclusively from Gnip. Tumblr is one of the fastest growing social networks in the world. Much of this growth is fueled by the enormous number of conversations that are unique to the Tumblr community. These conversations cover a huge range of subjects, from movies, TV shows and fashion to business, apparel and consumer products. Check out these stats to get a feel for the volume of discussion on Tumblr:

  • 50 million new posts every day
  • 15 billion page views every month
  • 20 billion total posts
  • 300% traffic growth last year

While some social platforms react quickly to news and other events, Tumblr conversations often spread around concepts and trends. Take the example of Urban Outfitters where a photographer posted a picture to her personal Tumblr of a piece from one of their new collections. That post received over 1,000 notes and almost no mention elsewhere. In the case of Land Rover, the company posted a picture of a dog riding in a Land Rover to their Tumblr that received more than 5,000 notes and very little mention on other networks.

It doesn’t take a large leap to see the impact this type of information can have on brand management and product development. The conversations on Tumblr are rich in images and discussion about brands and products, from simply sharing a picture of a favorite pair of shoes to reblogging news about a favorite brand. And given the highly social nature of the Tumblr community, these discussions move quickly and broadly through the community. You often see posts that are shared tens of thousands of times. For brands, every conversation matters and access to the full firehose ensures they won’t miss a thing.

We’re excited to be able to offer Tumblr to our customers and can’t wait to see what other intriguing use cases they find for this data.

Drop us a line at sales@gnip.com to learn more.

Enhanced Filtering for PowerTrack

Gnip is always looking for ways to improve its filtering capabilities and customer feedback plays a huge role in these efforts.  We are excited to announce enhancements to our PowerTrack product that allow for more precise filtering of the Twitter Firehose, a feature enhancement request that came directly from you, our customers.

Gnip PowerTrack rules now support OR and Grouping using ().  We have also loosened limitations on the number of characters and the number of clauses per rule. Specifically, a single rule can now include up to 10 positive clauses and up to 50 negative clauses (previously 10 total clauses).  Additionally, the character limit per rule has grown from 255 characters to 1024.

With these changes, we are now able to offer our customers a much more robust and precise filtering language to ensure you receive the Tweets that matter most to you and your business.  However, these improvements bring their own set of specific constraints that are important to be aware of.  Examples and details on these limitations are as follows:

OR and Grouping Examples

  • apple OR microsoft
  • apple (iphone OR ipad)
  • apple computer -(fruit OR green)
  • (apple OR mac) (computer OR monitor) new -fruit
  • (apple OR android) (ipad OR tablet) -(fruit green microsoft)

Character Limitations

  • A single rule may contain up to 1024 characters including operators and spaces.

Limitations

  • A single rule must contain at least 1 positive clause
  • A single rule supports a max of 10 positive clauses throughout the rule
  • A single rule supports a max of 50 negative clauses throughout the rule
  • Negated ORs are not allowed; that is, a negated clause or group cannot be one side of an OR. The following are examples of invalid rules:
    • -iphone OR ipad
    • ipad OR -(iphone OR ipod)
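If you build rules programmatically, a pre-flight check along these lines can catch limit violations before you submit a rule. This is only a sketch under our own assumptions: the tokenizer is simplified, every term inside a negated group is counted as a negative clause (the post does not say how groups are counted), and the negated-OR restriction is not checked at all.

```python
import re

MAX_CHARS = 1024
MAX_POSITIVE = 10
MAX_NEGATIVE = 50

# Simplified tokenizer: group openers "(" / "-(", closers ")", terms that
# carry a quoted phrase (e.g. bio_contains:"ABC Corp"), and bare terms.
TOKEN = re.compile(r'-?\(|\)|[^\s()"]*"[^"]*"|[^\s()]+')


def count_clauses(rule):
    """Return (positive, negative) clause counts under the assumptions above."""
    positive = negative = 0
    negated_groups = []  # one flag per open group: is it inside a negation?
    for token in TOKEN.findall(rule):
        inside_negation = bool(negated_groups) and negated_groups[-1]
        if token in ("(", "-("):
            negated_groups.append(inside_negation or token == "-(")
        elif token == ")":
            if negated_groups:
                negated_groups.pop()
        elif token == "OR":
            continue
        elif inside_negation or token.startswith("-"):
            negative += 1
        else:
            positive += 1
    return positive, negative


def validate_rule(rule):
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    if len(rule) > MAX_CHARS:
        problems.append(f"rule is {len(rule)} characters; max is {MAX_CHARS}")
    positive, negative = count_clauses(rule)
    if positive < 1:
        problems.append("rule needs at least 1 positive clause")
    if positive > MAX_POSITIVE:
        problems.append(f"{positive} positive clauses; max is {MAX_POSITIVE}")
    if negative > MAX_NEGATIVE:
        problems.append(f"{negative} negative clauses; max is {MAX_NEGATIVE}")
    return problems


print(validate_rule("(apple OR android) (ipad OR tablet) -(fruit green microsoft)"))
print(validate_rule("-fruit"))
```

The first rule passes with 4 positive and 3 negative clauses, while the second is flagged because it has no positive clause.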

Precedence

  • An implied “AND” takes precedence in rule evaluation over an OR

For example a rule of:

  • android OR iphone ipad would be evaluated as android OR (iphone ipad)
  • ipad iphone OR android would be evaluated as (ipad iphone) OR android
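One way to sanity-check how a rule will be read is to make the implied grouping explicit. The helper below is a sketch of that idea only; `group_by_precedence` is our own illustrative function, and it assumes a flat rule with no parentheses, quoted phrases or negations.

```python
def group_by_precedence(rule):
    """Rewrite a flat rule with explicit parentheses around implied-AND runs."""
    operands = []  # each OR operand is a list of terms joined by implied AND
    current = []
    for token in rule.split():
        if token == "OR":
            operands.append(current)
            current = []
        else:
            current.append(token)
    operands.append(current)

    rendered = []
    for terms in operands:
        text = " ".join(terms)
        # Only runs of two or more terms need parentheses.
        rendered.append(f"({text})" if len(terms) > 1 else text)
    return " OR ".join(rendered)


print(group_by_precedence("android OR iphone ipad"))  # android OR (iphone ipad)
print(group_by_precedence("ipad iphone OR android"))  # (ipad iphone) OR android
```

If the grouping it prints is not what you intended, add your own parentheses to the rule, for example (android OR iphone) ipad.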

You can find full details of the Gnip PowerTrack filtering changes in our online documentation.

Know of another way we can improve our filtering to meet your needs?  Let us know in the comments below.

Gnip and Automattic Make Whole New Universe of Data Available

“This new data from Automattic is a big addition and a testament to Gnip’s commitment to drive the social data economy forward. This is an important source to add to the social data mix, one that we know our customers will take full advantage of.”

- Rob Begg, VP Marketing of Radian6

As social media data becomes more and more important across a range of businesses, our customers are asking for access to more data sources to give them a more complete picture of the social media conversations that are relevant to their businesses.

Today, we’re excited to announce a major addition to our coverage of the conversations taking place on blogs around the world. We’re expanding our relationship with Automattic to make a whole new universe of blog and comment data available to the market for the first time anywhere.

For those who don’t know, Automattic is a network of web services including WordPress.com, VIP hosting and support, Polldaddy, IntenseDebate, and Jetpack. We’ve been delivering data from WordPress.com and IntenseDebate for about a year and a half and found that while our customers loved their data, they always wanted more.

As of today, we are offering the full firehose of blog posts and comments from Jetpack-powered WordPress.org sites, as well as engagement streams of “likes” from WordPress.com and IntenseDebate. The new data from WordPress.org greatly increases the coverage available to those who are looking to do deep analysis of blog posts and comments. The new engagement streams enable companies to pull in reaction data to quickly understand sentiment, relevance and resonance. With this they can gauge the intensity of opinion around fast-moving blog and comment conversations, helping prioritize critical response.

As full firehoses, all of the streams from Automattic ensure 100% coverage in realtime, giving customers the peace of mind that they can keep up with the entire discussion on fast-moving threads.

The scope of coverage offered by Automattic is pretty incredible.  Check out some of these stats:

We’re thrilled to be able to offer these new data streams to our customers and can’t wait to see the amazing things they’ll be able to do with them.

Updated: Coverage in GigaOM – Gnip and WordPress deepen ties, expand data partnership

Launching Gnip MarketStream & Partnership with StockTwits

While the market has been on its roller coaster ride across the past month, Gnip has kept its collective head down and stayed busy on behalf of our Investment Management clients (hedge funds, HFTs, asset managers, etc.). That hard work has paid off and we have two exciting announcements to make today.

  • Launch of Gnip MarketStream: Our hedge fund clients have been quite vocal in their desire for a package incorporating the most relevant social data streams into a single low-latency, high-volume solution. We’re proud to answer their needs with the launch of Gnip MarketStream, a realtime data solution that packages the incredibly rich and broad “voice of the market” Twitter stream with the uniquely deep and targeted “voice of the trader” StockTwits stream.
  • Premium Partnership with StockTwits: An integral component of the Gnip MarketStream is StockTwits social media data. We’re thrilled to announce this partnership with StockTwits, the leading realtime financial platform for the investment community and creator of the $(TICKER) tag. The StockTwits stream is a curated, defined-demographic, realtime social data stream focused on investment decisions and analysis. Gnip now provides streaming access to the full StockTwits firehose of social data, and offers access to historical content as far back as 2009.

While the investment community has used social media data in news analysis and equity research, the primary adoption of this data across the last six months has been as a trading indicator. By combining the strengths of both the Twitter stream and the StockTwits stream, Gnip MarketStream provides investment professionals unparalleled access to relevant social data at a time when social media has become an increasingly vital channel for news and market sentiment.

For more information about Gnip MarketStream or StockTwits data, contact trading@gnip.com.

Google+ Now Available from Gnip

Gnip is excited to announce the addition of Google+ to its repertoire of social data sources. Built on top of the Google+ Search API, Gnip’s stream allows its customers to consume realtime social media data from Google’s fast-growing social networking service. Using Gnip’s stream, customers can poll Google+ for public posts and comments matching the terms and phrases relevant to their business and client needs.

Google+ is an emerging player in the social networking space that pairs well with the Twitter, Facebook, and other microblog content currently offered by Gnip. If you are looking for volume, Google+ quickly became the third largest social networking platform within a week of its public launch and some are projecting it to emerge as the world’s second largest social network within the next twelve months. Looking to consume content from social network influencers? Google+ is where they are! (Even former Facebook President Sean Parker says so.)

When you work with Gnip, along with a stream of Google+ data (and an abundance of other social data sources), you’ll have access to a normalized data format, unwound URLs, and data deduplication. Existing Gnip customers can seamlessly add Google+ to their Gnip Data Collectors (all you need is a Google API Key). New to Gnip? Let us help you design the right solution for your social data needs: contact sales@gnip.com.

Incredible Innovation in Boulder Valley – Results from the IQ Awards

One of the reasons that we at Gnip love working in the Boulder Valley is the incredible and talented companies that make up the local business ecosystem. Given the depth and quality of innovative organizations that make Boulder their home, we’re extremely excited and very honored to announce today that we’ve won the Boulder County Business Report (BCBR) Innovative Quotient (IQ) Award for Social Media/Mobile Applications.

Presented by the BCBR, the IQ Awards is an annual event that honors the most innovative new products and services developed by companies and organizations, with a special emphasis on advanced technologies, innovations within a particular business sector and sustainable business practices.

Congratulations to all of last night’s winners, with a big shout-out to our fellow Foundry family member Standing Cloud, which won the award in the Internet/Web category. Below is a list of companies that were recognized and their respective categories:

     

  • Green/Sustainability: OPX Biotechnologies Inc.
  • Social Media/Mobile Applications: Gnip Inc.
  • Nonprofits: Safehouse Progressive Alliance for Nonviolence
  • Software: Accurence Inc.
  • Natural Products: Cooper Tea Co. and Third Street Chai
  • Sports & Outdoors: Crescent Moon Snowshoes
  • Consumer Products & Services: Agloves
  • Internet/Web: Standing Cloud Inc.
  • Innovation Accelerator: Boulder Innovation Center and Longmont Entrepreneurial Network
  • Business Products & Services: Radish Systems LLC



Thank you to the Boulder County Business Report for recognizing the amazing innovation that exists in our community and congrats again to all of our fellow winners! Keep the innovation flowing, Boulder.

For more info, check out our press release.

 

Customer Spotlight – Klout

Providing Klout Scores, a measurement of a user’s overall online influence, for every individual in the ever-growing base of Twitter users was the task at hand for Matthew Thomson, VP of Platform at Klout. With massive amounts of data flowing in by the second, Thomson and Klout’s scientists and engineers needed a fast and reliable solution for processing and filtering the Twitter Firehose and eliminating data that was unnecessary for calculating and assigning Twitter users’ Klout Scores.

“Not only has Gnip helped us triple our API volume in less than one month but they provided us with a trusted social media data delivery platform necessary for efficiently scaling our offerings and keeping up with the ever-increasing volume of Twitter users.”

- Matthew Thomson
VP of Platform, Klout

By selecting Gnip as their trusted premium Twitter data delivery partner, Klout tripled their API volume and increased their ability to provide influence scores by 50 percent among Twitter users in less than one month.

For the full details, read the success story here.

Customer Spotlight – MutualMind

 
For many startups seeking to enter and capitalize on the rising social media marketplace, timing is everything. MutualMind was no exception: getting their enterprise social media management product to market in a timely manner was crucial to the success of their business. MutualMind provides an enterprise social media intelligence and management system that monitors, analyzes, and promotes brands on social networks and helps increase social media ROI. The platform enables customers to listen to discussion on the social web, gauge sentiment, track competitors, identify and engage with influencers, and use resulting insights to improve their overall brand strategy.

“Through their social media API, Gnip helped us push our product to market six months ahead of schedule, enabling us to capitalize on the social media intelligence space. This allowed MutualMind to focus on the core value it adds by providing advanced analytics, seamless engagement, and enterprise-grade social management capabilities.”

- Babar Bhatti
CEO, MutualMind

By selecting Gnip as their data delivery partner, MutualMind was able to get their product to market six months ahead of schedule. Today, MutualMind processes tens of millions of data activities per month using multiple sources from Gnip including premium Twitter data, YouTube, Flickr, and more.
 
For the full details, read the success story here.