Working Directly With the Twitter Data Ecosystem

One of the reasons Twitter acquired Gnip is that Twitter believes the best way to support the distribution of Twitter data is to have direct data relationships with its data customers – the companies building analytic solutions using Twitter’s data and platform. Direct relationships help Twitter develop a deeper understanding of customer needs, get direct feedback for the product roadmap, and work more closely with data customers to enable the best possible solutions for the brands that rely on Twitter data to make better decisions. At Twitter’s Analyst Day last November, Twitter’s VP of Data Strategy noted that when Twitter acquired Gnip, Gnip had the clear majority of Twitter’s data reseller business – the rest was held by the other two data resellers, DataSift and NTT Data. The acquisition of Gnip was the first step toward developing more direct relationships with data customers.

The next step in working directly with data customers is to transition everyone receiving raw data for commercial use from other data resellers to a direct relationship with Twitter. Twitter started this transition process immediately after acquiring Gnip last May, and we expect to finish the transition by the middle of August this year.

After that transition is completed, companies using raw Twitter data for commercial use – to build products, to analyze internally, and to serve other commercial purposes – will need to have a direct relationship with Twitter. For current Twitter partners and customers, it’s business as usual – they will continue to consume the same data they currently do from Twitter’s APIs. For customers who are still working on transitioning, that process will simply require you to begin consuming data via a relationship with Twitter instead of a reseller.

If you’re one of the companies still working on the transition from a data reseller to Twitter’s Commercial APIs through the Gnip product suite or to the Public API, we want to make sure you have the resources you need to successfully complete your transition over the next four months. Here are several channels you can turn to for help.

  1. Technical docs: You can find detailed technical documentation for all of our commercial products here.
  2. Technical webinars: We’ll be hosting technical webinars – covering the topics below – to help you transition to Twitter’s suite of APIs.
    1. Overview of Twitter’s commercial data platform
    2. Using Twitter’s real-time data products
    3. Using Twitter’s historical data products
    4. Tips and tricks to filter for the data you need
    5. Using Twitter’s Public APIs
  3. Office hours: We’ll hold weekly office hours with our product and support teams to answer specific customer questions.
  4. Transition team: We have a dedicated transition team available to answer your questions at

We’ll announce the dates and times for the webinars and office hours on this blog, so keep an eye on this space and follow @Gnip for the latest updates.

Twitter, you've come a long way baby… #8years

Like a child’s first steps or your first experiment with Pop Rocks candy, the first ever Tweet went down in the Internet history books eight years ago today. On March 21st, 2006, Jack Dorsey, co-founder of Twitter, published this.


Twttr (the service’s first name) was launched to the public on July 15, 2006, when it was recognized for “good execution on a simple but viral idea.” Eight years later, that seems to have held true.

It has become the digital watering hole, the newsroom, the customer service do’s and don’ts, a place to store your witty jargon that would just be weird to say openly at your desk. And then there is that overly happy person you thought couldn’t actually exist, standing in front of you in line, and you just favorited their selfie #blessed. Well, this is awkward.

Just eight months after its release, Twitter made a sweeping entrance into SXSW 2007, sparking the platform’s usage to balloon from 20,000 to 60,000 Tweets per day. Thus began the era of our public everyday lives being archived in 140-character tidbits. The manual “RT” turned into a click of a button, and favorites became the digital head nod. I see you.

In April 2009, Twitter launched the Trending Topics sidebar, identifying popular current world events and modish hashtags. Verified accounts became available that summer; athletes, actors, and icons alike began to display the “verified account” tag on their Twitter pages. This increasingly became a necessity in recognizing the real Miley Cyrus vs. Justin Bieber. If differences do exist.

The Twitter Firehose launched in March 2010. Giving Gnip access opened a new door into the social data industry, and come November, filtered access to social data was born. Twitter turned to Gnip to be its first partner serving the commercial market. By offering complete access to the full firehose of publicly available Tweets under enterprise terms, this partnership enabled companies to build more advanced analytics solutions with the knowledge that they would have ongoing access to the underlying data. This was a key inflection point in the growth of the social data ecosystem. By April, Gnip played a key role in delivering past and future Twitter data to the Library of Congress for historic preservation in the archives.

On July 31, 2010, Twitter hit its 20 billionth Tweet milestone, or as we like to call it, twilestone. It is the platform of hashtags and Retweets, celebrities and nobodies, at-replies, political rants, entertainment 411 and “pics or it didn’t happen.” By June 1st, 2011, Twitter allowed just that as it broke into the photo sharing space, allowing users to upload their photos straight to their personal handle.

One of the most highly requested features was the ability to get historical Tweets. In March 2012, Gnip delivered just that by making every public Tweet available, starting with the very first one, posted on March 21, 2006 by Mr. Dorsey himself.

Fast forward 8 years, and Twitter is reporting over 500 million Tweets per day. That’s more than a 25,000-fold increase in Tweets per day in just 8 years! With over 2 billion accounts, over a quarter of the world’s population, Twitter ranks high among the top websites visited every day. Here’s to the times when we write our Twitter handle on our conference name tags instead of our birth names, and prefer to be tweeted at than texted. Voicemails? Ain’t nobody got time for that.

Twitter launched a special surprise for its 8th birthday. Want to check out your first tweet?

“There’s a #FirstTweet for everything.” Happy Anniversary!


See more memorable Twitter milestones

100 Billion Social Data Activities Delivered Each Month

Gnip has hit another big milestone: we’re now delivering 100 billion social data activities each month. By comparison, we were delivering 30 billion social data activities back in November. We’ve more than tripled the data delivered in a handful of short months.

What is the cause for all of this growth? Three reasons:

  1. Enterprise providers continue to rapidly adopt social data into their offerings. As such, our growth rate for new customers continues to accelerate.
  2. Companies are expanding their insight and analysis offerings over a broader spectrum of social conversation. We’ve added three premium full firehoses of data this year including Tumblr, WordPress and Disqus, as well as other sought after data sources such as Sina Weibo.
  3. The number of supported use cases for social data continues to expand beyond traditional brand monitoring. We see the use cases for social data evolving all of the time and have seen a substantial uptick in social data being used in finance and business intelligence specifically.

If you’re interested in working at a company with Big Software, Big Jobs and Big Impact, contact And if you want to talk about how your company can use social data, contact


Big Boulder Speakers Using Social Data in Innovative Ways

Big Boulder is next week and we’re excited to add four new speakers who are using social data in amazing ways, from disaster response and epidemic tracking to predicting the stock market and monitoring political developments.

If you want to follow the conversation about Big Boulder, be sure to follow the hashtag #BigBoulder, read the Gnip blog for live blogging, and look for pictures from the conference on our Facebook page.

Big Boulder is Definitely Going to be Big

Big Boulder is two weeks away and everything is really coming together beautifully. The world’s first conference on social data already has top-notch speakers such as Ryan Sarver of Twitter, Joe Fernandez of Klout and Sean Bruich of Facebook. But…we’re not done yet! Today we’re excited to announce ten new speakers who are leading the world in social data innovation.

We’re excited to announce:

Our conference hotel, the St Julien, is completely booked for Thursday night, but we’ve made accommodations at other nearby hotels. See our venue page for more details!


Gnip Named Top 50 Colorado Company To Watch

It is always nice to get recognition in your own backyard and Gnip is excited and humbled to be named a Top 50 Colorado Company to Watch.  The past four years have been an amazing journey and while we’re particularly excited by all we’ve recently accomplished, we’re even more excited for what’s ahead. We believe social data has unlimited potential and we are excited to be driving the adoption of this data across the world from our home in Boulder, Colorado.

One of the criteria for being selected for this award is that your company is creating quality jobs. If you’re interested in working on challenging problems at a company that was named the best place to work in Boulder, we’d love for you to join us.

New Big Boulder Speakers Announced

Big Boulder is just over a month away, and we’re excited to add seven incredible new speakers to the Big Boulder agenda. When we started planning the first social data conference, we wanted to put together a world class speaker list. We’ve been thrilled by the response and are excited to add speakers from companies such as Tumblr and Get Satisfaction. We’re also working on some really interesting panels so keep your eye out for more to come!

Below is a list of our latest additions, and you can also see the complete list of speakers.

If you’d like to attend, but aren’t a Gnip customer, we’re looking for volunteers to help with photography and live blogging.

Tumblr Firehose Now Available Exclusively from Gnip

I’m thrilled to announce that the full firehose of public Tumblr posts is now available exclusively from Gnip. Tumblr is one of the fastest growing social networks in the world. Much of this growth is fueled by the enormous number of conversations that are unique to the Tumblr community. These conversations cover a huge range of subjects, from movies, TV shows and fashion to business, apparel and consumer products. Check out these stats to get a feel for the volume of discussion on Tumblr:

  • 50 million new posts every day
  • 15 billion page views every month
  • 20 billion total posts
  • 300% traffic growth last year

While some social platforms react quickly to news and other events, Tumblr conversations often spread around concepts and trends. Take the example of Urban Outfitters where a photographer posted a picture to her personal Tumblr of a piece from one of their new collections. That post received over 1,000 notes and almost no mention elsewhere. In the case of Land Rover, the company posted a picture of a dog riding in a Land Rover to their Tumblr that received more than 5,000 notes and very little mention on other networks.

It doesn’t take a large leap to see the impact this type of information can have on brand management and product development. The conversations on Tumblr are rich in images and discussion about brands and products, from simply sharing a picture of a favorite pair of shoes to reblogging news about a favorite brand. And given the highly social nature of the Tumblr community, these discussions move quickly and broadly through the community. You often see posts that are shared tens of thousands of times. For brands, every conversation matters and access to the full firehose ensures they won’t miss a thing.

We’re excited to be able to offer Tumblr to our customers and can’t wait to see what other intriguing use cases they find for this data.

Drop us a line at to learn more.

For the Times When Every Tweet is Too Many

Our customers tell us that getting every single Tweet that matters is one of the key reasons they work with Gnip. And sometimes getting every Tweet that matters means filtering out the Tweets you don’t want. With this in mind, I’m happy to announce the introduction of two new operators to our PowerTrack filtering suite.

Retweet Operator

The Retweet operator lets a customer either receive only the Retweets that match a rule or exclude Retweets from the Tweets that match a rule.

To use the Retweet operator, simply add is:retweet or -is:retweet to any rule.

Examples Include:

  • Receive only Retweets mentioning Apple using a rule like: apple is:retweet as a way to measure engagement of the brand’s fan base


  • Get only Tweets with unique content about Apple using a rule like: apple -is:retweet to monitor conversation about the brand and ignore the tremendous volume of retweets generated by the brand
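To make the effect of the two forms concrete, here is a minimal sketch in Python. It is an illustration only, not how PowerTrack is implemented: the `matches` helper, the dict layout and the `is_retweet` field are all hypothetical stand-ins for whatever payload you actually receive.

```python
import re

def matches(rule, tweet):
    """Return True when the tweet satisfies every space-separated term of the rule.

    `tweet` is a hypothetical dict with "text" and "is_retweet" keys; it is
    not the payload format that PowerTrack delivers.
    """
    words = set(re.findall(r"\w+", tweet["text"].lower()))
    for term in rule.split():
        if term == "is:retweet":
            # Keep Retweets only.
            if not tweet["is_retweet"]:
                return False
        elif term == "-is:retweet":
            # Drop Retweets, keep everything else.
            if tweet["is_retweet"]:
                return False
        elif term.lower() not in words:
            # Plain keyword: must appear as a whole word in the text.
            return False
    return True

tweets = [
    {"text": "Loving my new Apple laptop", "is_retweet": False},
    {"text": "RT @fan: Loving my new Apple laptop", "is_retweet": True},
    {"text": "RT @chef: Bananas are great", "is_retweet": True},
]

only_retweets = [t for t in tweets if matches("apple is:retweet", t)]
only_originals = [t for t in tweets if matches("apple -is:retweet", t)]
```

With this toy data, `only_retweets` holds just the second Tweet and `only_originals` holds just the first; the third never matches because it does not mention Apple.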

Sampling Operator

The Sampling operator allows a customer to receive a random sample of Tweets that match a rule rather than the entire set of Tweets.

There are several use cases where the Sampling operator is useful.  Say you want to stay within a budgeted number of Tweets each month, but you’re trending higher than that budget halfway through the month.  With the Sampling operator, you can scale back your consumption without fully eliminating rules.  In another use case you might want to monitor a very high-volume rule or user, but your internal systems can’t handle this volume.  Sampling makes this more manageable.  Finally, there are times when you simply need to know the directional volumes for things, and don’t need every tweet.

To use the Sampling operator, add sample:## to any rule with an integer value between 1 and 100. The Sampling operator applies to the entire rule and requires any “OR’d” terms be grouped.

Examples Include:

  • Receive a sampling of 10% of all Tweets that contain “apple” using a rule like:

apple sample:10


  • Receive a sampling of 50% of all Tweets that contain “iPad” or “iPhone” using a rule like:

(ipad OR iphone) sample:50
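If it helps to picture what a percentage sample means, the sketch below pulls the sample:## clause out of a rule and then keeps each matching Tweet with that probability. This is an assumption-laden illustration: the function names `split_sample` and `keep` are made up for this example, and drawing independently per Tweet is our guess at one reasonable way to sample, not a description of how PowerTrack does it.

```python
import random
import re
from typing import Tuple

SAMPLE_RE = re.compile(r"\bsample:(\d+)\b")

def split_sample(rule: str) -> Tuple[str, int]:
    """Split a rule into (rule without the sample clause, sample percentage).

    A rule with no sample:## clause is treated as a 100% sample.
    """
    found = SAMPLE_RE.search(rule)
    if found is None:
        return " ".join(rule.split()), 100
    percent = int(found.group(1))
    if not 1 <= percent <= 100:
        raise ValueError("sample value must be an integer between 1 and 100")
    rest = rule[:found.start()] + rule[found.end():]
    return " ".join(rest.split()), percent

def keep(percent: int, rng: random.Random) -> bool:
    """Decide whether to keep one matching Tweet (hypothetical per-Tweet draw)."""
    return rng.random() * 100 < percent

# Simulate 10,000 Tweets that already matched the keyword part of the rule.
rng = random.Random(42)
keyword_rule, percent = split_sample("(ipad OR iphone) sample:50")
kept = sum(keep(percent, rng) for _ in range(10_000))
```

Here `keyword_rule` is "(ipad OR iphone)" and `percent` is 50, so `kept` lands close to half of the 10,000 simulated matches. Because the sample applies to the whole rule, the OR'd terms stay grouped in the part that remains.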

As always, thank you for the product feedback and keep it coming.  Additional documentation of these new operators and others can be found in our online documentation.


Enhanced Filtering for PowerTrack

Gnip is always looking for ways to improve its filtering capabilities and customer feedback plays a huge role in these efforts.  We are excited to announce enhancements to our PowerTrack product that allow for more precise filtering of the Twitter Firehose, a feature enhancement request that came directly from you, our customers.

Gnip PowerTrack rules now support OR and Grouping using ().  We have also loosened limitations on the number of characters and the number of clauses per rule. Specifically, a single rule can now include up to 10 positive clauses and up to 50 negative clauses (previously 10 total clauses).  Additionally, the character limit per rule has grown from 255 characters to 1024.
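If you generate rules programmatically, you may want to check them against these limits before submitting them. The sketch below is one rough way to do that, together with the requirement, listed further down, that a rule contain at least one positive clause. It rests on assumptions of ours: a "clause" is taken to be any whitespace-separated term other than OR, a term counts as negative if it starts with a hyphen or sits inside a negated group, and the names `count_clauses` and `check_rule` are hypothetical. PowerTrack's own validation may count differently.

```python
import re

MAX_CHARS = 1024     # characters per rule, including operators and spaces
MAX_POSITIVE = 10    # positive clauses per rule
MAX_NEGATIVE = 50    # negative clauses per rule

# Tokens: "(" or "-(" opens a group, ")" closes one, anything else is a term.
TOKEN_RE = re.compile(r"-?\(|\)|[^\s()]+")

def count_clauses(rule):
    """Return (positive, negative) clause counts under the assumptions above."""
    positive = negative = 0
    negated = [False]  # stack: are we currently inside a negated group?
    for token in TOKEN_RE.findall(rule):
        if token in ("(", "-("):
            negated.append(negated[-1] or token == "-(")
        elif token == ")":
            if len(negated) > 1:
                negated.pop()
        elif token == "OR":
            continue  # the OR keyword itself is not a clause
        elif token.startswith("-") or negated[-1]:
            negative += 1
        else:
            positive += 1
    return positive, negative

def check_rule(rule):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if len(rule) > MAX_CHARS:
        problems.append("rule is longer than %d characters" % MAX_CHARS)
    positive, negative = count_clauses(rule)
    if positive < 1:
        problems.append("rule needs at least 1 positive clause")
    if positive > MAX_POSITIVE:
        problems.append("rule has more than %d positive clauses" % MAX_POSITIVE)
    if negative > MAX_NEGATIVE:
        problems.append("rule has more than %d negative clauses" % MAX_NEGATIVE)
    return problems
```

For example, `check_rule("(apple OR android) (ipad OR tablet) -(fruit green microsoft)")` returns an empty list (4 positive and 3 negative clauses by this count), while `check_rule("-fruit")` reports the missing positive clause.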

With these changes, we are now able to offer our customers a much more robust and precise filtering language to ensure you receive the Tweets that matter most to you and your business.  However, these improvements bring their own set of specific constraints that are important to be aware of.  Examples and details on these limitations are as follows:

OR and Grouping Examples

  • apple OR microsoft
  • apple (iphone OR ipad)
  • apple computer -(fruit OR green)
  • (apple OR mac) (computer OR monitor) new -fruit
  • (apple OR android) (ipad OR tablet) -(fruit green microsoft)

Character Limitations

  • A single rule may contain up to 1024 characters including operators and spaces.


  • A single rule must contain at least 1 positive clause
  • A single rule supports a max of 10 positive clauses throughout the rule
  • A single rule supports max of 50 negative clauses throughout the rule
  • Negated ORs (an OR with a negated clause on either side) are not allowed. The following are examples of invalid rules:
    • -iphone OR ipad
    • ipad OR -(iphone OR ipod)


  • An implied “AND” takes precedence in rule evaluation over an OR

For example a rule of:

  • android OR iphone ipad would be evaluated as android OR (iphone ipad)
  • ipad iphone OR android would be evaluated as (ipad iphone) OR android
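To see why the implied AND binds tighter than OR, it can help to look at a tiny parser. The Python sketch below is our own illustration of that precedence, not Gnip's parser: `parse`, `render` and `matches` are hypothetical names, keywords are matched as whole lowercase words, and the sketch does not enforce the clause limits or the ban on negated ORs.

```python
import re

# Tokens: "(" or "-(" opens a group, ")" closes one, anything else is a term.
TOKEN_RE = re.compile(r"-?\(|\)|[^\s()]+")

def parse(rule):
    """Parse a rule into nested ("OR" | "AND", [children]) / ("NOT", child) nodes."""
    tokens = TOKEN_RE.findall(rule)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_or():  # lowest precedence: splits on the OR keyword
        nonlocal pos
        branches = [parse_and()]
        while peek() == "OR":
            pos += 1
            branches.append(parse_and())
        return branches[0] if len(branches) == 1 else ("OR", branches)

    def parse_and():  # adjacent terms are ANDed before any OR is applied
        terms = []
        while peek() not in (None, "OR", ")"):
            terms.append(parse_term())
        if not terms:
            raise ValueError("expected a term")
        return terms[0] if len(terms) == 1 else ("AND", terms)

    def parse_term():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token in ("(", "-("):
            inner = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return ("NOT", inner) if token == "-(" else inner
        if token.startswith("-"):
            return ("NOT", token[1:])
        return token

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return tree

def render(node, top=True):
    """Write the tree back out with every implied group made explicit."""
    if isinstance(node, str):
        return node
    kind, body = node
    if kind == "NOT":
        return "-" + render(body, top=False)
    joiner = " OR " if kind == "OR" else " "
    text = joiner.join(render(child, top=False) for child in body)
    return text if top else "(" + text + ")"

def matches(node, words):
    """Evaluate the tree against a set of lowercase words from a Tweet."""
    if isinstance(node, str):
        return node.lower() in words
    kind, body = node
    if kind == "NOT":
        return not matches(body, words)
    results = [matches(child, words) for child in body]
    return any(results) if kind == "OR" else all(results)
```

Calling `render(parse("android OR iphone ipad"))` gives "android OR (iphone ipad)", so a Tweet containing only "iphone" would not match, while one containing only "android" would. If you want the other reading, group the OR explicitly, as in (android OR iphone) ipad.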

You can find full details of the Gnip PowerTrack filtering changes in our online documentation.

Know of another way we can improve our filtering to meet your needs?  Let us know in the comments below.