Gnip and Twitter join forces

Gnip is one of the world’s largest and most trusted providers of social data. We partnered with Twitter four years ago to make it easier for organizations to realize the benefits of analyzing data across every public Tweet. The results have exceeded our wildest expectations. We have delivered more than 2.3 trillion Tweets to customers in 42 countries who use those Tweets to provide insights to a multitude of industries including business intelligence, marketing, finance, professional services, and public relations.

Today I’m pleased to announce that Twitter has agreed to acquire Gnip! Combining forces with Twitter allows us to go much faster and much deeper. We’ll be able to support a broader set of use cases across a diverse set of users including brands, universities, agencies, and developers big and small. Joining Twitter also provides us access to resources and infrastructure to scale to the next level and offer new products and solutions.

This acquisition signals clear recognition that investments in social data are healthier than ever. Our customers can continue to build and innovate on one of the world’s largest and most trusted providers of social data and the foundation for innovation is now even stronger. We will continue to serve you with the best data products available and will be introducing new offerings with Twitter to better meet your needs and help you continue to deliver truly innovative solutions.

Finally, a huge thank you to the team at Gnip who have poured their hearts and souls into this business over the last 6 years. My thanks to them for all the work they’ve done to get us to this point.

We are excited for this next step and look forward to sharing more with you in the coming months. Stay tuned!

Social Data: What’s Next in Finance?

After a couple of exciting years in social finance and some major events, we’re back with an update to our previous paper “Social Media in Markets: The New Frontier”. We’re excited to be able to provide this broad update on a rapidly evolving and increasingly important segment of financial services.

Social media analytics for finance has lagged brand analytics by 3 to 4 years despite the enormous potential for profit through investing based on social insights. Our whitepaper explains why that gap has existed and what has changed in the social media ecosystem that is causing that gap to close. Twitter conversation around tagged equities has grown by more than 500% since 2011. The whitepaper explores what that means for investors.


We examine the finance-specific tools that have emerged and outline a framework for unlocking the value in social data for tools that are yet to be created. Then we provide an overview of changes in academic research, social content, and social analytics for finance providers that will help financial firms figure out how to capitalize on opportunities to generate alpha.

Download our new whitepaper.

Twitter, you've come a long way baby… #8years

Like a child’s first steps or your first experiment with pop rocks candy, the first ever Tweet went down in the Internet history books eight years ago today. On March 21st, 2006, Jack Dorsey, co-founder of Twitter, published this.


Twttr (the service’s first name) was launched to the public on July 15, 2006, when it was recognized for “good execution on a simple but viral idea.” Eight years later, that seems to have held true.

It has become the digital watering hole, the newsroom, the customer service do’s and don’ts, a place to store your witty jargon that would just be weird to say openly at your desk. And then there is that overly happy person you thought couldn’t actually exist, standing in front of you in line, and you just favorited their selfie #blessed. Well, this is awkward.

Just eight months after its release, the company made a sweeping entrance into SXSW 2007, sparking the platform’s usage to balloon from 20,000 to 60,000 Tweets per day. Thus began the era of our public everyday lives being archived in 140-character tidbits. The manual “RT” turned into a click of a button, and favorites became the digital head nod. I see you.

In April 2009, Twitter launched the Trending Topics sidebar, identifying popular current world events and modish hashtags. Verified accounts became available that summer; athletes, actors, and icons alike began to display the “verified account” tag on their Twitter pages. This increasingly became a necessity in recognizing the real Miley Cyrus vs. Justin Bieber. If differences do exist.

The Twitter Firehose launched in March 2010. Once Gnip was given access, a new door opened into the social data industry, and come November, filtered access to social data was born. Twitter turned to Gnip to be its first partner serving the commercial market. By offering complete access to the full firehose of publicly available Tweets under enterprise terms, this partnership enabled companies to build more advanced analytics solutions with the knowledge that they would have ongoing access to the underlying data. This was a key inflection point in the growth of the social data ecosystem. By April, Gnip played a key role in delivering past and future Twitter data to the Library of Congress for historic preservation in the archives.

On July 31, 2010, Twitter hit its 20 billionth Tweet milestone, or as we like to call it, twilestone. It is the platform of hashtags and Retweets, celebrities and nobodies, at-replies, political rants, entertainment 411 and “pics or it didn’t happen.” By June 1st, 2011, Twitter allowed just that as it broke into the photo sharing space, allowing users to upload their photos straight to their personal handle.

One of the most highly requested features was the ability to get historical Tweets. In March 2012, Gnip delivered just that by making every public Tweet available, starting with the very first one posted on March 21, 2006 by Mr. Dorsey himself.

Fast forward 8 years, and Twitter is reporting over 500 million Tweets per day. That’s more than 25,000 times the pre-SXSW 2007 volume of 20,000 Tweets per day, in just 8 years! With over 2 billion accounts, over a quarter of the world’s population, Twitter ranks high among the top websites visited every day. Here’s to the times where we write our Twitter handle on our conference name tags instead of our birth names, and prefer to be tweeted at than texted. Voicemails? Ain’t nobody got time for that.

Twitter launched a special surprise for its 8th birthday. Want to check out your first tweet?

“There’s a #FirstTweet for everything.” Happy Anniversary!


See more memorable Twitter milestones

Plugging In Deeper: New Brandwatch API Integration Brings Gnip Data to Brands

Earlier today, our partner Brandwatch made an announcement that we expect to be a big deal for the social data ecosystem. Brandwatch has become Gnip’s first Plugged In partner to offer an API integration that allows their customers to get full Twitter data from Gnip, using the methods and functionality of the new Brandwatch Premium API. Brandwatch’s customers can now apply all the power of Brandwatch Analytics – including their query building tools, custom dashboard visualizations, sentiment, demographics, influence data and more – to reliable, complete access to the Twitter firehose from Gnip. With this first-of-its-kind integration, brands and agencies have the opportunity to get social media analytics from a leading provider together with full Twitter data from Gnip, using one seamless API.

[Image: Brandwatch Gnip Premium API Twitter integration]

Brands and agencies are increasingly using social data to make business decisions outside the marketing department, and Brandwatch’s new API offering fills an important gap that will make this much easier. We’ve seen an uptick in demand from brands wanting to use social data outside of their social listening services to power CRM applications, to incorporate social data into business intelligence tools to study alongside other business data, and to build custom dashboards that combine social with other important business data. At the same time, these brands often face a challenge. They’ve invested significant time and resources using their social listening services to home in on the data that’s most important to them. Additionally, their social listening services provide valuable analytics and additional metadata that brands rely on to help make sense of social data. Until now, when it came time to consume social data for use in other applications, they’ve needed to integrate with Gnip separately. Our new integration with Brandwatch’s Premium API gives their customers a “best of both” solution. It provides a seamless way to combine the powerful social media listening and analytics service they’ve come to rely on with full Twitter data from the world’s most trusted social data provider.

For folks interested in the technical details, the way this works is simple. When Brandwatch customers make API calls for Twitter data, they get routed through a Gnip app that fetches Brandwatch data and merges it with data from Gnip. This means Brandwatch customers can have full assurance that they’re getting licensed Twitter data directly from Gnip.
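To picture the merge step, here is a minimal Python sketch that joins analytics records onto licensed Tweet records by a shared ID. Everything in it is an assumption made for illustration: the function name, the "id" key and the field names are hypothetical, and it is not a description of how the actual Gnip app or the Brandwatch Premium API is implemented.

```python
def merge_by_id(analytics_records, gnip_records, key="id"):
    """Attach analytics metadata to the full Tweet record with the same key.

    Hypothetical helper: record shapes and field names are invented here.
    """
    # Index the full Tweet records so each lookup is a dictionary access.
    gnip_by_id = {record[key]: record for record in gnip_records}

    merged = []
    for analytics in analytics_records:
        full = gnip_by_id.get(analytics[key])
        if full is None:
            continue  # no matching licensed Tweet, so nothing to return
        # Keep the Tweet record intact and nest the analytics fields under it.
        extra = {k: v for k, v in analytics.items() if k != key}
        merged.append({**full, "analytics": extra})
    return merged


# Invented sample records, for illustration only.
analytics = [{"id": "1", "sentiment": "positive"}, {"id": "3", "sentiment": "neutral"}]
tweets = [{"id": "1", "body": "hello"}, {"id": "2", "body": "world"}]
print(merge_by_id(analytics, tweets))
```

Nesting the analytics fields under their own key keeps the Tweet record unchanged, so a consumer can always tell which values came from which provider.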

We know a straightforward, integrated solution like this is something brands have been asking for, and we’re glad it’s finally here. To learn more about how the Brandwatch Premium API works, join our joint webinar next week or contact us at


Putting the Data in Data Discovery – Qliktech & Gnip Partner Up

Gnip is excited to announce that Qliktech is the newest member of our Plugged In partner program. While we partner with many different types of companies – ranging from innovative social analytics products to well-known big data services and software providers – Qliktech is a unique and exciting addition to our program.

Qliktech is discovery software that combines key features of data analysis with intuitive decision-making features, including (to name a few):

  • The ability to consolidate data from multiple sources
  • An easy search function across all datasets and visualizations
  • State-of-the-art graphics for visualization and data discovery
  • Support for social decision-making through secure, real-time collaboration
  • Mobile device data capture and analysis

Our partnership means that joint Qliktech and Gnip clients can easily marry social data with internal datasets to create nuanced visualizations that surface performance indicators and real-time changes that can impact the decisions those clients are making.

To put the powerful capabilities of this new partnership to good use, Gnip will be co-sponsoring a partner hackathon on April 6th at Qonnections, the Qliktech Partner Summit.

Along with HP Vertica and Qliktech, we’ll enable partners to hack on behalf of Medair, a Swiss-based humanitarian organization that provides support for health, nutrition, water and sanitation, hygiene, and shelter initiatives to countries experiencing natural disasters or emergencies.

A series of recent academic papers has highlighted the role that social media plays in obtaining real-time information following sudden natural disasters. This hackathon will follow in those steps, using Twitter data from Typhoon Haiyan, which made landfall in the Philippines on Nov 8th, 2013. Using Gnip’s Profile Geo enhancement, we’ll provide data from the Philippines during that period, allowing other Qliktech partners to experiment with how Medair could leverage this data, within Qliktech, in future situations that require real-time analysis and response.

It will be a great time, but more importantly, will harness the power of the Gnip and Qliktech relationship to accomplish something everyone can be proud of. And that’s a pretty good start to a new partnership!

The History Of Social Data In One Beautiful Timeline

As a company that’s constantly innovating and driving forward, it’s sometimes easy to forget everything that’s led us to where we are today. When Gnip was founded 6 years ago, social data was in its infancy. Twitter produced only 300,000 Tweets per day; social data APIs were either non-existent or unreliable; and nobody had any idea what a selfie was.

Today social data analytics drives decisions in every industry you can imagine, from consumer brands to finance to the public sector to industrial goods. From then to now, there have been dozens of milestones that have helped create the social data industry and we thought it would be fun to highlight and detail all of them in one place.


The story begins humbly in Boulder, Colorado with the concept of changing the way data was gathered from public APIs of social networks. Normally, one would ‘ping’ the API and ask for data; Gnip wanted to reverse that structure (hence our name). In these early days, we focused on simplifying access to existing public APIs, but our customers constantly asked us how they could get more and better access to social data. In November of 2010, we were finally able to better meet their needs when we partnered with Twitter to provide access to the full Firehose of public Tweets, the first partnership of its kind.

This is when Gnip started to build the tools that have shaped the social data industry. While getting a Firehose of Tweets was great for the industry, the reality was our customers didn’t need 100% of all Tweets; they needed 100% of relevant Tweets. We created PowerTrack to enable sophisticated filtering on the full Firehose of Tweets and solve that problem. We also built valuable enrichments, reliability products, and historical data access to create the most robust Twitter data access available.

While Twitter data was where the industry started, our customers wanted data from other social networks as well. We soon created partnerships with Klout, StockTwits, WordPress, Disqus, Tumblr, Foursquare and others to be the first to bring their data to the market. Our work didn’t end there though. We have been continually adding in new sources, new enrichments, and new products. We also launched the first conference dedicated to social data as well as the first industry organization for social data. Things have come a long way in 6 years and we can’t wait to see the developments in the next 6 years.

Check out our interactive timeline for the full list of milestones and details.


From A to B: Visualizing Language for the Entire History of Twitter

It all started with a simple question: “How could we show the growth and change in languages on Twitter?”

Easy, right?

Well, several months later, here we are, finally ready to show off our final product. You can see a static image of the final viz below and check out the full story and interactive version in The Evolution of Languages on Twitter.

Looking back on the process that led us here, I realized that we’d been through a huge range of ideas and wanted to share that experience with others.

Where Did We Get the Data?

As a data scientist, I walk into Gnip’s vast data playground excited to analyze, visualize and tell stories. For this project, I had access to the full archive of public Tweets that’s part of Gnip’s product offering – that’s every Tweet since the beginning of Twitter in March of 2006.

The next question is: “With this data set, what’s the best way to analyze language?” We had two options here – use Gnip’s language detection or use the language field that’s in every Twitter user’s account settings. Gnip’s language detection enrichment looks at the text of every Tweet and classifies the Tweet as one of 24 different languages. It’s a great enrichment, but for historical data it’s only available back to March 2012.

Since we wanted to tell the story back to the beginning of Twitter, we decided to use the language field that’s in every Twitter user’s account settings.


This field has been part of the Twitter account setup since the beginning, giving us the coverage we need to tell our story.

The First Cut

Having defined how we would determine language, we created our first visualization.



Interesting, but it doesn’t really tell the story we’re looking for. This visualization tells the story of the growth of Twitter – it grew a lot. The challenge is that this growth obscures the presence of anything other than English, Japanese and Spanish. The sharp rise in volume also makes languages prior to 2010 impossible to see.

So we experimented with rank, language subsets, and other visualization techniques that could tell a broader story. At times, we dabbled in fugly.

Round Two

Moving through insights and iterations, we started to see each Twitter language become its own story. We chose relative rank as an important element and the streams grew into individual banners waving from year-end marker poles like flags in the wind.


With this version, we felt like we were getting somewhere…

The Final Version

To get to the final version, we reintroduced the line width as a meaningful element to indicate the percent of Tweet volume, pared down the number of languages to focus the story, and used D3 to spiff up the presentation layer. The end result is a simple visualization that tells the story of how language has grown and changed on Twitter. 

What became clear to me in this process is that visualization is a hugely iterative process and there’s not a single thing that leads to a successful end result. It’s a combination of the questions you ask, how you structure the data, the choices you make in what to show and what not to show and finally the tools you use to display the result.

Let me know what you think…

The Evolution of Languages on Twitter

Revolution. Global economy. Internet access. What story do you see?

This interactive visualization shows the evolution of languages of Tweets according to the language that the user selected in their Twitter profile. The height of the line reflects the percentage of total Tweets and the vertical order is based on rank vs. other languages.
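As a rough sketch of the two quantities the chart encodes, the Python below turns one year of per-language Tweet counts into a percentage share (line height) and a rank (vertical order). The function name is hypothetical and the counts are invented placeholders, not real Twitter figures; the actual pipeline behind the visualization is not described in this post.

```python
def shares_and_ranks(counts):
    """Map {language: tweet_count} for one year to {language: (share_pct, rank)}.

    Rank 1 is the language with the most Tweets; ties are broken alphabetically.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("need at least one Tweet to compute shares")
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {
        language: (100.0 * count / total, rank)
        for rank, (language, count) in enumerate(ordered, start=1)
    }


# Placeholder counts invented for illustration only.
one_year = {"lang_a": 50, "lang_b": 30, "lang_c": 20}
for language, (share, rank) in shares_and_ranks(one_year).items():
    print(f"{language}: rank {rank}, {share:.0f}% of Tweets")
```

Running this once per year gives the pair of values that would be plotted for each language at each year marker.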

Check it out. Hover over it. See how the languages you care about have changed since the beginning of Twitter.

As you’d expect, Twitter was predominantly English speaking in the early days, but that’s changed as Twitter has grown its adoption globally. English is still the dominant language but with only 51% share in 2013 vs. 79% in 2007. Japanese, Spanish and Portuguese emerged as the consistent number two, three and four languages. Beyond that, you can see that relative rankings change dramatically year over year.

In this data, you can see several different stories. The sharp rise in Arabic reflects the impact of the Arab Spring – a series of revolutionary events that made use of Twitter. A spike in Indonesian is indicative of a country with a fast growing online population. Turkish starts to see growth and we expect that growth will continue to spike after the Occupygezi movement. Or step back for a broader view of the timeline, which suggests the spreading globalization of communication networks. Each potential story could be a reason to drill down further, expose new ideas and explore the facts.

Adding your own perspective, what story do you see?

(Curious about how we created this viz? Look for our blog post later tomorrow for that story.)

Geo Operators and the Oscars: Using Gnip’s Search API to Identify Tweets Originating from Hollywood’s Dolby Theater

At Gnip, we’ve long believed that the “geo” part of social data is hugely valuable. Geodata adds real-world context to help make sense of online conversations. We’re always looking for ways to make it easier to leverage the geo component of social data, and the geo features in our Search API provide powerful tools to do so. Gnip’s Search API now includes full support for all PowerTrack geo operators, which means analysts, marketers and academics can now get instantaneous answers to their questions about Twitter that require location.

In this post, we’ll walk through exactly what these Search API features look like, and then look at an example from the Oscars of where they add real value. We will find the people actually tweeting from inside Hollywood’s Dolby Theater and do deeper analysis of those Tweets.

What It Does

Gnip’s Search API for Twitter provides the fastest, easiest way to get up and running consuming full-fidelity Twitter data. Users can enter any query using our standard PowerTrack operators and get results instantaneously through a simple request/response API (no streaming required).

Using the Search API, you can run queries that make use of latitude/longitude data, specifically with our bounding_box and point_radius rules. Each rule lets you limit results to content within a box or circle with up to 25-mile sides or radius. These rules can be particularly helpful to cut down on the noise and identify the most relevant content for your query.

These rules can be used independently to filter both types of Twitter geodata available from Gnip:

  1. Twitter’s “geo” value: The native geodata provided by Twitter, based on metadata from mobile devices. This is present for about 2% of Tweets.
  2. Gnip’s “Profile Geo” value: Geocoded locations from users’ profiles, provided by Gnip. This is present for about 30% of Tweets.
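As a rough illustration of the size limit described above, the Python sketch below builds a point_radius clause and rejects any radius over 25 miles. The clause layout (longitude, latitude, then radius in km) is copied from the Oscars query shown later in this post; the function name and the local validation are assumptions made for this example, not part of Gnip’s API.

```python
KM_PER_MILE = 1.609344
MAX_RADIUS_KM = 25 * KM_PER_MILE  # the 25-mile limit quoted above


def point_radius_rule(lon, lat, radius_km):
    """Return a clause such as 'point_radius:[lon lat 0.1km]'.

    Hypothetical helper: it only formats and sanity-checks values locally.
    """
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("longitude or latitude out of range")
    if not (0.0 < radius_km <= MAX_RADIUS_KM):
        raise ValueError("radius must be positive and at most 25 miles")
    return f"point_radius:[{lon} {lat} {radius_km}km]"


print(point_radius_rule(-118.3409742, 34.1021528, 0.1))
# prints: point_radius:[-118.3409742 34.1021528 0.1km]
```

Checking the limit before sending a query is just a convenience; the Search API itself remains the authority on which rules are valid.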

An Oscars Example: What apps do the glitterati use to Tweet?

There were 17.1 million Oscar-related Tweets this year. Ellen DeGeneres’ selfie Tweet at the Oscars was the most retweeted Tweet ever. If you run a search for “arm was longer” during the Oscars, you’ll get a result set that looks like this: over 1.5 million retweets in the first hour following @theellenshow’s Tweet.


The Tweet created a bit of a stir since it was sponsored by Samsung and Ellen used her Galaxy Note 3 to tweet it, but then used an iPhone backstage shortly afterward. We’ve seen plenty of interest in the past in the iPhone vs. Android debate, so we thought we’d use the opportunity to check what the real “score” was inside the building. What applications were other celebrities using to post to Twitter?

Using the PowerTrack point_radius operator, we created a very narrow search for a 100-meter radius from the center of the Dolby Theater (roughly the size of the building) for the time period from 5:30-11:30 PM PST on Sunday night. It’s a single-clause query:

point_radius:[-118.3409742 34.1021528 0.1km]
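To make the 100-meter radius concrete, the Python sketch below checks locally whether a coordinate falls inside the circle that query describes, using the haversine great-circle distance. This is only an illustration of the geometry: the function names are hypothetical, and it is an assumption that a spherical-Earth distance is a reasonable stand-in for however the Search API evaluates the rule on its side.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two (longitude, latitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_point_radius(lon, lat, center_lon, center_lat, radius_km):
    """True if (lon, lat) lies within radius_km of the center point."""
    return haversine_km(lon, lat, center_lon, center_lat) <= radius_km


# Center and radius taken from the query above.
CENTER_LON, CENTER_LAT, RADIUS_KM = -118.3409742, 34.1021528, 0.1

# A point 0.0005 degrees north is roughly 56 m away, so it is inside;
# a point 0.002 degrees north is roughly 222 m away, so it is outside.
print(within_point_radius(CENTER_LON, CENTER_LAT + 0.0005, CENTER_LON, CENTER_LAT, RADIUS_KM))
print(within_point_radius(CENTER_LON, CENTER_LAT + 0.002, CENTER_LON, CENTER_LAT, RADIUS_KM))
```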

The result set that came back included about 250 Tweets from 98 unique users during that narrow time period, and you can see from the spike during the “selfie” moment that it’s a much more refined data set:


Using this geo search parameter, you can quickly identify the people actually at the Oscars during that major Twitter moment.

Among them, we saw the following application usage:

Geotagged Tweets by Application

With all the chatter about the mobile device battles, it’s interesting to see so many Oscar attendees actually posting to Twitter from Instagram versus Twitter mobile clients. Foursquare also made a solid showing in terms of the attendees’ preferred Twitter posting mechanism.
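A breakdown like this one comes down to counting Tweets per posting application. The Python sketch below shows that tally on a handful of invented records; the "application" key is a hypothetical field name chosen for this example (real payloads label the posting client differently), and the sample values are placeholders rather than the Oscars results.

```python
from collections import Counter


def tally_by_application(tweets, field="application"):
    """Count Tweets per posting application; missing values become 'unknown'."""
    return Counter(tweet.get(field, "unknown") for tweet in tweets)


# Invented sample records, for illustration only.
sample = [
    {"application": "Instagram"},
    {"application": "Instagram"},
    {"application": "Foursquare"},
    {},
]
for app, count in tally_by_application(sample).most_common():
    print(app, count)
```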

This is just one example of how using Gnip’s geo PowerTrack operators in the Search API can help with doing more powerful social media analysis. For a more detailed overview of all the potential geo-related PowerTrack queries that can be created, check out our support docs here.

P.S. – More on Geo & Twitter Data at #SXSW

If you get excited about social data + geo and will be in Austin this weekend, come check out our session “Beyond Dots on a Map: Visualizing 3 Billion Tweets” on Sunday at 1 pm. I’ll be on stage with our good friend Eric Gundersen from Mapbox talking about the work we did together to visualize a huge set of Twitter data from around the world.

Gnip's Social Data Picks for SXSW

If you’re one of the 30,000 people headed to SXSW, we’ve got our social data and data science panel picks that you should attend between BBQ and breakfast tacos. Also, if you’re interested in hanging with Gnip, we’ve listed the places where we’ll have a presence!

Also, we’ll be helping put on the Big Boulder: Boots & Bourbon party at SXSW for folks in the social data industry. Send an email to for an invite.


What Social Media Analytics Can’t Tell You
Friday, March 7 at 3:30 PM to 4:30 PM: Sheraton Austin, EFGH

Great panel with Vision Critical, Crowd Companies and more. “Whether you’re looking for fresh insight on what makes social media users tick, or trying to expand your own monitoring and analytics program, this session will give you a first look at the latest data and research methods.”

Book Signing – John Foreman, Chief Data Scientist at MailChimp
Friday, March 7 at 3:50 to 4:10 PM: Austin Convention Center, Ballroom D Foyer

During an interview with Gnip, John said that the data challenge he’d most like to solve is the Taco Bell menu. You should definitely get his book and get it signed.


Truth Will Set You Free but Data Will Piss You Off
Saturday, March 8 from 3:30 to 4:30 PM: Sheraton Austin Creekside

All-star speakers from DataKind, Periscopic and more talking about “the issues and ethics around data visualization–a subject of recent debate in the data visualization community–and suggest how we can use data in tandem with social responsibility.”

Keeping Score in Social: It’s More than Likes
Saturday, March 8 from 5:15 to 5:30 PM: Austin Convention Center, Ballroom F

Jim Rudden, the CMO of Spredfast, and brands will talk about “what it takes to move beyond measuring likes to measuring real social impact.”


Mentor Session: Emi Hofmeister
Sunday, March 9 at 11 AM to 12 PM: Hilton Garden Inn, 10th Floor Atrium

Meet with Emi Hofmeister, the senior product marketing manager at Adobe Social. All sessions appear to be booked but keep an eye out for cancellations. Sign up here:

The Science of Predicting Earned Media
Sunday, March 9 at 12:30 to 1:30 PM: Sheraton Austin, EFGH

“In this panel session, renowned video advertising expert Brian Shin, Founder and CEO at Visible Measures, Seraj Bharwani, Chief Analytics Officer at Visible Measures, along with Kate Sirkin, Executive Vice President, Global Research at Starcom MediaVest Group, will go through the models built to quantify the impact of earned media, so that brands can not only plan for it, but optimize and repeat it.”

GNIP EVENT: Beyond Dots on a Map: Visualizing 3 Billion Tweets
Sunday, March 9 at 1:00-1:15 PM: Austin Convention Center, Ballroom E

Gnip’s product manager, Ian Cairns, will be speaking about the massive Twitter visualization Mapbox and Gnip created and what 3 billion geotagged Tweets can tell us.

Mentor Session: Jenn Deering Davis
Sunday, March 9 at 5 to 6 PM: Hilton Garden Inn, 10th Floor Atrium

Sign up for a mentoring session with Jenn Deering Davis, the co-founder of Union Metrics. Sign up here –

Algorithms, Journalism & Democracy
Sunday, March 9 from 5 to 6 PM: Austin Convention Center, Room 12AB

Read our interview with Gilad Lotan of betaworks on his SXSW session and data science. Gilad will be joined by Kelly McBride of the Poynter Institute to talk about the ways algorithms are biased that we might not think about. “Understanding how algorithms control and manipulate your world is key to becoming truly literate in today’s world.”


Scientist to Storyteller: How to Narrate Data
Monday, March 10 at 12:30 – 1:30 PM: Four Seasons Ballroom

See our interview with Eric Swayne about this SXSW session and data narration. On the session, “We will understand what a data-driven insight truly IS, and how we can help organizations not only understand it, but act on it.”

#Occupygezi Movement: A Turkish Twitter Revolution
Monday, March 10 at 12:30 – 1:30 PM:  Austin Convention Center, Room 5ABC

See our interview with Yalçin Pembeciogli about how the Occupygezi movement was affected by the use of Twitter. “We hope to show you the social and political side of the movements and explain how social media enabled this movement to be organic and leaderless with many cases and stories.”

GNIP EVENT: Dive Into Social Media Analytics
Monday, March 10 at 3:30 – 4:30 PM: Hilton Austin Downtown, Salon B

Gnip’s VP of Product, Rob Johnson, will be speaking alongside IBM about “how startups can push the boundaries of what is possible by capturing and analyzing data and using the insights gained to transform the business while blowing away the competition.”


Measure This; Change the World
Tuesday, March 11 at 11 AM to 12 PM: Sheraton Austin, EFGH

A panel with folks from Intel, Cornell, Knowable Research, etc. looking at what we can learn from social scientists and how they measure vs how marketers measure.

Make Love with Your Data
Tuesday, March 11 at 3:30 to 4:30 PM: Sheraton Austin, Capitol ABCD

This session is from the founder of OkCupid, Christian Rudder. I interviewed Christian previously and am a big fan. “We’ll interweave the story of our company with the story of our users, and by the end you will leave with a better understanding of not just OkCupid and data, but of human nature.”