Twitter, you've come a long way baby… #8years

Like a child’s first steps or your first experiment with Pop Rocks candy, the first-ever Tweet went down in the Internet history books eight years ago today. On March 21st, 2006, Jack Dorsey, co-founder of Twitter, published this.

 

Twttr (the service’s first name) was launched to the public on July 15, 2006, and was recognized for “good execution on a simple but viral idea.” Eight years later, that seems to have held true.

It has become the digital watering hole, the newsroom, the customer service do’s and don’ts, a place to store your witty jargon that would just be weird to say openly at your desk. And then there is that overly happy person you thought couldn’t actually exist, standing in front of you in line, and you just favorited their selfie #blessed. Well, this is awkward.

Just eight months after its release, the company made a sweeping entrance at SXSW 2007, sparking the platform’s usage to balloon from 20,000 to 60,000 Tweets per day. Thus began the era of our public everyday lives being archived in 140-character tidbits. The manual “RT” turned into a click of a button, and favorites became the digital head nod. I see you.

In April 2009, Twitter launched the Trending Topics sidebar, identifying popular current world events and modish hashtags. Verified accounts became available that summer; athletes, actors, and icons alike began to display the “verified account” tag on their Twitter pages. This increasingly became a necessity in recognizing the real Miley Cyrus vs. Justin Bieber. If differences do exist.

The Twitter Firehose launched in March 2010. With Gnip given access, a new door opened into the social data industry, and come November, filtered access to social data was born. Twitter turned to Gnip to be its first partner serving the commercial market. By offering complete access to the full firehose of publicly available Tweets under enterprise terms, this partnership enabled companies to build more advanced analytics solutions with the knowledge that they would have ongoing access to the underlying data. This was a key inflection point in the growth of the social data ecosystem. By April, Gnip played a key role in delivering past and future Twitter data to the Library of Congress for historic preservation in the archives.

On July 31, 2010, Twitter hit its 20 billionth Tweet milestone, or as we like to call it, twilestone. It is the platform of hashtags and Retweets, celebrities and nobodies, at-replies, political rants, entertainment 411 and “pics or it didn’t happen.” By June 1st, 2011, Twitter allowed just that as it broke into the photo-sharing space, allowing users to upload photos straight to their personal handles.

One of the most highly requested features was the ability to get historical Tweets. In March 2012, Gnip delivered just that by making every public Tweet available, starting with the very first one, posted on March 21, 2006, by Mr. Dorsey himself.

Fast forward 8 years, and Twitter is reporting over 500 million Tweets per day. That’s more than 25,000 times the early volume of 20,000 Tweets per day, in just 8 years! With over 2 billion accounts, over a quarter of the world’s population, Twitter ranks high among the top websites visited every day. Here’s to the times when we write our Twitter handle on our conference name tags instead of our birth names, and prefer to be tweeted at than texted. Voicemails? Ain’t nobody got time for that.

Twitter launched a special surprise for its 8th birthday. Want to check out your first tweet?

“There’s a #FirstTweet for everything.” Happy Anniversary!

 

See more memorable Twitter milestones

What's In The [Social] Data?

I was reading about the Higgs boson this morning and came across this cartoon explanation of matter, particle acceleration, and the Higgs boson. The very last pic in the cartoon (below) reminded me of the cartoon that’s been in my head for years, the one that pops into my head when I go to work. It drives what we do at Gnip. All of our energy is focused on helping our customers answer that question: “What’s in the data?” We do this by reliably collecting, filtering, enriching, and delivering billions of public social activities (social data) to our customers with business-critical data needs, every day.


What's In The Data slide from PHD Comics - http://www.phdcomics.com/comics.php?f=1489

Tebow Time Scores in Social Media

Wow, what a game!  If you missed the instant classic that was the Broncos/Steelers overtime game tonight, check out the recap.

When Tim Tebow connected with Demaryius Thomas on an 80-yard touchdown pass on the first play of overtime, we saw a noticeable spike in the overall volume of social media messages flowing through the Gnip platform.

Tebow Time!

Spike in Social Media Mentions when Tim Tebow Throws Winning Touchdown against the Steelers

Gnip Cagefight #2: Pumpkin Pie vs. Pecan Pie

Thanksgiving is a time for family gatherings, turkey with all the delicious fixings, football, and let’s not forget, pie! If your family is anything like mine, multiple pie flavors are required to satisfy the differing palates and strong opinions. So we wondered, which pies are people discussing for the holiday? What better way to celebrate and answer that question than with a Gnip Cagefight.

Welcome to the Battle of the Pies!

For those of you who have been in a pie eating contest or had a pie in the face, you know this one will be a fight all the way down to the very last crumb. In one corner (well, actually it is the Gnip Octagon, so can you really have corners? Oh well) we have The Traditionalist, pumpkin pie, and in the opposite corner, The Newcomer, pecan pie. Without further ado, Ladies and Gentlemen, Let’s Get Ready to Rumble, wait, wrong sport. Let’s Fight!

Six Social Media Sources, Two Words, One Winner . . . And the Winner Is . . .

 

Source              Pumpkin Pie   Pecan Pie   Winning Ratio (Pumpkin Pie to Pecan Pie)
Twitter             X                         4:1
Facebook            X                         5:1
Google+             X                         6:1
Newsgator           X                         3:1
WordPress           X                         5:1
WordPress Comments  X                         2:1
Overall             +6 Winner!    +0 :(

 

We looked at one week’s worth of data across six of the top social media sources and determined that pumpkin pie “takes the cake” (so to speak) across every source.

In this case, it is interesting to point out that in sources like Twitter, Facebook, Google+ and WordPress we see higher winning ratios, while sources that tend to have higher latency such as Newsgator and WordPress Comments were a little more even. Is this because, on further consideration, pecan pie sounds pretty good? Or is it that everyone will have to have two pies and, with pecan as the traditional second, it is highly discussed?

Top Pie Recipes

Even though pumpkin pie was our clear winner, we thought it would be fun to share a few of the most popular holiday pie recipes by social media source:

  1. Twitter – Cook du Jour Gluten-Free Pumpkin Pie and Pecan Pie Video Recipe from joyofcooking.com
  2. Facebook – Ben Starr’s Pumpkin Bourbon Pecan Pie Recipe
  3. Newsgator – BlogHer’s Pumpkin Pecan Roulade with Orange Mascarpone Cream Pie Recipe
  4. WordPress and WordPress Comments – Chocolate Bourbon Pecan Pie from allrecipes.com

Non-Traditional Thanksgiving Pies

Another interesting fact that came out of this Cagefight was the counts of non-traditional Thanksgiving pies that were mentioned across the social media sources we surveyed. Though we rarely find these useful for communicating numerical values effectively, you can’t not have a pie chart in this post.

Happy Thanksgiving!

Gnip Cagefight #1: Beer vs. Wine

Welcome to the very first edition of the Gnip Cagefight! Over the next couple of weeks we’ll select a common word pair to enter the Gnip Octagon to fight to the finish in a no holds barred battle of Tweets. Two words will enter. Only one will leave.

In addition to crowning the victor, we’ll also call out some of the fun, interesting, strange, and bizarre trends that we glean from the data. Leave us a comment with any contenders you’d like to see in the future.

Now without further delay, let’s dive into our first Gnip Cagefight… Put your hands together for Wine vs. Beer!

And the Winner is . . .

We looked at one week of Tweets that contained the words “beer” or “wine,” and beer was the more commonly used term, appearing in 53.1% of those tweets vs. 48.1% for wine. Now you might be saying, “Hey, that’s more than 100%!” You are correct! That’s because beer and wine appear together about 13,801 times–along with an uncomfortable hangover, we presume. (Is this an opportunity to sell aspirin?)
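For the curious, the "more than 100%" arithmetic can be sketched in a few lines of Python. The two shares and the overlap count come from the paragraph above; the implied sample size is only a rough back-of-the-envelope estimate derived from the rounded percentages, not a figure we measured.

```python
# Shares of sampled Tweets containing each term (from the post).
beer_share = 53.1    # percent
wine_share = 48.1    # percent
both_count = 13_801  # Tweets containing both terms (from the post)

# Every sampled Tweet contains at least one of the two terms, so
# beer + wine - both = 100%. The excess over 100% is the overlap.
both_share = beer_share + wine_share - 100.0

# Rough implied sample size; very sensitive to rounding of the shares.
implied_total = both_count / (both_share / 100.0)

print(f"overlap share: {both_share:.1f}%")
print(f"implied sample: roughly {implied_total:,.0f} Tweets")
```

In other words, the overlap is about 1.2% of the sample, which would put the week's sample somewhere a little over a million Tweets.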

With beer as our victor, we wanted to answer the age old question . . .

What time is Beer Thirty?

To answer this question, we analyzed the volume of Tweets containing the term “beer” throughout each day and averaged that across the week’s worth of data we collected. Each Tweet’s time was moved into the time zone of the Tweeter and normalized against the daily cycle of Tweet volume. Based on the graph below, true beer thirty is 5pm local time. This gives great meaning to the saying “It’s 5 o’clock somewhere.”
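Here is a minimal sketch of that kind of calculation, assuming "normalized against the daily cycle" means dividing the number of "beer" Tweets in each local hour by the number of all Tweets in that hour. The function names and the sample data are made up for illustration; this is not the code we ran.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

def local_hour(utc_time, utc_offset_hours):
    """Shift a UTC timestamp into the Tweeter's local time; return the hour."""
    return (utc_time + timedelta(hours=utc_offset_hours)).hour

def peak_hour(term_tweets, all_tweets):
    """Return the local hour where the term's share of all Tweets is highest.

    Each argument is a list of (utc_time, utc_offset_hours) pairs.
    """
    term_by_hour = Counter(local_hour(t, off) for t, off in term_tweets)
    all_by_hour = Counter(local_hour(t, off) for t, off in all_tweets)
    # Dividing by overall volume removes the normal daily Tweet cycle.
    ratios = {h: term_by_hour[h] / all_by_hour[h] for h in all_by_hour}
    return max(ratios, key=ratios.get)

utc = timezone.utc
# Made-up sample: (UTC time, Tweeter's UTC offset in hours).
all_tweets = (
    [(datetime(2011, 9, 1, 16, 0, tzinfo=utc), -6)] * 4    # 10:00 local
    + [(datetime(2011, 9, 1, 21, 0, tzinfo=utc), -4)] * 4  # 17:00 local
    + [(datetime(2011, 9, 2, 3, 0, tzinfo=utc), -6)] * 2   # 21:00 local
)
# One "beer" Tweet at 10:00, three at 17:00, one at 21:00 (local).
beer_tweets = all_tweets[:1] + all_tweets[4:7] + all_tweets[8:9]

print(peak_hour(beer_tweets, all_tweets))  # 17 for this made-up sample
```

Averaging across a week would amount to feeding in all seven days of Tweets, since the counts are keyed by hour of day only.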

Beer Drinkers have a Wider Vocabulary than Wine Drinkers

Another fascinating tidbit that came out of the data was that beer drinkers have a wider vocabulary than wine drinkers. Normalizing for the number of words used, we find that beer drinkers use 14% more distinct words than wine drinkers. Wine drinkers tend to use the same idioms, for example, “glass of wine” or “red wine,” more than beer drinkers use their most common phrases. Does this mean that beer drinkers are 14% smarter than wine drinkers? Or that they use very creative spelling? We won’t wade any further into that question, but you can be the judge.
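One simple way to "normalize for the number of words used" is to count distinct words over equal-sized samples from each group. The sketch below assumes that approach; the function name and the example Tweets are invented, and the real analysis may have normalized differently.

```python
def distinct_words(tweets, sample_size):
    """Count distinct words among the first `sample_size` words of a corpus."""
    words = [w.lower() for text in tweets for w in text.split()]
    return len(set(words[:sample_size]))

# Made-up example Tweets, not real data.
beer = ["cold beer on the patio", "craft beer and tacos tonight"]
wine = ["glass of wine", "red wine tonight", "another glass of red wine"]

n = 10  # compare the same number of words from each group
beer_vocab = distinct_words(beer, n)
wine_vocab = distinct_words(wine, n)
print(beer_vocab, wine_vocab, beer_vocab / wine_vocab)
```

Comparing equal-sized samples matters because a larger pile of text will almost always contain more distinct words, whoever wrote it.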

That’s all for our inaugural Gnip Cagefight. Hope you enjoyed it, and be sure to let us know what words you’d like to see in the octagon in the future.

The VMAs, Lady Gaga and Data Science

Hi everyone. I’m the new Data Scientist here at Gnip. I’ll be analyzing the fascinating data that we have coming from all of our varied social data streams to pull out the stories, both impactful and trivial, that are flowing through social media conversations. I’m still getting up to speed but wanted to share one of the first social events that I’ve dug into, the 2011 MTV Video Music Awards.

Check out the info below and let me know in the comments what you think and what you’d like to see more of.  And now, on with the show…

3.6M Tweets Mention “VMA”

The volume of tweets containing “VMA” rose steadily from a few hours before the VMA pre-show was broadcast, up to the starting of the pre-show at 8:00 PM ET (00:00 GMT) and remained fairly strong during the event. It trailed to low volume within the hour after the VMA broadcast ended at 11:15 PM ET (03:15 GMT). Tweets mentioning “VMA” totaled 3.6M during the 7 hours surrounding and including the VMA broadcast.

 

Lady Gaga Steals the “Tweet” Show

The largest volume of tweets for an individual artist came from mentions of “gaga.” Lady Gaga performed early in the show and the surge of tweets during her performance surpassed 35k tweets per minute for about 8 minutes. Again in the second half, Lady Gaga tweet volume briefly jumped above 50k per minute. Tweets mentioning “gaga” totaled 1.8M during the 7 hours surrounding and including the VMA broadcast.

As you can see in the chart below, other artists that garnered significant tweet volumes included Beyonce’, Justin Bieber, Chris Brown, Katy Perry and Kanye West. Perry, West and Brown got a lot of attention during their appearances, while Justin Bieber and Lady Gaga led the counts in volume by maintaining a fairly steady stream of tweets during the broadcast.

Term             Representation of Tweets Sampled
VMA              44%
Lady Gaga        21%
Beyonce          16%
Justin Bieber    10%
MTV              9.2%
Chris Brown      8.0%
Katy Perry       5.6%
Kanye West       4.8%
Jonas            3.5%
Taylor Swift     2.1%
Rihanna          1.1%
Eminem           0.55%
Michael Jackson  0.18%
Ke$ha            0.17%
Cher             0.14%
Paramore         0.12%

 

 

 

In contrast, it is interesting to note that Beyonce’ and Chris Brown gained most of their tweet attention around their performances, with very large surges in tweet volume. Beyonce’s volume (another Beyonce’ bump) continues after her performance as Twitter users absorb the news of her pregnancy.

 

 

One surprise that emerged from looking for other artists connected to the VMAs was Michael Jackson’s tweet volume. While Jackson garnered many Retweets after winning the King of the VMA poll, he also received a large number of natural tweets lamenting his passing and celebrating his past successes.

Methodology

The free-form text and limited length of Twitter messages create a number of challenges for monitoring an event via Twitter comments. People refer to the event differently and focus on different parts of the event. There will be spelling variations and differences in the idioms and nicknames used to describe people and performances. Do we search for “Bieber”, “Beiber” and “Justin”? Will tweeters use “Beyonce” or “Beyonce'”? Knowledge of what we are monitoring is required; preparing tools that can adapt to things we learn during the event is also essential to getting good results.

One effective strategy is to use one or two tokens to identify tweets related to the event. The objective is to choose terms that we know are related to the event, that won’t be widely used outside the event, and that will give a representative sample–diverse and with sufficient volume. Once we have started to collect the event-focused twitter sample, we can look for relevant terms correlated with the filter term to find out what else people are tweeting about during the event.
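To make the two ideas above concrete, here is a small sketch: fold known spelling variants into one canonical term, keep only tweets containing the filter token, and count what else shows up in them. The variant map, function names, and sample tweets are all hypothetical and chosen for illustration; this is not the tooling used for the analysis.

```python
import re
from collections import Counter

# Hypothetical variant map: misspellings and nicknames -> canonical term.
VARIANTS = {"beiber": "bieber", "beyonce'": "beyonce"}

def tokens(text):
    """Lowercase a tweet and split it into normalized word tokens."""
    raw = re.findall(r"[a-z0-9$']+", text.lower())
    return [VARIANTS.get(t, t) for t in raw]

def cooccurring_terms(tweets, filter_term, top_n=5):
    """Count terms that appear in tweets containing the filter term."""
    counts = Counter()
    for text in tweets:
        toks = set(tokens(text))  # count each term once per tweet
        if filter_term in toks:
            counts.update(toks - {filter_term})
    return counts.most_common(top_n)

# Made-up sample tweets.
sample = [
    "Gaga opening the VMA show",
    "VMA night: Beiber wins",
    "Bieber at the VMA",
    "Beyonce' news at the VMA",
    "making dinner",
]
print(cooccurring_terms(sample, "vma"))
```

In practice, common words such as “the” and “at” will crowd the top of the list, so a stop-word filter is a natural next step before reading much into the counts.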

Hope you enjoyed this first post. Look for more to come.

 

We’ve Got Scooters, Yes We Do! We’ve Got Scooters, How About You?

The competition for talent in Boulder is heating up. Yes folks, Gnip is hiring. We are seeking a Marketing Director to lead Gnip’s marketing operations and activities, working closely with our sales team and COO to drive brand and industry recognition as well as product adoption.

As if the fantastic start-up environment, great team, downtown Boulder location, breakfast at work daily (be sure to follow us @breakfastatgnip), tabs at nearby coffee shops, Eldora ski passes, and Boulder recreation center passes weren’t enough, we are now offering a Honda Metropolitan Scooter as a signing bonus for the Marketing Director position! What more could you ask for?

So if you think you have what it takes to be part of our awesome team, we would love to hear from you! Also, be sure to check out our other open positions, including Sales Engineer, Sales Executive, Senior Software Engineer, Social Media Data Analyst, Software Engineer, and two Customer Support Engineers.

 

Gnip. The Story Behind the Name

Have you ever thought “Gnip”. . . well that is a strange name for a company, what does it mean? As one of the newest members of the Gnip team I found myself thinking that very same thing. And as I began telling my friends about this amazing new start-up that I was going to be working for in Boulder, Colorado they too began to inquire as to the meaning behind the name.

Gnip, pronounced (guh’nip), got its name from the very heart of what we do: realtime social media data collection and delivery. So let’s dive into . . .

Data Collection 101

There are two general methods for data collection: pull technology and push technology. Pull technology is best described as a data transfer in which the request is initiated by the data consumer and responded to by the data publisher’s server. In contrast, push technology refers to a transfer initiated by the data publisher’s server and sent to the data consumer.

So why does this matter . . .

Well most social media publishers use the pull method. This means that the data consumer’s system must constantly go out and “ping” the data publisher’s server asking, “do you have any new data now?” . . . “how about now?” . . . “and now?” And this can cause a few issues:

  1. Deduplication – If you ping the social media server one second and then ping it again a second later and there were no new results, you will receive the same results you got one second ago. This would then require deduplication of the data.
  2. Rate Limiting – Every social media data publisher’s server sets its own rate limit, a limit used to control the number of times you can ping a server in a given time frame. These rate limits are constantly changing and typically don’t get published. As such, if your server is set to ping the publisher’s server above the rate limit, it could result in a complete shutdown of your data collection, leaving you to determine why the connection is broken (Is it the API? Is it the rate limit? What is the rate limit?).
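To see both issues in one place, here is a minimal polling sketch. It assumes each activity carries a unique "id" field and that a safe minimum interval between requests is known; both are assumptions for illustration, and the Poller class is hypothetical rather than any publisher's real API.

```python
import time

class Poller:
    """Pulls from a publisher, skipping duplicates and over-eager requests."""

    def __init__(self, fetch, min_interval_s):
        self.fetch = fetch                # callable returning a list of dicts
        self.min_interval_s = min_interval_s
        self.seen_ids = set()
        self.last_poll = None

    def poll(self, now=None):
        """Return only new activities, or [] if it is too soon to ask again."""
        now = time.monotonic() if now is None else now
        too_soon = (self.last_poll is not None
                    and now - self.last_poll < self.min_interval_s)
        if too_soon:
            return []  # stay under the (assumed) rate limit
        self.last_poll = now
        fresh = [a for a in self.fetch() if a["id"] not in self.seen_ids]
        self.seen_ids.update(a["id"] for a in fresh)
        return fresh

# Made-up publisher feed.
feed = [{"id": 1, "text": "first"}]
poller = Poller(lambda: list(feed), min_interval_s=1.0)

first = poller.poll(now=0.0)     # new: id 1
skipped = poller.poll(now=0.5)   # inside the interval, no request made
repeat = poller.poll(now=1.5)    # same results from the publisher, all dupes
feed.append({"id": 2, "text": "second"})
later = poller.poll(now=3.0)     # only id 2 is new
```

The catch, as noted above, is that the real limit is often unpublished and changing, so the interval here is a guess you have to keep tuning.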

So as you can see, pull technology can be a tricky beast.

Enter Gnip

Gnip sought to give our customers the option to receive data in either the push model or the pull model, regardless of the native delivery method of the data publisher’s server. In other words, we wanted to reverse the “ping” process for our customers. Hence, we reversed the word “ping” to get the name Gnip. And there you have it, the story behind the name!
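As a toy illustration of reversing the "ping", the sketch below wraps a pull-only source and pushes each new item to subscribers through a callback. The PushAdapter class and its method names are hypothetical, and this is a simplified picture of the idea, not a description of how Gnip is actually built.

```python
class PushAdapter:
    """Polls a pull-only source and pushes new items to subscribers."""

    def __init__(self, fetch):
        self.fetch = fetch          # callable returning a list of dicts
        self.subscribers = []
        self.seen_ids = set()

    def subscribe(self, callback):
        """Register a function to be called once per new item."""
        self.subscribers.append(callback)

    def run_once(self):
        """Do one pull on the consumer's behalf and push anything new."""
        for item in self.fetch():
            if item["id"] in self.seen_ids:
                continue            # already delivered
            self.seen_ids.add(item["id"])
            for callback in self.subscribers:
                callback(item)

# Made-up source; the consumer never pings it directly.
source = [{"id": 1, "text": "hello"}]
received = []

adapter = PushAdapter(lambda: list(source))
adapter.subscribe(received.append)

adapter.run_once()                  # pushes id 1
adapter.run_once()                  # nothing new, nothing pushed
source.append({"id": 2, "text": "world"})
adapter.run_once()                  # pushes id 2

print("ping"[::-1])                 # gnip
```

The consumer's code only ever sees the callback, so the polling, deduplication, and rate-limit worries live in one place.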

30 Social Data Applications that Will Change Our World

Social media is popular — no surprise there. And as a result, there’s a huge amount of social media data in the world and every day the pool of data grows… not just a little bit, but enormously. For instance, just recently our partner Twitter blogged about their business growth and the numbers are staggering.

This social conversation data is valuable. Someday it will yield insights worth many millions, perhaps billions, of dollars for businesses. But the analyses and insights are only barely beginning to take shape. We hear from social media analytics companies every day and we see lots of interesting applications of this data. So… how can social media data be used? Here’s a partial list of social data applications that I believe will begin to take shape over the next decade or so:

  1. Product development direction
  2. Product feedback
  3. Customer service performance feedback
  4. Customer communications
  5. Stock market prediction
  6. Domestic/political mood analysis
  7. Societal trend analysis
  8. Offline marketing campaign impact measurement
  9. Word-of-mouth marketing campaign analysis
  10. URL virality analysis
  11. News virality analysis
  12. Domestic economic health indicator
  13. Linguistic analysis
  14. Educational achievement metric by time and locale
  15. Personal scheduling: see when your friends are busy
  16. Event planning: see when big events will happen in your community
  17. Online marketing
  18. Sales mapping & identification
  19. Consumer behavior analysis
  20. Internet safety implementation
  21. Counter-terrorism probabilistic analysis
  22. Disaster relief communication, mapping, and analysis
  23. Product development opportunity identification
  24. Competitive analysis
  25. Recruiting tools
  26. Connector, Maven, and Salesperson identification (to borrow Malcolm Gladwell’s terms)
  27. Cross-platform consumer alerting services
  28. Brand monitoring
  29. Business accountability ratings
  30. Product and service reviews

All of these projects can be built on public social media conversation data that’s legally and practically accessible. All of the necessary data is (or is on the roadmap to be) accessible via Gnip. But access to the data is only step one — the next step is building great algorithms and applications to draw insights from that data. We leave that part to our customers.

So, here’s to the analysts who are working with huge social data sets to bring social data analyses and insights to fruition and ultimately make the barrage of public data that surrounds us increasingly useful. Here at Gnip we’re grateful for your efforts and eager to find out what you learn.

Our Poem for Mountain.rb

Hello and Greetings, Our Ruby Dev Friends,
Mountain.rb we were pleased to attend.

Perhaps we did meet you! Perhaps we did not.
We hope, either way, you’ll give our tools a shot.

What do we do? Manage API feeds.
We fight the rate limits, dedupe all those tweets.

Need to know where those bit.ly’s point to?
Want to choose polling or streaming, do you?

We do those things, and on top of all that,
We put all your results in just one format.

You write only one parser for all of our feeds.
(We’ve got over 100 to meet your needs.)

The Facebook, The Twitter, The YouTube and More
If mass data collection makes your head sore…

Do not curse publishers, don’t make a fuss.
Just go to the Internet and visit us.

We’re not the best poets. Data’s more our thing.
So when you face APIs… give us a ring.