There’s always a lot going on here at Gnip, but this week is especially packed with the team looking to make a big splash at Salesforce.com’s annual Dreamforce event. Salesforce is obviously a huge player in the software space and the theme of this year’s Dreamforce is “Welcome to the Social Enterprise” which fits really nicely with what we do.
At the conference, we’ll be speaking at two sessions and sponsoring the Hack-a-thon. In the first presentation, Drinking from the Firehose: How Social Data is Changing Business Practices, Jud (@jvaleski) and Chris (@chrismoodycom) will discuss the ways that social data is being used to drive innovation across a variety of industries from Financial Services and Emergency Response to Local Business and Consumer Electronics. They’ll also give a glimpse into the technical challenges involved in handling the ever-increasing volume of data that’s flowing out of Twitter every day. If you’re at Dreamforce, this session is on Tuesday (8/30) from 11am to noon in the DevZone Theater on the 2nd floor of Moscone West.
In the second presentation, Your Guide to Understanding the Twitter API, Rob (@robjohnson) will talk through the best ways to get access to the Twitter data that you’re looking for, examining the pros and cons of the various methods. You can check out Rob’s session on Tuesday (8/30) from 3:00 to 3:30 in the Lightning Forum in the DevZone on the 2nd floor of Moscone West.
And finally, we’re sponsoring the Hack-a-thon where teams of developers will create cloud apps for the social enterprise using Twitter feeds from Gnip and at least one of the Salesforce platforms (Force.com, Heroku, Database.com). The winning team stands to take home at least $10,000 in prize money. We’re really excited to see the creative solutions that the teams develop! All submissions are due no later than 6am on Thursday (9/1), so sign up now and get going!
Want to meet up in person at Dreamforce? Give any of us a shout @jvaleski, @chrismoodycom, @robjohnson, @funkefred.
For startups seeking to enter and capitalize on the rising social media marketplace, timing is everything. MutualMind was no exception: getting their enterprise social media management product to market in a timely manner was crucial to the success of their business. MutualMind provides an enterprise social media intelligence and management system that monitors, analyzes, and promotes brands on social networks and helps increase social media ROI. The platform enables customers to listen to discussion on the social web, gauge sentiment, track competitors, identify and engage with influencers, and use resulting insights to improve their overall brand strategy.
“Through their social media API, Gnip helped us push our product to market six months ahead of schedule, enabling us to capitalize on the social media intelligence space. This allowed MutualMind to focus on the core value it adds by providing advanced analytics, seamless engagement, and enterprise-grade social management capabilities.”
- Babar Bhatti
By selecting Gnip as their data delivery partner, MutualMind was able to get their product to market six months ahead of schedule. Today, MutualMind processes tens of millions of data activities per month using multiple sources from Gnip including premium Twitter data, YouTube, Flickr, and more.
For the full details, read the success story here.
The second type of data stream is “filtered streams.” Filtered streams deliver all the Tweets that match a filter you select (e.g., keywords, usernames, or geographical boundaries). This can be very useful for developers or businesses that need limited access to specific Tweets.
Because the Streaming API is not designed for enterprise access, however, Twitter imposes some restrictions on its filtered streams that are important to understand. First, the volume of Tweets accessible through these streams is limited so that it will never exceed a certain percentage of the full Firehose. (This percentage is not publicly shared by Twitter.) As a result, only low-volume queries can reliably be accommodated. Second, Twitter imposes a query limit: currently, users can query for a maximum of 400 keywords and only a limited number of usernames. This is a significant challenge for many businesses. Third, Boolean operators are not supported by the Streaming API as they are by the Search API (and by Gnip’s API). And finally, there is no guarantee that Twitter’s access levels will remain unchanged in the future. Enterprises that need guaranteed access to data over time should understand that building a business on any free, public API can be risky.
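To make the keyword limit concrete, here is a minimal Python sketch of what happens when a watch list outgrows it. The 400 figure is the one quoted above and may change; the function name and the brand list are made up for illustration and are not part of any Twitter or Gnip API.

```python
MAX_KEYWORDS = 400  # figure quoted in the post; Twitter may change it


def fit_to_keyword_limit(keywords, limit=MAX_KEYWORDS):
    """Return (tracked, dropped): keywords that fit under the limit, and the rest.

    Hypothetical helper for illustration only.
    """
    # Normalize and de-duplicate while keeping the caller's priority order.
    unique = list(dict.fromkeys(k.strip().lower() for k in keywords if k.strip()))
    return unique[:limit], unique[limit:]


# A made-up watch list of 1,000 brand names.
brands = [f"brand {i}" for i in range(1000)]
tracked, dropped = fit_to_keyword_limit(brands)
print(len(tracked), "keywords fit;", len(dropped), "would go unmonitored")
```

With 1,000 terms on the list, 600 of them simply cannot be tracked on a single filtered stream under that limit.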
The Search API and Streaming API are great ways to gather a sampling of social media data from Twitter. We’re clearly fans over here at Gnip; we actually offer Search API access through our Enterprise Data Collector. And here’s one more cool benefit of using Twitter’s free public APIs: those APIs don’t prohibit display of the Tweets you receive to the general public like premium Twitter feeds from Gnip and other resyndication partners do.
If your business dictates a need for full coverage data, more complex queries, an agreement that ensures continued access to data over time, or enterprise-level customer support, then we recommend getting in touch with a premium social media data provider like Gnip. Our complementary premium Twitter products include Power Track for data filtered by keyword or other parameters, and Decahose and Halfhose for randomly sampled data streams (10% and 50%, respectively). If you’d like to learn more, we’d love to hear from you at firstname.lastname@example.org or 888.777.7405.
So . . . can you just make requests for results more frequently? Well, yes, you can, but the total number of requests you’re allowed to make per unit time is constrained by Twitter’s rate limits. Some queries are so popular (hello “Justin Bieber”) that it can be impossible to make enough requests to Twitter for that query alone to keep up with this stream. And this is only the beginning of the problem as no monitoring or analytics vendor is interested in just one term; many have hundreds to thousands of brands or products to monitor.
Let’s consider a couple of examples to clarify. First, say you want all Tweets mentioning “Coca Cola” and only that one term. There might usually be fewer than 100 matching Tweets per second, but if there’s a spike (say that term becomes a trending topic after a Super Bowl commercial), then there will likely be more than 100 per second. If, because of Twitter’s rate limits, you’re only allowed to send one request per second, and each request returns at most 100 Tweets, you will have missed some of the Tweets generated at the most critical moment of all.
Now, let’s be realistic: you’re probably not tracking just one term. Most of our customers are interested in tracking somewhere between dozens and hundreds of thousands of terms. If you add 999 more terms to your list, then you’ll only be checking for Tweets matching “Coca Cola” once every 1,000 seconds. And in 1,000 seconds, there could easily be more than 100 Tweets mentioning your keyword, even on an average day. (Keep in mind that there are over a billion Tweets per week nowadays.) So, in this scenario, you could easily miss Tweets if you’re using the Twitter Search API. It’s also worth bearing in mind that the Tweets you do receive won’t arrive in realtime because you’re only querying for the Tweets every 1,000 seconds.
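The arithmetic in these two examples can be sketched in a few lines of Python. This is a back-of-the-envelope estimate, not a model of Twitter's actual rate limiting: it assumes a steady Tweet rate, one request per second shared round-robin across all terms, and a cap of 100 Tweets per request (the cap implied by the examples above). The function and parameter names are made up for illustration.

```python
def estimated_coverage(tweets_per_second, num_terms,
                       requests_per_second=1.0, max_results_per_request=100):
    """Estimate the fraction of matching Tweets collected for one term.

    Assumes requests are shared round-robin across num_terms terms and
    that each request returns at most max_results_per_request Tweets.
    """
    seconds_between_polls = num_terms / requests_per_second
    tweets_between_polls = tweets_per_second * seconds_between_polls
    if tweets_between_polls <= max_results_per_request:
        return 1.0  # every Tweet fits in a single response
    return max_results_per_request / tweets_between_polls


# One term during a spike of 150 matching Tweets per second.
print(f"{estimated_coverage(150, 1):.0%}")
# 1,000 terms, with one matching Tweet every two seconds on an average day.
print(f"{estimated_coverage(0.5, 1000):.0%}")
```

Under those assumptions the spike scenario collects roughly two thirds of the matching Tweets, and the 1,000-term scenario collects about one in five.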
Because of these issues related to the monitoring use cases, data collection strategies relying exclusively on the Search API will frequently deliver poor coverage of Twitter data. Also, be forewarned, if you are working with a monitoring or analytics vendor who claims full Twitter coverage but is using the Search API exclusively, you’re being misled.
Although coverage is not complete, one great thing about the Twitter Search API is the complex operator capabilities it supports, such as Boolean queries and geo filtering. For that reason, some people opt to use the Search API to collect a sampling of Tweets that match their search terms. Because these filtering features have been so well liked, Gnip has replicated many of them in our own premium Twitter API (made even more powerful by the full coverage and unique data enrichments we offer).
So, to recap, the Twitter Search API offers great operator support but you should know that you’ll generally only see a portion of the total Tweets that match your keywords and your data might arrive with some delay. To simplify access to the Twitter Search API, consider trying out Gnip’s Enterprise Data Collector; our “Keyword Notices” feed retrieves, normalizes, and deduplicates data delivered through the Search API. We can also stream it to you so you don’t have to poll for your results. (“Gnip” reverses the “ping,” get it?)
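If you poll the Search API yourself, consecutive polls for the same term will often return overlapping results, which is why deduplication matters. Here is a minimal Python sketch of that one step. It is not Gnip's implementation; the helper name and the dictionary shape of each result (an `id` and a `text` field) are assumptions made for illustration.

```python
def new_results(batch, seen_ids):
    """Return items from a polled batch whose ids have not been seen before.

    seen_ids is updated in place so it can be reused across polls.
    """
    fresh = []
    for item in batch:
        if item["id"] not in seen_ids:
            seen_ids.add(item["id"])
            fresh.append(item)
    return fresh


seen = set()
# Two made-up polls for the same term; the Tweet with id 2 appears in both.
first_poll = [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}]
second_poll = [{"id": 2, "text": "b"}, {"id": 3, "text": "c"}]
print([t["id"] for t in new_results(first_poll, seen)])
print([t["id"] for t in new_results(second_poll, seen)])
```

The second call yields only the Tweet with id 3, since id 2 was already delivered by the first poll.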
But the only way to ensure you receive full coverage of Tweets that match your filtering criteria is to work with a premium data provider (like us! blush…) for full coverage Twitter firehose filtering. (See our Power Track feed if you’d like more info on that.)
Stay tuned for Part 3, our overview of Twitter’s Streaming API coming next week…
A little over two years ago, Jud and I hatched an audacious plan — pair a deep data guy with a consumer guy to launch an enterprise company. We would build an incredible data service with the polish of a consumer app, then attack a market generally known for being rather dull with a combination of substance and style.
Over the last two years, Jud has done an amazing job serving as Gnip’s CTO and implicitly as VP of Engineering. Under his leadership, the engineering team has delivered a product that turns the process of integrating with dozens of diverse APIs into a push-button experience. The team he assembled is fantastically talented and passionate about making real-time data more easily consumed. My own team has performed equally well, adding much-needed process to Gnip’s sales and marketing.
Two years ago, if you asked Corporate America to define “social media,” they probably would have said “the blogs.” Last year, they would have probably answered “the blogs and Twitter” and this year they’re adding Facebook to their collective consciousness. The time is better than ever to bring Gnip’s platform to the enterprise and, ultimately, I’m not the CEO to do it. Our plan to have a consumer guy lead an enterprise company ended up having a few holes. For Gnip to thrive in the enterprise, it needs to be squarely in the hands of people who have previously succeeded in that space. So as of today, I’m stepping down as CEO and leaving the company. Jud is taking over as CEO.
I am honored to have worked with Jud and it has been a privilege to work with my team for the last two years. Anything that Gnip has accomplished so far has been because of them. Any criticisms that the company could have accomplished more in the last two years can be directed squarely at me. I look forward to seeing Jud and the team do great things in the years ahead.
Gnip has offered the same basic licensing options since we launched the 2.0 version of the platform last September. During that year we have learned a lot about how companies and individual developers use the Gnip platform to discover, access, integrate and filter social and business data for their applications. In that time the daily volume of activities flowing across the platform has grown from thousands of activities across a handful of services to 100 to 150 million activities in a given day across almost forty different data sources.
Gnip Platform License Updates: In the second half of August Gnip will introduce several changes to our licensing options that will impact existing users and new users.
- Gnip will provide several licensing options for the Standard Edition service
- Commercial license: This is the default license for all commercial uses of the Gnip Platform
- Non-profit license: This option will be available to companies and organizations with an appropriate 501(c) status
- Startup Partner license: This option is available to companies that meet the qualification terms of the partner program.
- Trial license: This option will be the default experience for new users and provide 30 days to evaluate the platform.
- The Community Edition of the Gnip Platform will be retired since we discovered over the last year that the TOS for the Community Edition made the option a poor fit for real-world company use cases. We believe any small company using the Gnip Platform Community Edition should be able to move to our Startup Partner Program.
Impact on existing and new users: The most obvious change for new users who sign up after these licensing updates is that their accounts will be active for 30 days. All existing users on the Gnip Platform will have their existing accounts converted to 30-day trial accounts when the new licensing is rolled out during the second half of August.
Planning for the licensing updates: If your company qualifies for any of the regular license options, please contact us at email@example.com or firstname.lastname@example.org to discuss moving to the Commercial, Non-profit or Startup Partner licenses.
We are pleased to announce an early access program for a new Gnip data publisher to access and integrate data from the Facebook Platform Open Streams API.
- Choose the specific Facebook users from among those that have authorized your applications, and Gnip will immediately begin collecting the relevant data, normalizing it and delivering it in real-time to your applications.
Developers and companies can sign up right now to be notified when the early access program is launched by sending an email to email@example.com with the subject: Facebook. Any company signing up for the early access program will be eligible for three free months of subscription service for the Gnip data publisher for the Facebook Platform once it is generally released. At this time the early access program is planned to be launched in the summer.
And to provide a small taste of the upcoming integration here are two examples of what common Newsfeed actions on Facebook will look like when accessed via the planned Gnip data publisher.
1) Status update example (fbids in this example were changed from the actual ones in my stream item)
<actor metaURL="http://www.facebook.com/people/Shane-Pearson/12345">Shane Pearson</actor>
<body>It must be spring as my weekly trip to Lowes/Home Depot is back on the schedule</body>
2) Upload photo example (the below Gnip data schema maps to a Facebook activity stream example)
<actor metaURL="http://www.facebook.com/people/Snapshot-Smith/499225643">Snapshot Smith</actor>
<title>Snapshot Smith uploaded a photo.</title>
<body><p><a href="http://www.facebook.com/photo.php?pid=28&amp;id=499225643&amp;ref=at" caption="A very attractive wall, indeed"></a></p></body>
<mediaURL type="thumbnail">http://photos-e.ak.fbcdn.net/photos-ak-snc1/v2692/195/117/499225643/s499225643_28_6861716.jpg</mediaURL>
<mediaURL type="content">http://www.facebook.com/photo.php?pid=28&amp;id=499225643&amp;ref=at</mediaURL>
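For a rough idea of how an application might read one of these activities, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The `<activity>` wrapper element is an assumption added only so the fragment parses as a single document; the final schema of the planned publisher may differ, and `summarize` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# The <activity> root is a hypothetical wrapper, not part of the examples above.
SAMPLE = """<activity>
  <actor metaURL="http://www.facebook.com/people/Shane-Pearson/12345">Shane Pearson</actor>
  <body>It must be spring as my weekly trip to Lowes/Home Depot is back on the schedule</body>
</activity>"""


def summarize(xml_text):
    """Pull the actor name, profile URL, and body text out of one activity."""
    root = ET.fromstring(xml_text)
    actor = root.find("actor")
    body = root.find("body")
    return {
        "actor": actor.text,
        "profile": actor.get("metaURL"),
        "body": body.text,
    }


print(summarize(SAMPLE))
```

Note that a strict XML parser requires straight quotes around attribute values and `&amp;` for ampersands inside URLs.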