Streaming Data Just Got Easier: Announcing Gnip’s New Connector for Amazon Kinesis

I’m happy to announce a new solution we’ve built to make it simple to get massive amounts of social data into the AWS cloud environment. I’m here in London for the AWS Summit where Stephen E. Schmidt, Vice President of Amazon Web Services, just announced that Gnip’s new Kinesis Connector is available as a free AMI starting today in the AWS Marketplace. This new application takes care of ingesting streaming social data from Gnip into Amazon Kinesis. Spinning up a new instance of the Gnip Kinesis Connector takes about five minutes, and once you’re done, you can focus on writing your own applications that make use of social data instead of spending time writing code to consume it.

 


 

Amazon Kinesis is AWS’s managed service for processing streaming data. It has its own client libraries that enable developers to build streaming data processing applications and get data into AWS services like Amazon DynamoDB, Amazon S3 and Amazon Redshift for use in analytics and business intelligence applications. You can read an in-depth description of Amazon Kinesis and its benefits on the AWS blog.

We were excited when Amazon Kinesis launched last November because it helps solve key challenges that we know our customers face. At Gnip, we understand the challenges of streaming massive amounts of data much better than most. Some of the biggest hurdles – especially for high-volume streams – include maintaining a consistent connection, recovering data after a dropped connection, and keeping up with reading from a stream during large spikes of inbound data. The combination of Gnip’s Kinesis Connector and Amazon Kinesis provides a “best practice” solution for social data integration with Gnip’s streaming APIs that helps address all of these hurdles.

Gnip’s Kinesis Connector and the high-availability AWS environment provide a seamless “out-of-the-box” solution for maintaining full-fidelity data without worrying about HTTP streaming connections. If and when connections do drop (it’s impossible to maintain an HTTP streaming connection forever), Gnip’s Kinesis Connector automatically reconnects as quickly as possible and uses Gnip’s Backfill feature to ingest data you would otherwise have missed. And because data in Amazon Kinesis is durable, your consumer application can pick up right where it left off reading from Amazon Kinesis if it needs to restart.
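To make the "pick up where you left off" idea concrete, here is a minimal sketch of how a consumer might resume from a saved position. It is not the connector's own code. It assumes your application stores the sequence number of the last record it processed, and the helper name `resume_iterator_args` and the stream name in the usage note are hypothetical. The iterator types `TRIM_HORIZON` and `AFTER_SEQUENCE_NUMBER` are part of the Kinesis GetShardIterator API.

```python
def resume_iterator_args(stream_name, shard_id, last_sequence_number=None):
    """Build keyword arguments for a Kinesis GetShardIterator request.

    last_sequence_number is whatever checkpoint the consumer saved before
    it stopped (an assumption of this sketch); None means no checkpoint yet.
    """
    args = {"StreamName": stream_name, "ShardId": shard_id}
    if last_sequence_number is None:
        # No checkpoint: start from the oldest record still retained.
        args["ShardIteratorType"] = "TRIM_HORIZON"
    else:
        # Checkpoint found: resume just after the last processed record.
        args["ShardIteratorType"] = "AFTER_SEQUENCE_NUMBER"
        args["StartingSequenceNumber"] = last_sequence_number
    return args
```

With boto3, for example, the result could be passed as `boto3.client("kinesis").get_shard_iterator(**resume_iterator_args("my-gnip-stream", shard_id, checkpoint))`, where the stream name, shard ID and checkpoint storage are up to your application.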

In addition to these features, one of the biggest benefits of Amazon Kinesis is its low cost. To give you a sense of what that low cost looks like, a Twitter Decahose stream delivers about 50 million messages a day. Between Amazon Kinesis shard costs and HTTP PUT costs, it would cost about $2.12 per day to put all this data into Amazon Kinesis (plus Amazon EC2 costs for the instance).
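For readers who want to adapt that estimate to their own volumes, the arithmetic can be sketched as follows. The post does not state the rates or shard count behind the figure. The values below are assumptions: $0.015 per shard-hour, $0.028 per million PUT records, two shards, and one message per PUT. They happen to reproduce the $2.12 figure, but check current AWS pricing before relying on them.

```python
# Assumed rates, NOT taken from the post; check current AWS pricing.
SHARD_HOUR_USD = 0.015        # assumed price per shard per hour
PUT_PER_MILLION_USD = 0.028   # assumed price per 1,000,000 PUT records


def daily_kinesis_cost(messages_per_day, shards,
                       shard_hour_usd=SHARD_HOUR_USD,
                       put_per_million_usd=PUT_PER_MILLION_USD):
    """Estimate one day's Kinesis cost, assuming one message per PUT record."""
    shard_cost = shards * 24 * shard_hour_usd
    put_cost = messages_per_day / 1_000_000 * put_per_million_usd
    return shard_cost + put_cost


if __name__ == "__main__":
    # Decahose-sized volume from the post; the shard count of 2 is an assumption.
    print(f"${daily_kinesis_cost(50_000_000, shards=2):.2f} per day")
```

Under these assumptions the shard charge comes to $0.72 and the PUT charge to $1.40 per day. Amazon EC2 costs for the connector instance are separate, as noted above.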

Gnip’s Kinesis Connector is ready to use starting today for any Twitter PowerTrack or Decahose stream. We’re excited about the many new, different applications this will make possible for our customers. We hope you’ll take it for a test drive and share feedback with us about how it helps you and your business do more with social data.
