A Little Bit of Arithmetic
155,000,000 Tweets/day × 2,500 Bytes/Tweet = 387,500,000,000 Bytes/day
387,500,000,000 Bytes/day ÷ 24 Hours/day = 16,145,833,333 Bytes/hour
16,145,833,333 Bytes/hour ÷ 60 Minutes/hour = 269,097,222 Bytes/minute
269,097,222 Bytes/minute ÷ 60 Seconds/minute = 4,484,953 Bytes/second
4,484,953 Bytes/second ÷ 1,048,576 Bytes/Megabyte = 4.2 Megabytes/second
And in terms of data transfer rates . . .
1 Megabyte/second = 8 Megabits/second
So . . .
4.2 Megabytes/second × 8 Megabits/Megabyte = 33.6 Megabits/second
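The chain of conversions above is easy to check in a few lines of Python. Computing without intermediate rounding gives a slightly higher final figure than the truncated values quoted above:

```python
# Recompute the per-second data rate from the figures above.
tweets_per_day = 155_000_000
bytes_per_tweet = 2_500          # the post's working estimate per Tweet

bytes_per_day = tweets_per_day * bytes_per_tweet      # 387,500,000,000
bytes_per_hour = bytes_per_day / 24
bytes_per_minute = bytes_per_hour / 60
bytes_per_second = bytes_per_minute / 60              # ~4,484,954
megabytes_per_second = bytes_per_second / 1_048_576   # binary megabytes
megabits_per_second = megabytes_per_second * 8

print(f"{megabytes_per_second:.2f} MB/s = {megabits_per_second:.1f} Mb/s")
# prints: 4.28 MB/s = 34.2 Mb/s
```

Truncating to one decimal place at each step, as the text does, is what produces the 4.2 MB/s figure; either way, the order of magnitude is the point.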
That’s a Lot of Data
So what does this mean for data consumers, the companies reevaluating their traditional business models to take advantage of vast amounts of Twitter data? At Gnip we’ve learned that many of the industry’s collective data processing tools simply don’t work at this scale: out-of-the-box HTTP servers and configurations aren’t sufficient to move the data, TCP stacks with default configurations can’t deliver this much data, and consumption via typical synchronous GET request handling isn’t applicable. So we’ve built our own proprietary data handling mechanisms to capture and process massive amounts of realtime social data for our clients.
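Gnip’s actual delivery mechanisms are proprietary, but the basic shift away from synchronous GET handling is toward long-lived streaming connections, where records arrive as a continuous byte stream and chunk boundaries rarely line up with record boundaries. A minimal sketch of the consumer side, assuming newline-delimited JSON framing (the `iter_activities` helper and the framing are illustrative assumptions, not Gnip’s implementation):

```python
import json
from typing import Iterable, Iterator

def iter_activities(chunks: Iterable[bytes]) -> Iterator[dict]:
    """Reassemble newline-delimited JSON records from arbitrary byte
    chunks, as they would arrive on a long-lived HTTP stream.  A
    network chunk may end mid-record, so bytes are buffered until a
    complete line is available."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():          # skip keep-alive blank lines
                yield json.loads(line)

# Simulated stream: two records split across three network chunks.
chunks = [
    b'{"id": 1, "body": "first tw',
    b'eet"}\n{"id": 2, ',
    b'"body": "second"}\n',
]
records = list(iter_activities(chunks))
print(len(records))  # prints: 2
```

The design point is that the consumer never issues a request per record; it holds one connection open and does its own framing, which is what makes multi-megabyte-per-second sustained delivery practical.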
Twitter is just one example. We’re seeing both more activity on today’s popular social media platforms and a simultaneous increase in the number of such platforms. We’re dedicated to seamless social data delivery to our enterprise customer base, and we’re looking forward to the next data processing challenge.