Copyright © 2010 Gnip, inc.
Gnip makes it easy to build social media tracking tools.
Check out our job posting here. If you think you fit the bill, let us know; we want to talk to you.
No recruiter or third-party inquiries, please.
I had to go back into older blog posts to remind myself when we launched: July 1st. It feels like we’ve been live since June 1st.
Things have gone incredibly well from an infrastructure standpoint. We’ve had to add/adjust some system monitoring parameters to accommodate the variety of Data Producers publishing into the system; different frequencies/volumes call for specialized treatment. We weren’t expecting the rate, or volume, of Collection creation we wound up with. Within three hours of going live, we had enough Collections in the system to adversely impact node startup/sync times. We patiently tuned our data model, and tuned TerraCotta locks to get things back to normal. It’s looking like we’ll be in bed with TerraCotta for the long haul.
I’m not sure I could be any more pleased with AWS. Our core service is heavily dependent on EC2, and that’s been running sans issues. We’re working on non-Amazon failover solutions that ensure uninterrupted service even if all of EC2 dies. Our backups are S3 dependent, so we had some behind-the-scenes issues last weekend when S3 was flaky; see my previous post on this issue. We haven’t had our day in the sun with outages, and I obviously hope we never do, but so far I’m walking around with a big “I <3 AWS” t-shirt on.
On the convenience library front, we (Gnip + community) have made all of our code available on GitHub. We’ve had tremendous community support and contribution on this front; so cool to see; thanks everyone!
Collections are by far the primary data access pattern (as opposed to raw public activity stream polling); not really a surprise.
Summize/Twitter has been a totally cool way to track ether discussion around Gnip. When we notice folks talking about Gnip, positive or negative, we can reach out in “real-time” and strike up a conversation.
That’s all for now.
Thanks to all the Data Producers and Consumers that have integrated with Gnip thus far!
Just got word that the very cool London-based service Retaggr has started working with Gnip to reduce the number of calls they make to data providers. Here’s what co-founder Nicholas Smit has to say:
Retaggr aggregates your online identity into an interactive embeddable business card. Contained within it is actual content from services like flickr, twitter and so on. We’re using Gnip to receive notifications about when our users publish data on these services, so we don’t have to poll them unnecessarily. This is great for efficiency, alleviates problems with API quotas, and helps us provide a consistent level of service to our users.
Welcome Nicholas and the Retaggr team!
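The notification-driven approach Retaggr describes can be sketched roughly as follows. This is a minimal illustration, not Retaggr's or Gnip's actual code: `ProfileCache`, `fetch`, `poll_all`, and `on_notification` are all hypothetical names, and the notification is assumed to carry just a service name and a username.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

Key = Tuple[str, str]  # (service, username), e.g. ("twitter", "alice")


class ProfileCache:
    """Caches per-user content and counts calls made to upstream service APIs."""

    def __init__(self, fetch: Callable[[str, str], str], tracked: Iterable[Key]) -> None:
        self.fetch = fetch                 # hypothetical call to a service's API
        self.tracked: Set[Key] = set(tracked)
        self.content: Dict[Key, str] = {}
        self.api_calls = 0                 # what counts against an API quota

    def refresh(self, service: str, user: str) -> None:
        """Spend one API call to update one user's content."""
        self.api_calls += 1
        self.content[(service, user)] = self.fetch(service, user)

    def poll_all(self) -> None:
        """Polling: one API call per tracked user, whether or not anything changed."""
        for service, user in sorted(self.tracked):
            self.refresh(service, user)

    def on_notification(self, service: str, user: str) -> None:
        """Push: refresh only the user named in the notification, if we track them."""
        if (service, user) in self.tracked:
            self.refresh(service, user)
```

With two tracked users, a single notification costs one API call, whereas each polling pass costs two regardless of whether anyone published anything. That difference is where the quota relief comes from.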
Once you start pushing notifications to Gnip, you should reach out to info@gnipcentral.com and let us know. In addition to adding your company to the official publisher list, we’ll blog about it here, add your logo to the home page and include you in a weekly announcement to the developers list.
In short, make us aware that you have integrated with us and we’ll ask everyone to start integrating with you.
Amazon’s S3 outage over the weekend did not affect Gnip’s live service. Gnip uses S3 for system state archival/backup purposes, but the live data flow through Gnip was not affected as we keep it in local instances (memory/local disk). We weren’t able to back up data while S3 was down, but its outage was intermittent, so during online windows, we did our backups. Eventually the S3 outage was “over” and balance between local storage and S3/remote storage was restored. At some magical point, if S3 simply wasn’t coming back online, we’d move our backups to another service.
Building scalable, redundant, highly available systems is the next big game. It actually has been for decades, but now a larger web application audience is becoming acutely aware of its importance, and subsequently, how to accomplish it. At the end of the day, everything fails. The game becomes isolating the weak points, buttressing the critical points of your service to ensure “instant” recovery from all the failures you can anticipate, and minimizing complete system setup/restart time in case everything craters and you have to scramble to come back online.
I hope Gnip never has its day in the searing outage sun, but we’re not naive.
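As a rough sketch of that backup behavior (hold snapshots locally, upload during online windows, and switch services after some threshold), something like the following would do. It is an illustration under assumptions, not Gnip's implementation: `BackupQueue`, `primary`, `fallback`, and `max_failures` are hypothetical names, and an unreachable store is assumed to raise `OSError`.

```python
from collections import deque
from typing import Callable, Deque


class BackupQueue:
    """Holds snapshots locally and uploads them whenever the remote store is reachable."""

    def __init__(self, primary: Callable[[bytes], None],
                 fallback: Callable[[bytes], None],
                 max_failures: int = 3) -> None:
        self.primary = primary            # hypothetical uploader, e.g. a put to S3
        self.fallback = fallback          # hypothetical second backup service
        self.max_failures = max_failures  # the "magical point" for abandoning primary
        self.pending: Deque[bytes] = deque()
        self.failures = 0                 # consecutive failed attempts against primary

    def add(self, snapshot: bytes) -> None:
        """Queue a snapshot locally; nothing is uploaded yet."""
        self.pending.append(snapshot)

    def flush(self) -> int:
        """Upload pending snapshots in order; return how many were stored."""
        use_fallback = self.failures >= self.max_failures
        store = self.fallback if use_fallback else self.primary
        stored = 0
        while self.pending:
            try:
                store(self.pending[0])
            except OSError:
                # Store is unreachable: keep the snapshot and retry on the next flush.
                if not use_fallback:
                    self.failures += 1
                break
            self.pending.popleft()
            if not use_fallback:
                self.failures = 0
            stored += 1
        return stored
```

Calling `flush()` on a timer gives the intermittent-outage behavior described above: failed attempts leave the queue untouched, and the first successful window drains it. Once `max_failures` consecutive attempts have failed, later flushes go to the fallback service instead.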
Brush your teeth before bed, eat right, exercise, and eliminate your Single Points of Failure.
Yes.
They’re pushing to us. Now we’re pushing to you. Twitter notifications can be found here.
Coverage:
Jay Ridgeway (switchabit, bit.ly) and I go back a ways. A handful of us spent a couple years of our lives transitioning AOL away from its “old way” to the “new way” of doing things (proprietary content hosting/infrastructure to modern web stuff). It was fun, and hard. Anyway… we were recently speculating as to why our stuff (gnip, bit.ly, switchabit) is getting so much traction (obviously we’re biased). Without going into some thesis-like blog post, the following paraphrased exchange took place:
Jud: the system is a mess right now
Jay: "we r making band aids and aspirin"
Truer words were never spoken. We’re in the midst of the larger system evolving through its “API phase.” APIs have sprouted up like weeds, and now folks are waking up to their lawns dying. Everyone has spent the last ~8 months talking about how to fix things, and our products are the first crack at tangible tools that are getting traction. To be fair, some big players have tossed “specs”, “frameworks”, and even more APIs into the mix in order to effect some change; helpful, but far from enough. We’re incrementally injecting framework fundamentals into the broader system; join us. Age-old concepts applied to a growing ecosystem in need.
“Do, or do not. There is no ‘try.’” – Yoda