Archive for August, 2008

Incremental Collection Updates

August 22nd, 2008
    Posted by Jud Valeski, Co-Founder and CEO in Product

The Gnip API supports incremental collection updates. We’ve supported this for a while, but we didn’t do a good job of communicating it when it came out. Several folks are taking advantage of it, but over the past few days it’s become clear that not everyone knows the functionality exists. Please see “Collection Updates” in the API doc for details.

Delicious 2 is Yummy.

August 8th, 2008

Gnip now has Delicious v2 data flowing through it. The Delicious bookmarking data flowing through the system now includes bookmarking and tagging done via Delicious plugins and API tools (e.g. toolbar buttons). The result is a nice, clean, pure stream of data from Delicious. Enjoy!

Garbage In, Garbage Out

August 2nd, 2008

Gnip is an intermediary service for message flow across disparate network endpoints. Standing in the middle allows for a variety of value adds (Data Producers can “publish once, distribute to many,” Data Consumers can enjoy single service interaction rather than one-off’ing over and over again), but the quality of data that Data Producers push into the system is fundamental.

Only As Good As The Sum Of Our Parts

Gnip doesn’t control the quality of the data being published to it. Whether that data arrives as XMPP messages, RSS, or ATOM, many issues can affect what a Data Consumer ultimately receives.

  • Bad transport/delivery – The source XMPP, RSS, ATOM, or REST feed can go down. When this happens for a given Publisher, that source has vanished and Gnip receives no messages for that Publisher. We’re only as good as the data coming in. While Gnip can consume data from XMPP, RSS, ATOM, and other sources, our preferred inbound message delivery method is our REST API. Firing off messages to Gnip directly, rather than through yet another layer, minimizes delivery issues.
  • Bad data – As any aggregator (FriendFeed, Socialthing, Movable Type Activity Streams…) can attest, the data coming across XMPP, RSS, and ATOM feeds today is a mess. From bad/illegal formatting to bad/illegal data escaping, nearly every activity feed has unique issues that have to be handled on a case-by-case basis. There will be bugs. We will fix them as they arise. Once again, these issues can be minimized if Data Producers deliver messages directly to Gnip via our REST API.
  • Bad policy – This one’s interesting. Gnip makes certain assumptions about the kind of data it receives. In our current implementation we advertise to Data Consumers that Data Producers push all public, per-user change notifications generated within their systems to Gnip. This usually corresponds to the existing public API policies of those Data Producers. We will eventually offer finely tuned, Data Producer-controlled data policies, but for today’s public-facing Gnip service we do not want to see Data Producers creating publishing policies specific to Gnip. Doing so confuses the middleware dynamic we’re trying to create with our current product, and subsequently muddies the water for everyone. Imagine a Data Consumer interacting with a Data Producer directly under one policy, then interacting with Gnip under another; that would be confusing. Again, we will, perhaps earlier than we think, cater to unique data policies on a per-Data-Producer basis, but we’re not there yet.
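
To make the “bad data” point above concrete, here is a minimal Python sketch of the kind of escaping problem an aggregator runs into. It is not Gnip code and does not use the Gnip API; the helper names (build_entry, parse_entry_titles) and the sample feeds are hypothetical, and only the Python standard library is assumed. It shows a producer escaping text before embedding it in an ATOM entry, and a consumer detecting a feed that was not escaped.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Standard Atom namespace, in ElementTree's "{uri}" notation.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def build_entry(title):
    """Hypothetical producer helper: escape the text before embedding it."""
    return (
        '<entry xmlns="http://www.w3.org/2005/Atom">'
        "<title>%s</title></entry>" % escape(title)
    )


def parse_entry_titles(feed_xml):
    """Hypothetical consumer helper.

    Returns (titles, error). error is None when the feed is well-formed XML,
    otherwise it holds the parser's message and titles is empty.
    """
    try:
        root = ET.fromstring(feed_xml)
    except ET.ParseError as exc:
        return [], str(exc)
    titles = [
        entry.findtext(ATOM_NS + "title", default="")
        for entry in root.iter(ATOM_NS + "entry")
    ]
    return titles, None


# A feed whose entry text was escaped correctly by the producer.
good = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    + build_entry("Tom & Jerry <3")
    + "</feed>"
)

# A feed with a raw ampersand, which is not well-formed XML.
bad = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<entry><title>Tom & Jerry</title></entry></feed>"
)

print(parse_entry_titles(good))
print(parse_entry_titles(bad))
```

The first feed parses and yields its title intact; the second is rejected by the parser, so a consumer has to decide whether to drop it, repair it, or report it back to the producer.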

While addressing all of these issues is part of our vision, they’re not all resolved out of the gate.
