25 Million Free Tweets on Power Track

Last week we announced Twitter firehose filtering. This week we’re celebrating the news with Free Tweets for all. Sign up by February 28th to enjoy no licensing fees on your first 25 million Tweets in your first 60 days using Power Track. 

Power Track offers powerful filtering of the Twitter firehose, guaranteeing 100% Tweet delivery. For instance, filter by keyword or username to access all Tweets that match the criteria you care about, and have all of the matching results delivered to you in real time via API. Power Track supports Boolean operators, can match your filtering criteria even within expanded URLs, and has no query volume or traffic limitations, helping you access all of the data you want. And it’s only available from Gnip, currently the only authorized distributor of Twitter data via API.

The licensing fee for Power Track is $0.10 per 1,000 Tweets, but we’re waiving that fee for the first 25 million Tweets in the first 60 days for Power Track customers who sign up by February 28th. A 1-year agreement and the Gnip data collector fee are still required.
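To put the offer in numbers, here is a minimal sketch of the arithmetic. The function name and the way the free allowance is subtracted from usage are assumptions made for illustration, not Gnip's billing logic, and the data collector fee is left out because no amount is given for it.

```python
from decimal import Decimal

# Licensing fee quoted in the post: $0.10 per 1,000 Tweets.
FEE_PER_1000_TWEETS = Decimal("0.10")
# Size of the promotional allowance quoted in the post.
FREE_TWEETS = 25_000_000


def licensing_fee(tweets: int, free_tweets: int = 0) -> Decimal:
    """Return the licensing fee in dollars (hypothetical helper).

    Assumes the free allowance is simply subtracted from the Tweet count
    and that the fee scales linearly with the remainder.
    """
    billable = max(tweets - free_tweets, 0)
    return FEE_PER_1000_TWEETS * billable / 1000


if __name__ == "__main__":
    # Value of the waived fee on the full 25 million Tweets.
    print(licensing_fee(FREE_TWEETS))
    # Fee for 30 million Tweets when the first 25 million are free.
    print(licensing_fee(30_000_000, free_tweets=FREE_TWEETS))
```

Under these assumptions the waived licensing fee on the full 25 million Tweets comes to $2,500.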

Learn More or Contact Us to start testing Power Track for firehose filtering. Cheers!

Gnip to power 301works.org

Every once in a while there are opportunities to make a real difference in the industry. 301works.org is just such an opportunity for Gnip and the companies we are teaming up with to launch a much-needed independent URL mapping directory service.

First, many thanks to Adjix, awe.sm, betaworks, bit.ly, Cligs, URLizer, and urlShort, who have joined with us to launch this new organization. You can also read the actual 301works announcement that is posted on Gnip.com.

Why is Gnip involved? We are part of the Internet software community, and most of us are also active social media users. While there is debate in some parts of the industry on the pros and cons of short URLs, it is obvious that the adoption of short URL formats has grown hugely across the web over the last few years, and custom short URLs are increasingly being used by businesses and individuals.

People generate short URLs every day, and they need to know that these mappings will continue to function as they were intended when generated, that their mappings will be available for them to use in the future, and that their privacy preferences will be respected. With short URL formats having reached general acceptance by millions of users in their daily activities, the industry needed to ensure that the connections provided by these mappings persist over time.

In providing the technology to power the 301works solution, Gnip is ensuring that the social connections and data represented by the millions of URL mappings created every day continue to be available across the web.

We are thrilled to be able to participate and to do our part in helping sustain and grow an open web.

Controlling Data Through URL Shorteners

I’m going to sidestep the “URL shorteners are bad because they obfuscate” discussion in this post. If you’re reading this, you likely have an opinion one way or another on that topic, but let’s leave that at the door. A bigger challenge is emerging as URL shortening continues to proliferate.

A web browser unwinding a shortened URL when a user clicks on it is one thing, but when system software tries to unwind/resolve shortened URLs en masse, a problem emerges. The database that binds the short URL to its long version is hidden behind an API that can’t handle, or won’t allow (and I’m pointing at all of you URL shorteners out there), bulk unwinding of shortened URLs. The result is a bottleneck (the URL shortening services) that prevents “real-time” indexing of otherwise publicly available content. “Classic” offline crawl-based search engines (e.g. Google, Y!) will likely unwind in a latent “offline” manner, based on relevance. However, real-time search facilities are faced with trying to unwind large numbers of shortened URLs on the fly, and there doesn’t appear to be a way to accomplish this as the volume and rate of shortened URLs in daily social activity keep increasing.
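To make “unwinding” concrete, here is a minimal sketch of what resolving one shortened URL involves. It makes no network calls: the dictionary is a hypothetical stand-in for a shortener’s private database, and the URLs, `lookup`, and `unwind` are all names I made up for illustration. In practice each lookup is a separate request to the shortening service, which is exactly the per-URL cost that turns into a bottleneck at volume.

```python
from typing import Callable, Optional

# Hypothetical stand-in for the short -> long mappings a shortener keeps
# behind its API. Real data would only be reachable one request at a time.
SHORTENER_DB = {
    "http://short.example/a1": "http://vanity.example/promo",
    "http://vanity.example/promo": "http://www.example.com/articles/promo.html",
}


def lookup(url: str) -> Optional[str]:
    """Return the redirect target for url, or None if it is not a redirect."""
    return SHORTENER_DB.get(url)


def unwind(url: str,
           lookup: Callable[[str], Optional[str]],
           max_lookups: int = 5) -> str:
    """Follow redirects one hop at a time until a non-redirect URL is reached."""
    seen = {url}
    for _ in range(max_lookups):
        target = lookup(url)
        if target is None:
            # No further mapping: this is the final, long URL.
            return url
        if target in seen:
            raise ValueError("redirect loop detected at " + target)
        seen.add(target)
        url = target
    raise ValueError("gave up after %d lookups" % max_lookups)


if __name__ == "__main__":
    print(unwind("http://short.example/a1", lookup))
```

Note that a single short URL can cost more than one lookup when a vanity host redirects to another redirect, so the request count grows faster than the URL count.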

If your business relies on unwinding large volumes of shortened URLs in real-time, you’re faced with the usual optimization suspects: caching and relevance/prioritization-based resolution. These will improve your ability to “keep up”, but they are a function of cache hit ratios (which are generally poor in the social space when it comes to URL unwinding) and your own ability to decide what to unwind in an ever-increasing volume of shortened URLs.
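Here is a rough sketch of those two optimizations working together, assuming an LRU cache in front of the shortener and a priority queue that decides which URLs get the limited number of upstream calls. The class name, the `resolve` callable, and the idea of a per-drain call budget are my own assumptions for illustration, not how any particular service works.

```python
import heapq
from collections import OrderedDict
from typing import Callable, List, Tuple


class CachingUnwinder:
    """Hypothetical unwinder: LRU cache plus priority-ordered resolution."""

    def __init__(self, resolve: Callable[[str], str], cache_size: int = 10000):
        self.resolve = resolve          # one upstream call per invocation
        self.cache = OrderedDict()      # short URL -> long URL, oldest first
        self.cache_size = cache_size
        self.pending = []               # heap of (-priority, seq, short URL)
        self.seq = 0                    # tie-breaker that keeps FIFO order
        self.hits = 0
        self.misses = 0

    def submit(self, short_url: str, priority: int) -> None:
        """Queue a short URL; higher priority values are unwound first."""
        heapq.heappush(self.pending, (-priority, self.seq, short_url))
        self.seq += 1

    def drain(self, budget: int) -> List[Tuple[str, str]]:
        """Unwind queued URLs using at most `budget` upstream calls."""
        results = []
        while self.pending:
            short_url = self.pending[0][2]
            if short_url in self.cache:
                # Cache hit: costs nothing against the budget.
                heapq.heappop(self.pending)
                self.cache.move_to_end(short_url)
                self.hits += 1
                results.append((short_url, self.cache[short_url]))
                continue
            if budget <= 0:
                break                   # remaining URLs wait for the next drain
            heapq.heappop(self.pending)
            budget -= 1
            self.misses += 1
            long_url = self.resolve(short_url)
            self.cache[short_url] = long_url
            if len(self.cache) > self.cache_size:
                self.cache.popitem(last=False)  # evict least recently used
            results.append((short_url, long_url))
        return results

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The sketch also shows why this only goes so far: when most incoming short URLs are new, nearly every one is a miss, the budget is spent almost immediately, and everything below the priority cutoff simply waits.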

The result is another case of data control. If URL shortener and vanity host/URL adoption continues, and all URLs turn into redirects, we become completely dependent on services that appear to be unwilling to open up their databases. I would appreciate it if part of this emerging standard included the ability to unwind in bulk.