At Gnip, we’ve long believed that the “geo” part of social data is hugely valuable. Geodata adds real-world context to help make sense of online conversations. We’re always looking for ways to make it easier to leverage the geo component of social data, and the geo features in our Search API provide powerful tools to do so. Gnip’s Search API now includes full support for all PowerTrack geo operators, which means analysts, marketers and academics can now get instantaneous answers to their questions about Twitter that require location.
In this post, we’ll walk through exactly what these Search API features look like, and then look at an example from the Oscars where they add real value. We will find the people actually tweeting from inside Hollywood’s Dolby Theatre and take a closer look at those Tweets.
What It Does
Gnip’s Search API for Twitter provides the fastest, easiest way to get up and running consuming full-fidelity Twitter data. Users can enter any query using our standard PowerTrack operators and get results instantaneously through a simple request/response API (no streaming required).
Using the Search API, you can run queries that make use of latitude/longitude data, specifically with our bounding_box and point_radius rules. A bounding_box rule limits results to content inside a box whose sides are at most 25 miles long, and a point_radius rule limits results to content inside a circle whose radius is at most 25 miles. These rules can be particularly helpful for cutting down on noise and identifying the most relevant content for your query.
These rules can be used independently to filter both types of Twitter geodata available from Gnip:
- Twitter’s “geo” value: The native geodata provided by Twitter, based on metadata from mobile devices. This is present for about 2% of Tweets.
- Gnip’s “Profile Geo” value: Geocoded locations from users’ profiles, provided by Gnip. This is present for about 30% of Tweets.
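As a rough illustration of the rule syntax, here is a minimal Python sketch that builds the two rule strings. The helper names are hypothetical and not part of any Gnip library. The point_radius format follows the query shown later in this post (longitude, then latitude, then radius); the "mi" unit suffix and the bounding_box coordinate order (west longitude, south latitude, east longitude, north latitude) are assumptions to check against the support docs.

```python
KM_PER_MILE = 1.609344
MAX_RADIUS_MILES = 25.0  # radius limit stated in this post


def point_radius_rule(lon, lat, radius, unit="km"):
    """Build a point_radius rule string: longitude first, then latitude."""
    if unit not in ("km", "mi"):  # "mi" is an assumed unit suffix
        raise ValueError("unit must be 'km' or 'mi'")
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("longitude/latitude out of range")
    radius_miles = radius if unit == "mi" else radius / KM_PER_MILE
    if not 0 < radius_miles <= MAX_RADIUS_MILES:
        raise ValueError("radius must be positive and at most 25 miles")
    return f"point_radius:[{lon} {lat} {radius}{unit}]"


def bounding_box_rule(west_lon, south_lat, east_lon, north_lat):
    """Build a bounding_box rule string (coordinate order is an assumption).

    The 25-mile limit on box sides is not checked here.
    """
    if west_lon >= east_lon or south_lat >= north_lat:
        raise ValueError("box corners are not ordered west < east, south < north")
    return f"bounding_box:[{west_lon} {south_lat} {east_lon} {north_lat}]"
```

Calling point_radius_rule(-118.3409742, 34.1021528, 0.1) reproduces the Dolby Theatre query used in the Oscars example.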
An Oscars Example: What apps do the glitterati use to Tweet?
There were 17.1 million Oscar-related Tweets this year. Ellen DeGeneres’ selfie Tweet at the Oscars became the most retweeted Tweet ever. If you run a search for “arm was longer” during the Oscars, the result set shows over 1.5 million retweets in the first hour following @theellenshow’s Tweet.
The Tweet created a bit of a stir since it was sponsored by Samsung and Ellen used her Galaxy S5 to tweet it, but then used an iPhone backstage shortly afterward. We’ve seen plenty of interest in the past in the iPhone vs. Android debate, so we thought we’d use the opportunity to check what the real “score” was inside the building. What applications were other celebrities using to post to Twitter?
Using the PowerTrack point_radius operator, we created a very narrow search covering a 100-meter radius from the center of the Dolby Theatre (roughly the size of the building) for the time period from 5:30-11:30 PM PST on Sunday night. It’s a single-clause query, with the longitude first, then the latitude, then the radius:
point_radius:[-118.3409742 34.1021528 0.1km]
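To make the geometry of that query concrete, here is a small Python sketch of a point-in-radius test using the haversine great-circle distance. It only illustrates what a 0.1km radius around this point means; how the Search API evaluates the rule on the server side is not described in this post, so the function names and the choice of haversine are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

# Center of the query above: longitude first, then latitude.
DOLBY_LON, DOLBY_LAT = -118.3409742, 34.1021528


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(lon, lat, center_lon=DOLBY_LON, center_lat=DOLBY_LAT, radius_km=0.1):
    """True if the point lies inside the circle described by the rule."""
    return haversine_km(lon, lat, center_lon, center_lat) <= radius_km
```

A point about 55 meters north of the center (latitude + 0.0005 degrees) falls inside the circle, while one about 220 meters north (latitude + 0.002 degrees) falls outside.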
The result set that came back included about 250 Tweets from 98 unique users during that narrow time period, and you can see from the spike during the “selfie” moment that it’s a much more refined data set:
Using this geo search parameter, you can quickly identify the people actually at the Oscars during that major Twitter moment.
Among them, we saw the following application usage:
With all the chatter about the mobile device battles, it’s interesting to see so many Oscar attendees actually posting to Twitter from Instagram versus Twitter mobile clients. Foursquare also made a solid showing in terms of the attendees’ preferred Twitter posting mechanism.
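A tally like this one can be sketched in a few lines of Python. The sketch assumes each Tweet has been reduced to a dict with hypothetical "user" and "source" keys, where "source" holds the posting application as an HTML anchor string (the form Twitter's native Tweet format uses). The sample records are made up for illustration and are not the Oscars results.

```python
import re
from collections import Counter

_TAG = re.compile(r"<[^>]+>")  # matches HTML tags such as <a ...> and </a>


def app_name(source):
    """Strip the HTML anchor around a source value, leaving the app name."""
    return _TAG.sub("", source).strip()


def tally_apps(tweets):
    """Return (Counter of app names, number of unique users)."""
    apps = Counter(app_name(t["source"]) for t in tweets)
    unique_users = len({t["user"] for t in tweets})
    return apps, unique_users


# Made-up sample records, for illustration only.
SAMPLE_TWEETS = [
    {"user": "attendee_a", "source": '<a href="http://instagram.com" rel="nofollow">Instagram</a>'},
    {"user": "attendee_a", "source": '<a href="http://instagram.com" rel="nofollow">Instagram</a>'},
    {"user": "attendee_b", "source": '<a href="http://foursquare.com" rel="nofollow">foursquare</a>'},
]

if __name__ == "__main__":
    apps, users = tally_apps(SAMPLE_TWEETS)
    for name, count in apps.most_common():
        print(name, count)
    print("unique users:", users)
```

Counting by stripped app name rather than by raw source string keeps the tally readable and groups Tweets from the same client even if the anchor markup varies slightly.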
This is just one example of how Gnip’s geo PowerTrack operators in the Search API can support more powerful social media analysis. For a more detailed overview of the geo-related PowerTrack queries you can create, check out our support docs.
P.S. – More on Geo & Twitter Data at #SXSW
If you get excited about social data + geo and will be in Austin this weekend, come check out our session “Beyond Dots on a Map: Visualizing 3 Billion Tweets” on Sunday at 1 pm. I’ll be on stage with our good friend Eric Gundersen from Mapbox talking about the work we did together to visualize a huge set of Twitter data from around the world.