Profile Geo: When You Need More Geodata In Your Twitter Data

Sometimes in the world of social data it is hard to grasp the amazing possibilities when we use words to describe things. The old adage that a picture is worth a thousand words is true, so we wanted to show you what our new Profile Geo enrichment does.

First, here is what Profile Geo is:
Gnip’s Profile Geo enrichment significantly increases the amount of usable geodata for Twitter. It normalizes the unstructured location text in Twitter users’ profiles and attaches latitude/longitude coordinates for the normalized places. For example, profiles that say “NYC,” “New York City,” “Manhattan,” and even odd variants like “NYC Baby✌” are all normalized to “New York City, New York, United States” so they’re easy to map.

Now, here is what Profile Geo does in practice for users interested in Twitter geodata:
Football Geo

We think this is really powerful stuff. These maps were created from two sets of Tweets containing the term “football,” collected over three Sundays. The Standard Geo map consists of Tweets that users geotagged with their latitude and longitude (delivered natively in the Twitter payload). The Profile Geo map consists of Tweets that Gnip was able to enrich with a latitude and longitude based on the user’s profile location.

As you can see, the amount of location data available through Profile Geo is significantly higher than through Standard Geo. To be specific, we ran our “football” search against the Decahose, a random 10% sample of the full Twitter firehose. Standard Geo returned just under 3,000 Tweets, while Profile Geo returned more than 40,000 Tweets! (Multiply those by 10 to approximate firehose volumes.) This additional geodata opens up many possibilities: the NFL can better understand the demographics of its audience, football clubs in the UK can see how far their reach extends, and TV networks can use this data to tailor media, among many other uses.
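To make the sampling arithmetic concrete, here is a minimal sketch in Python. The counts are the rounded figures quoted above ("just under 3,000" and "more than 40,000"), so the results are approximations only, and the function name is ours for illustration rather than part of any Gnip product.

```python
# The Decahose is a random 10% sample of the full firehose.
DECAHOSE_SAMPLE_RATE = 0.10


def estimate_firehose(sample_count: int, sample_rate: float = DECAHOSE_SAMPLE_RATE) -> int:
    """Scale a count observed in a random sample up to an estimated full-stream count."""
    return round(sample_count / sample_rate)


# Rounded figures from the "football" search (assumed values, not exact counts).
standard_geo_sample = 3_000
profile_geo_sample = 40_000

standard_geo_firehose = estimate_firehose(standard_geo_sample)
profile_geo_firehose = estimate_firehose(profile_geo_sample)

# The ratio is the same before and after scaling, since both counts
# are multiplied by the same factor.
ratio = profile_geo_sample / standard_geo_sample

print(f"Estimated firehose Tweets with Standard Geo: {standard_geo_firehose:,}")
print(f"Estimated firehose Tweets with Profile Geo:  {profile_geo_firehose:,}")
print(f"Profile Geo / Standard Geo: about {ratio:.1f}x")
```

With these rounded inputs, the "football" sample works out to roughly 13 times more mappable Tweets with Profile Geo than with Standard Geo.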

If you were to remove the search for “football” and use the entire firehose of Twitter data, you’d find that you receive roughly 15 times as much geo-relevant data with Gnip’s Profile Geo enrichment as with the geodata in the standard stream alone. Anyone using geodata in their social data analyses should find great value in this dramatic increase in geo-relevant data.

If images are better than words, then interactive maps are better than images. Here are the maps so you can play around and see the difference yourself. Zoom in to see in detail how much more data is available with Profile Geo:

Get More Twitter Geodata From Gnip With Our New Profile Geo Enrichment

Twitter Map - Giant Fans in the US Tweeting from the Stadium

When it comes to analyzing social data, “where” matters. After the topics of conversations, perhaps the strongest connection between social conversations online and the offline world is location. Location is an implicit part of what we do, who we know, what we need, etc. For years now at Gnip, the most requested feature for our existing data products has been “more geodata” to help our customers understand the offline locations that are relevant to online conversations. Today we’re pleased to announce a major step toward meeting that demand: the public beta launch of our new Profile Geo enrichment.

The Profile Geo enrichment is simple. Location data is provided publicly by millions of users in their profiles on social networks, but it’s rarely delivered in a normalized format with consistent latitude/longitude coordinates that are necessary for software to ingest the data and make use of it. The Profile Geo enrichment from Gnip normalizes this data to common geographies (for instance, “NYC,” “Manhattan,” etc. all map to “New York City, NY, US”) and provides latitude/longitude coordinates for those places so it’s easy to plot social data on a map.
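The normalization idea can be sketched in a few lines of Python. This is a toy illustration only, not Gnip's implementation: the alias table, the function name, and the New York City center-point coordinates are all assumptions made for the example, and a real enrichment would need a full gazetteer and disambiguation logic.

```python
import re
from typing import Optional, Tuple

# Hypothetical lookup table: lowercase alias -> (normalized name, latitude, longitude).
# The coordinates are an approximate center point for New York City.
NYC = ("New York City, NY, US", 40.7128, -74.0060)
ALIASES = {
    "nyc": NYC,
    "new york city": NYC,
    "manhattan": NYC,
}


def normalize_location(raw: str) -> Optional[Tuple[str, float, float]]:
    """Map a free-text profile location to a normalized place, or None if unknown."""
    # Lowercase, then replace everything except a-z and spaces (emoji, punctuation).
    cleaned = re.sub(r"[^a-z ]", " ", raw.lower())
    # Collapse runs of whitespace left behind by the replacement.
    cleaned = " ".join(cleaned.split())
    # Check longer aliases first so the most specific phrase is matched.
    for alias in sorted(ALIASES, key=len, reverse=True):
        if re.search(rf"\b{re.escape(alias)}\b", cleaned):
            return ALIASES[alias]
    return None


for text in ["NYC", "Manhattan", "NYC Baby✌", "somewhere over the rainbow"]:
    print(repr(text), "->", normalize_location(text))
```

The point of returning coordinates together with the normalized name is that downstream software can plot the result directly, without a second geocoding step.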

Our customers are hungry to analyze Twitter through a geographic lens. For a brand, it’s great to know that people are talking about your products online, but few things make those conversations more actionable than knowing where they are taking place. Do we need to change our marketing campaign in a region? Focus on improving customer service? For government or civil society organizations responding to crises, location is the key to identifying need in an actionable way and then deploying resources effectively. It may be obvious we need clean water and blankets, but where is the most important place to send them?

For this new enrichment, we started with Twitter because it offers the biggest initial gain for our customers. While less than 2% of Tweets in the Twitter Firehose are “geotagged” with latitude/longitude coordinates, more than half of all Tweets contain a profile location value from a user. And while just 1% of users generate approximately two-thirds of all geotagged Tweets (according to this helpful paper from our friend Kalev Leetaru and his colleagues), profile location data is much more evenly distributed. In that way, looking at profile location data “democratizes” the data that appear when mapping Twitter content: our customers can now hear from the whole world of Twitter users and not just this 1%.

This new premium enrichment from Gnip provides several key benefits for social data analysis. First, it increases the amount of usable Twitter geodata available for analysis by more than 15x. Second, it adds a kind of Twitter geodata that is different from what may be natively available from social sources. To understand this benefit, it’s important to think about the three different types of location that exist in social media.

  • Activity Location: Where the activity (Tweet, Check-in, etc.) directly came from, via GPS signal on a user’s device or association with a known venue location. This is the kind of location that provides latitude/longitude natively in Twitter’s or Foursquare’s firehoses.

  • Profile Location: The place the user provides as their location in their profile. They may or may not be there when posting to a social network.

  • Mentioned Locations: Places the user talks about in a post or check-in. These places may not have anything to do with where the person lives or where the person is when posting, e.g. “I can’t wait for Gnip to open its new office in the Maldives.” (The Maldives in this case might as well be a fictitious place considering the likelihood that will happen.)

Profile location data can be used to unlock demographic data and other information that is not otherwise possible with activity location. For instance, US Census Bureau statistics are aggregated at the locality level and can provide basic stats like household income. Profile location is also a strong indicator of activity location when one isn’t provided.
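One way to keep the three location types distinct in code is to store them separately and only fall back from activity location to profile location when no geotag exists. The sketch below is an assumption about how a consumer of the data might model this; the class and field names are hypothetical and do not reflect Gnip's payload format.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude)


@dataclass
class PostLocations:
    """The three kinds of location that can be attached to one social post."""
    activity: Optional[LatLon] = None  # where the post was made (native geotag)
    profile: Optional[LatLon] = None   # where the user says they are from
    mentioned: List[str] = field(default_factory=list)  # places named in the text

    def best_coordinates(self) -> Optional[LatLon]:
        """Prefer the activity location; fall back to the profile location."""
        return self.activity if self.activity is not None else self.profile


# A geotagged Tweet keeps its activity location.
geotagged = PostLocations(activity=(37.7786, -122.3893), profile=(40.7128, -74.0060))
# A Tweet without a geotag falls back to the profile location.
profile_only = PostLocations(profile=(40.7128, -74.0060), mentioned=["Maldives"])

print(geotagged.best_coordinates())
print(profile_only.best_coordinates())
```

Mentioned locations are deliberately left out of the fallback, since a place named in a post may have nothing to do with where the person lives or is posting from.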

To get a sense of the impact of the Profile Geo enrichment in practice, we worked with the team at MapBox again to create a map of Tweets about the San Francisco Giants over the past few weeks (PS: check out the other maps we made together if you haven’t seen them). During that time period, over two thousand Tweets occurred at AT&T Park that were geotagged with the activity location. With the addition of the Profile Geo enrichment for the same Tweets, it’s now possible to quickly create a map that shows the relationship between activity location (all in the Park), and profile location – where those people came from to watch the game. Next time the Giants franchise wants to think about tourist attendance numbers, they’ll have a new way to do so. Check it out.

SF Giants Tweets from the stadium (center point of the orange lines) link to the profile locations of those users around the globe, showing how far they traveled. Click on the “USA” toggle to see the whole world. Hover over states/countries to see total counts.
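The "how far they traveled" figure for each line on a map like this can be approximated with the haversine great-circle formula. The sketch below is our own illustration rather than the code behind the map, and the coordinates for the stadium and the example profile location are approximate values assumed for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates, assumed for illustration.
STADIUM = (37.7786, -122.3893)   # activity location: the ballpark in San Francisco
PROFILE_NYC = (40.7128, -74.0060)  # profile location: New York City

distance = haversine_km(*STADIUM, *PROFILE_NYC)
print(f"Stadium to profile location: about {distance:,.0f} km")
```

Summing or averaging these distances per state or country would give a rough measure of how far fans from each place traveled.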

The Profile Geo enrichment is now available to all Gnip customers as an option on their Twitter data products in this beta release. We’re looking forward to seeing how this enrichment changes what can be done with location and social data.

If you’re interested in learning more, please visit or hit us up at