Some of the most compelling use cases we’ve seen for analyzing Twitter data involve geolocation. From NGOs looking at geotagged Tweets to help deploy resources after disasters to brands tracking where their fans (or their disgruntled customers) are to drive engagement and marketing strategies, location adds key value to Tweet content.
We’ve been fascinated by these use cases and have wondered what else could be done with this data. A couple of months ago our Data Science team set out to explore these questions and, at the same time, to create some resources that would help others study and make use of geotagged Tweets. We brought in the team at MapBox – including data artist Eric Fischer – to help us dig into the data and visualize what we found in fast, fully navigable maps that would let us and our readers explore this data in depth.
The interactive maps we created together build on other recent analyses and visualizations of Twitter data, including this great post about details of the data and these static maps from Twitter’s Visual Insights team. The results are stunning, and we hope they make the data more practical and accessible as you evaluate what else you could be doing with geolocation in Twitter.
Locals and Tourists (Round 2)
Where do people tweet relative to where they live?
In 2010, Eric Fischer made a static map he called “Locals and Tourists” that showed geolocation for both Tweets and Flickr photos side by side, with the data color-coded to show whether a post was by a “local” (a post at or near the user’s stated home location) or a “tourist” (a post far from the user’s home location). Twitter has matured significantly since then, and we wanted to see what we could learn from looking at just the Twitter data today, with the ability to browse at any local level around the world. We gathered a sample of Twitter data with unique geotagged Tweet locations from the past ~18 months to generate this new interactive map.
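To make the local/tourist split concrete, here is a minimal sketch of how such a classification could work. It is not the method used for the map: the 100 km cutoff, the function names, and the idea of comparing against a single home coordinate are all assumptions made for illustration, since the post does not say how "near" or "far" was defined.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Hypothetical cutoff: the post does not state what distance counts as "near".
LOCAL_RADIUS_KM = 100.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def classify_post(tweet_latlon, home_latlon, local_radius_km=LOCAL_RADIUS_KM):
    """Label a post 'local' if it lies within the cutoff of the home location, else 'tourist'."""
    distance = haversine_km(*tweet_latlon, *home_latlon)
    return "local" if distance <= local_radius_km else "tourist"
```

Under these assumptions, a Tweet sent from Philadelphia by someone whose home is Wilmington, DE would count as local, while the same Tweet from someone based in Paris would count as a tourist post.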
As the dynamic maps took shape, the new version of “Locals and Tourists” impressed us in a couple of ways. The first was simply how much resolution Twitter data provides. Not only are primary and secondary roads clearly visible, but you can also distinguish roads taken by tourists from roads used for local commutes, as in this screenshot of I-95 snaking past Wilmington, DE and Philadelphia, PA in red across the bottom third of the image:
You can also clearly see the outlines of buildings like airports, sports stadiums, and major shopping malls that are frequented by tourists. Dig into your local area and see for yourself.
This map could be a resource for city planners, the travel industry, or for creative marketers thinking about how to localize their mobile advertising for different audiences.
Device Usage Patterns
This map shows usage patterns for the mobile operating systems people use to tweet around the world. Since geotagged Tweets require a Twitter client that includes GPS support, most geotagged Tweets come from handheld devices – and we can see exactly which client was used in the “generator” metadata field provided by Twitter. Among other things, this visualization suggests correlations between mobile OS and income level in the US, and highlights just how widespread BlackBerry use is in Southeast Asia, Indonesia and the Middle East.
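As a rough illustration of reading the “generator” field, the sketch below tallies Tweets by operating system. The payload shape (a `generator` object with a `displayName` string) and the keyword-to-OS table are assumptions for this example, not the mapping behind the visualization.

```python
from collections import Counter

# Illustrative keyword-to-OS table; not the list used to build the map.
OS_KEYWORDS = (
    ("iphone", "iOS"),
    ("ipad", "iOS"),
    ("android", "Android"),
    ("blackberry", "BlackBerry"),
)


def client_os(activity):
    """Guess the mobile OS from the client name in the (assumed) generator.displayName field."""
    name = (activity.get("generator") or {}).get("displayName", "").lower()
    for keyword, os_name in OS_KEYWORDS:
        if keyword in name:
            return os_name
    return "Other"


def os_counts(activities):
    """Count activities per guessed OS."""
    return Counter(client_os(a) for a in activities)
```

Matching on keywords rather than exact client names keeps the table short, at the cost of lumping every unrecognized client into "Other".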
Languages of the World
Using the same data sample, this final visualization plots where people tweeted in various languages, using metadata from the Gnip Language Detection Enrichment and the Chromium Compact Language Detector as a fallback.
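The primary-then-fallback arrangement can be sketched as follows. The field name `gnip_language` and the `fallback_detector` callable are hypothetical stand-ins; this does not reproduce the actual enrichment payload or the Compact Language Detector API.

```python
def resolve_language(tweet, fallback_detector):
    """Prefer the language attached by the primary enrichment; otherwise ask the fallback.

    `tweet` is a dict with a "text" key and an optional, hypothetical
    "gnip_language" key. `fallback_detector` is any callable that takes the
    Tweet text and returns a language code.
    """
    primary = tweet.get("gnip_language")
    if primary:
        return primary
    return fallback_detector(tweet["text"])
```

Passing the fallback in as a callable keeps the sketch independent of any particular detection library.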
For starters, this map makes clear that English is still the dominant language on Twitter around the world – toggling to the English-only view reveals nearly as much resolution in the global map as when all languages are enabled:
What might come as more of a surprise, though, is just how many other languages are in frequent use, and particularly how much they overlap in the United States:
Non-English Tweets across the US; Spanish in green
A Note on the Data
These maps were created from a data set that was heavily culled to remove locations that would create visual noise. The following were removed from the original data set:
- Multiple geotagged Tweets in the exact same location (we made no attempt to communicate density in these visualizations)
- Geotagged Tweets from the same user in very close proximity to other Tweets from the same user
- Geotagged Tweets from known or detectable bots
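The three removals above can be sketched as a single filtering pass. This is an illustration under stated assumptions, not the pipeline used for the maps: the record fields (`user`, `lat`, `lon`), the `bot_users` set, and the 0.001-degree grid used to approximate "very close proximity" are all hypothetical.

```python
def cull_tweets(tweets, bot_users=frozenset(), cell_degrees=0.001):
    """Return the tweets that survive the three culling rules, in input order.

    tweets: iterable of dicts with "user", "lat", and "lon" keys (assumed shape).
    bot_users: set of user names treated as bots (assumed to be known up front).
    cell_degrees: grid size standing in for "very close proximity"; a
        hypothetical value, since the post gives no distance.
    """
    seen_points = set()      # exact (lat, lon) pairs already kept
    seen_user_cells = set()  # (user, grid cell) pairs already kept
    kept = []
    for tweet in tweets:
        # Rule 3: drop Tweets from known bots.
        if tweet["user"] in bot_users:
            continue
        # Rule 1: keep only one Tweet per exact location.
        point = (tweet["lat"], tweet["lon"])
        if point in seen_points:
            continue
        # Rule 2: keep only one Tweet per user per grid cell.
        cell = (
            tweet["user"],
            round(tweet["lat"] / cell_degrees),
            round(tweet["lon"] / cell_degrees),
        )
        if cell in seen_user_cells:
            continue
        seen_points.add(point)
        seen_user_cells.add(cell)
        kept.append(tweet)
    return kept
```

Snapping to a grid is a cheap approximation of a distance check: two nearby Tweets that straddle a cell boundary will both survive, which a true radius test would avoid at higher cost.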
Together these maps point to something powerful – looking at geolocation data from Twitter in the aggregate yields insight that can drive marketing, product development, and crisis response, or even inform research and policy decisions. In the coming weeks, we’ll dig deeper here on the blog into other important aspects of geolocation in social data, which we hope will together build a picture of the opportunity in understanding social data geospatially.
Find something compelling here or in any of the other maps? Tell us with a Tweet: @gnip.