Tweeting in the Rain, Part 1

If you had told me a few years ago that one day I’d be comparing precipitation and social media time-series data, I would have assumed you were joking.  For 13 years at OneRain I helped develop software and monitoring networks for early-warning systems.  A common theme of the systems I worked with was real-time monitoring of rainfall and water levels, combined with alarm and notification services.

Until just a few years ago I viewed social media platforms primarily as a source of entertainment and a way to keep in touch with family and friends.  That was until the Arab Spring started in 2010… and the 2011 earthquake and tsunami struck Japan… and Hurricane Sandy hit in 2012…

It became obvious that social media networks were a natural and vital communication channel for any public safety agency. After joining Gnip last winter and wading into the ‘flood-waters’ of real-time social media firehose data, I decided to investigate Twitter responses to rain events. Would there be a regional Twitter ‘signal’ that correlates with the rainfall measured by a single rain gauge?

So I approached a former colleague at OneRain, Mike Zucosky, about putting together a conference presentation with a general goal of assessing the current and potential use of social media by public safety agencies.

Here are the fundamental questions we wanted to start looking into:

  • How are agencies currently using social media?

  • Is there a correlation between rainfall measurements and local Twitter activities?

    • What keywords are important for hydrology- and weather-based filtering?

    • What strategies are there for geographic-based filtering?  How can you verify your results?

  • Based on this information, what would we recommend to the early-warning system community?

We presented our analysis at the 2013 National Hydrologic Warning Council Conference last June in Jacksonville, FL.  This Conference Highlights Newsletter features our 30-minute presentation and provides a good overview of it.

This series of posts will focus on what we learned about these topics.

Part 1 – Do local rain events produce a Twitter signal?

A fundamental question was whether we would see a temporal correlation between the rainfall recorded by a single rain gauge and the number of tweets per hour from a region experiencing a significant rain event. While intuitively there is a huge social component to weather, these two “measurement” systems could not be more dissimilar.  Twitter users are assumed to be relatively mobile and distributed across a region of many square miles, while a rain gauge collects its data at a single location. One is driven by human behavior while the other is a simple mechanical device. Clearly, this exercise was going to be a great test of methods to filter Twitter data by geographic metadata.
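One way to make this temporal comparison concrete is to bin tweets by hour and compute a correlation coefficient against the gauge's hourly rainfall totals. The Python sketch below illustrates that idea only; it is not a description of how the figures in this post were produced. The function names and the tiny sample values are made up for illustration, and Pearson correlation is an assumed choice of measure.

```python
from collections import Counter
from datetime import datetime
from math import sqrt


def hourly_counts(timestamps):
    """Count tweets per hour, keyed by the timestamp truncated to the hour."""
    return Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def gauge_vs_tweets(rain_by_hour, tweet_timestamps):
    """Correlate hourly rainfall with hourly tweet counts over the gauge's hours."""
    counts = hourly_counts(tweet_timestamps)
    hours = sorted(rain_by_hour)
    rain = [rain_by_hour[h] for h in hours]
    # Hours with no matching tweets count as zero.
    tweets = [counts.get(h, 0) for h in hours]
    return pearson(rain, tweets)


if __name__ == "__main__":
    # Made-up values purely for illustration; not data from any real event.
    rain_by_hour = {
        datetime(2000, 1, 1, hour): amount
        for hour, amount in [(0, 0.0), (1, 0.2), (2, 0.9), (3, 0.4)]
    }
    tweet_timestamps = [
        datetime(2000, 1, 1, 1, 15),
        datetime(2000, 1, 1, 2, 5),
        datetime(2000, 1, 1, 2, 30),
        datetime(2000, 1, 1, 2, 55),
        datetime(2000, 1, 1, 3, 40),
    ]
    print(round(gauge_vs_tweets(rain_by_hour, tweet_timestamps), 3))
```

A same-hour comparison like this ignores any lag between rainfall and the tweets about it, so it is best read as a first check rather than a full analysis.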

We started our data collection by surveying several agencies charged with real-time monitoring of hydrological events. As part of the survey we asked about recent rain events of regional significance. We ended up with a list of ten events for the 2009-2013 period. We then searched for a Twitter response to these events using weather-based keywords and geographic filters.

“It’s raining on my vacation”

Two of the events were from the Las Vegas, NV, area. The first event occurred on December 17-23, 2010. The second event occurred on August 22-23, 2012.  Both events caused flash floods, and one resulted in loss of life.

Using weather and geographic keywords and metadata, we collected tweets about these events. To establish a “baseline” for the amount of background “chatter”, we also applied these filters to a period of equal duration shortly before each event hit. Figures 1 & 2 present these data, comparing tweets made before (baseline) and during the event.
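The baseline comparison can be sketched in a few lines of Python. This is a hedged illustration only, not the filtering actually used for the figures: the keyword lists, the tweet dictionary fields (`text`, `location`, `posted`), the one-day gap before the event, and all function names are assumptions made for the example.

```python
import re
from datetime import datetime, timedelta

# Hypothetical keyword lists; real filters would need far more care.
WEATHER_PATTERN = re.compile(r"\b(rain|raining|flood|flooding|storm)\b")
PLACE_PATTERN = re.compile(r"\b(las vegas|vegas)\b")


def matches(tweet):
    """True if a tweet mentions weather and can be tied to the region."""
    text = tweet["text"].lower()
    location = (tweet.get("location") or "").lower()
    has_weather = WEATHER_PATTERN.search(text) is not None
    has_place = (
        PLACE_PATTERN.search(text) is not None
        or PLACE_PATTERN.search(location) is not None
    )
    return has_weather and has_place


def baseline_window(event_start, event_end, gap=timedelta(days=1)):
    """Window of the same duration as the event, ending `gap` before it starts."""
    duration = event_end - event_start
    baseline_end = event_start - gap
    return baseline_end - duration, baseline_end


def count_in_window(tweets, start, end):
    """Number of matching tweets posted in the half-open interval [start, end)."""
    return sum(1 for t in tweets if start <= t["posted"] < end and matches(t))


def event_vs_baseline(tweets, event_start, event_end):
    """Return (event_count, baseline_count) using the same filter for both."""
    base_start, base_end = baseline_window(event_start, event_end)
    return (
        count_in_window(tweets, event_start, event_end),
        count_in_window(tweets, base_start, base_end),
    )
```

Word-boundary matching keeps “train” from counting as “rain”, but keyword filters of this kind will still pick up unrelated chatter, which is exactly what the baseline period is meant to estimate.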

A few observations from the results below:

  • There appears to be a Twitter signal that correlates with the rainfall measured by a single rain gauge.

  • If this is true, it is possible to geo-reference weather-related tweets efficiently.

  • The maximum hourly rate of weather-event tweets increased significantly over the 2010-2012 period. In 2010 the peak was 480 tweets per hour, and in 2012 it was 3,800 tweets per hour.
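To put the numbers quoted in this post on a common footing, the short Python sketch below computes each event's tweet volume relative to its baseline, along with the growth in peak hourly rate between the two events. The helper name `ratio` is only for illustration; the inputs are the counts given in the bullet above and in the captions of Figures 1 and 2.

```python
def ratio(numerator, denominator):
    """Simple ratio with a guard against dividing by zero."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator


# Counts quoted in the Figure 1 and Figure 2 captions.
event_1, baseline_1 = 21_184, 9_152
event_2, baseline_2 = 23_953, 8_704

# Peak hourly rates quoted for 2010 and 2012.
peak_2010, peak_2012 = 480, 3_800

print(round(ratio(event_1, baseline_1), 2))  # 2.31
print(round(ratio(event_2, baseline_2), 2))  # 2.75
print(round(ratio(peak_2012, peak_2010), 2))  # 7.92
```

In other words, each event drew roughly two to three times its baseline volume, while the peak hourly rate grew nearly eightfold between the two events.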

Las Vegas activity

Figure 1 – Las Vegas event 1, December 17-23, 2010. Event had 21,184 activities, baseline had 9,152.

Las Vegas Event 2

Figure 2 – Las Vegas event 2, August 22-23, 2012. Event had 23,953 activities, baseline had 8,704.

Find out how we made this work in the upcoming post, Tweeting in the Rain: Part 2.
