Author: Jim Moffitt, Developer Advocate

Jim Moffitt is a developer advocate at Twitter, where he helps developers and customers build amazing things with Twitter data. Before Twitter, Jim developed software for real-time weather monitoring and early-warning systems. In his free time, Jim loves hanging out with his family, enjoying the Colorado outdoors, travel, photography, gardening, and other random hobbies.

Tweeting in the Rain, Part 4: Tweets during the 2013 Colorado flood

In August 2013, we posted two “Tweeting in the Rain” (Part 1 & Part 2) articles that explored important roles social data could play in flood early-warning systems. These two posts focused on determining whether there was a Twitter “signal” that correlated to local rain measurements. We looked at ten rain events from 2009-2012 in six different regions of the country, including San Diego, Las Vegas, Louisville and Boulder. That analysis demonstrated that even early in its history, the Twitter network had become an important broadcast channel during rain and flood events.

Around noon on Wednesday, September 11, 2013, we posted Part 3, which discussed the opportunities and challenges social networks provide to agencies responsible for early warning systems. As that day unfolded, the rainfall steadily intensified, and it became increasingly clear that this weather event had the potential to become serious. By midnight, the Boulder County region was already in the midst of a flood event driven by a historic amount of rain. When the rain had tapered off 24 hours later, rain gauges in the Boulder area had recorded 12-17 inches. This happened in an area that expects around 20 inches per year on average.

On the evening of September 11, we stayed up late watching the flood and its aftermath unfold on Twitter, 140 characters at a time. As written about here, we witnessed Twitter being used in a variety of ways. Two key opportunities that Twitter provided during the event were:

1. The ability for the public to share photos and videos in real-time.

2. A medium for local emergency and weather agencies to broadcast critical information.

As we approached the one-year anniversary of the flood, we wanted to revisit the “Tweeting in the Rain” blog research and take a similar look at the 2013 flood with respect to the Twitter network. For this round, we wanted to investigate the following questions:

  • How would the Twitter signal compare to these historic rain measurements?
  • How would the Twitter signal compare to river levels?
  • As the event unfolded, did the Twitter audience of our public safety agencies grow? How did official flood updates get shared across the network?

With these questions in mind, we began the process of collecting Tweets about the flood, obtained local rain and water level data, and started building a relational database to host the data for analysis. (Stay tuned over at dev.twitter.com for a series of articles on building the database schema in support of this research.)

A flood of Tweets

Below are some selected Tweets that illustrate how the 2013 Colorado Flood unfolded on Twitter. A year later, these messages help remind us of the drama and crisis severity that occurred throughout the region.

Earlier in the day, weather followers likely saw the early signs of above-average amounts of moisture in the area:

By that night, all local public safety agencies ramped up to manage a regional natural disaster:

At 10:02 p.m. MT, the Boulder County Office of Emergency Management (@BoulderOEM) posted the following Tweet:

As we approached midnight, this flood event was getting really scary:

A unique role that Twitter and its users played throughout the flood event was the real-time feed of photos and videos from across the region:

By Friday, September 13, the historic amounts of rainfall had affected a wide area of Colorado. In foothill communities like Jamestown and Lyons, the immediate danger was torrential flash floods that scoured through the town centers.

Further downstream the primary problem was steadily rising waters that pooled in the area for days. Contributing to this were several earthen dams that failed, adding their reservoir contents to the already overloaded creeks and rivers.

Compiling ‘flood’ Tweets

As part of the previous round of analysis, we looked at a 2011 summer thunderstorm that dumped almost two inches of rain on the Boulder area in less than an hour. This intense rainfall was especially concerning because it was centered on a forest fire burn area up Fourmile Creek. Flash flood warnings were issued and sirens along Boulder Creek in central Boulder were activated to warn citizens of possible danger.

For that analysis, we collected geo-referenced Tweets containing keywords related to rain and storms (see here for more information on how these filters were designed). During the 48-hours around that event, there were 1,620 Tweets posted from 770 accounts. Here is how that event’s rain correlated with those Tweets.

For this round of analysis, we added a few more types of filters:

  • Hashtags: As the 2013 Colorado flood unfolded, hashtags associated with the event came to life. The most common ones included #ColoradoFlood, #BoulderFlood, #LongmontFlood, as well as references to our local creeks and rivers with #BoulderCreek, #LefthandCreek and #StVrainRiver.
  • Profile Geo: Our Profile Geo enrichment had been introduced since the last round of analysis. Instead of needing to parse profile locations ourselves, we were able to let Gnip’s enrichment do the parsing and build simple rules that matched Tweets coming from Colorado-based accounts.
  • Local agencies and media: Since this was such a significant regional event, we collected Tweets for local public agencies and local media accounts.
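To make the hashtag filter concrete, here is a minimal Python sketch of case-insensitive hashtag matching against the tags listed above. It is an illustration only: the actual collection used Gnip filtering rules, whose syntax is not shown in this post, and the function name `matches_flood_hashtag` is hypothetical.

```python
import re

# Hashtags named in the post, stored lowercase for case-insensitive matching.
FLOOD_HASHTAGS = {
    "coloradoflood", "boulderflood", "longmontflood",
    "bouldercreek", "lefthandcreek", "stvrainriver",
}

# A hashtag is taken to be '#' followed by word characters (an assumption).
HASHTAG_RE = re.compile(r"#(\w+)")


def matches_flood_hashtag(text: str) -> bool:
    """Return True if the Tweet text carries any of the flood hashtags."""
    tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
    return not tags.isdisjoint(FLOOD_HASHTAGS)
```

Matching whole hashtags rather than substrings keeps a tag such as #BoulderFloodplain from counting as #BoulderFlood.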

We applied these filters to six months of data – from August 10, 2013 to February 10, 2014 – beginning with a period that started before the flood to establish the ‘baseline’ level of postings.

Between September 1-7, 2013, there were fewer than 8,800 Tweets, from 4,900 accounts, matching our filters. During the first week of the flood, September 10-16, we found over 237,000 Tweets from nearly 63,000 Twitter accounts. (And in the following five months of recovery, there were nearly another 300,000 Tweets from 45,000 more accounts.)

Comparing Twitter signals with weather data

As before, we wanted to compare the Twitter signal with a local rain gauge. We again turned to OneRain for local rain and stage data recorded during the event. (OneRain maintains critical early-warning equipment in the Boulder and Denver metropolitan areas, including the foothills in that region.) This time we also wanted to compare the Twitter signal to local river levels. Figure 1 represents hourly rainfall (at the Boulder Justice Center) and maximum Boulder Creek water levels (at Broadway St.) along with the hourly number of ‘flood’ Tweets.
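For readers who want to try a comparison like this, the sketch below bins Tweet timestamps into clock hours and computes a Pearson correlation against hourly rainfall. The post does not say which statistic, if any, was used, so Pearson is an assumption here, and the sample values are made up for illustration rather than taken from the flood.

```python
from collections import Counter
from datetime import datetime
from math import sqrt


def tweets_per_hour(timestamps):
    """Count Tweets per clock hour; keys are datetimes truncated to the hour."""
    return Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up sample data, NOT measurements from the flood.
sample_times = [
    datetime(2013, 9, 11, 20, 5),
    datetime(2013, 9, 11, 21, 10),
    datetime(2013, 9, 11, 21, 40),
    datetime(2013, 9, 11, 22, 1),
    datetime(2013, 9, 11, 22, 30),
    datetime(2013, 9, 11, 22, 59),
]
sample_rain = {  # inches per hour, made up
    datetime(2013, 9, 11, 20): 0.1,
    datetime(2013, 9, 11, 21): 0.4,
    datetime(2013, 9, 11, 22): 0.9,
}

counts = tweets_per_hour(sample_times)
hours = sorted(sample_rain)
r = pearson([sample_rain[h] for h in hours], [counts.get(h, 0) for h in hours])
```

Hours with no Tweets are filled with zero so that both series cover the same hours before they are compared.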

Boulder Flood Tweets
Figure 1 – Hourly rainfall, Boulder Creek levels and Tweets during the 2013 Colorado Flood, September 10-17. More than 237,000 Tweets matched the flood filters during this period. Those same filters matched fewer than 8,800 Tweets during the September 1-8 “baseline” period.

Twitter users finding information when it is most needed

You can see from the information above that our local public agencies played a critical role during the 2013 Colorado flood. Between September 10-17, the Boulder County Office of Emergency Management (@BoulderOEM) and the Boulder National Weather Service office (@NWSBoulder) posted a combined 431 Tweets. These Tweets included updates on current weather and flash flood conditions, information for those needing shelter and evacuation and details on the state of our regional infrastructure. These Tweets were also shared (Retweeted) over 8,600 times by over 4,300 accounts. The total number of followers of the Twitter accounts that shared these Tweets was more than 9.5 million.

Twitter offers users the ability to actively update the accounts they want to follow. Knowing this, we assumed that the number of followers of these two local agencies would grow during the flood. To examine that type of Twitter signal, we compared hourly counts of new followers with rain accumulation at the Boulder Justice Center. The results of that comparison are shown in Figure 2. These two agencies gained over 5,600 new followers during September 10-16, more than doubling their follower counts.

Figure 2: Boulder Flood Tweets
Figure 2 – Comparing new followers of @BoulderOEM and @NWSBoulder with rain accumulation. Rain was measured at Boulder Justice Center in central Boulder.

One interesting finding in Figure 2 is that there seems to be a threshold of accumulated rainfall at which point Twitter users turn their attention to local agencies broadcasting about the flood. In this case it was around midnight on September 11, after five inches of rain and the start of local flooding. As the event worsened and it became more and more difficult to move around the region, more Twitter users tuned directly into the broadcasts from their local Office of Emergency Management and National Weather Service Twitter accounts.
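One simple way to locate such a threshold in a gauge record is to accumulate the hourly rainfall and find the first hour where the running total reaches a chosen value. The sketch below is hypothetical: the five-inch threshold comes from the observation above, but the hourly values are made up and the function name is ours.

```python
from itertools import accumulate


def first_hour_at_or_above(hourly_rain, threshold):
    """Return the index of the first hour whose running rain total
    reaches the threshold, or None if it is never reached."""
    for hour, total in enumerate(accumulate(hourly_rain)):
        if total >= threshold:
            return hour
    return None


# Made-up hourly rainfall in inches, NOT the Boulder Justice Center record.
sample_hourly_rain = [0.2, 0.5, 1.0, 1.5, 2.0, 0.8]
crossing = first_hour_at_or_above(sample_hourly_rain, 5.0)
```

With the crossing hour in hand, new-follower counts before and after that hour can be compared directly.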

Even as the region shifted its attention to flood recovery, the information being shared on Twitter was vital to the community. Just as the Twitter network was used in a variety of ways during the flood, it provided a critical broadcast channel as communities grappled with widespread damage. The major themes of Tweets posted immediately after the flood included:

  • Information about the evacuated communities of Jamestown, Lyons and Longmont.
  • Details on shelters and other support mechanisms for displaced residents.
  • Organization of volunteers for cleanup activities.
  • Promotion of charitable organization funds.
  • Regional infrastructure conditions and updates. This article discusses how Tweets helped identify road and bridge damages in closed-off areas.

Based on all of this data, it’s very clear that the Twitter network played an important role during and after the 2013 Colorado flood. The combination of real-time eye-witness accounts and updates from our public agencies made Twitter a go-to source for critical emergency information.

In recognition of this important role, Twitter has introduced Twitter Alerts. This service provides the ability for Twitter users to sign up for mobile push notifications from their local public safety agencies. For any public agency with a mission of providing early-warning alerts, this service can help the public find the information they need during emergencies and natural disasters.

Tweeting in the Rain, Part 3

(This is part 3 of our series looking at how social data can create a signal about major rain events. Part 1 examines whether local rain events produce a Twitter signal. Part 2 looks at the technology needed to detect a Twitter signal.) 

What opportunities do social networks bring to early-warning systems?

Social media networks are inherently real-time and mobile, making them a perfect match for early-warning systems. A key part of any early-warning system is its notification mechanisms. Accordingly, we wanted to explore the potential of Twitter as a communication platform for these systems (See Part 1 for an introduction to this project).

We started by surveying operators of early-warning systems about their current use of social media. Facebook and Twitter were the most mentioned social networks. The level of social network integration was extremely varied, depending largely on how much public communication was part of their mission. Agencies having a public communications mission viewed social media as a potentially powerful channel for public outreach. However, as of early 2013, most agencies surveyed had a minimal and static social media presence.

Some departments have little or no direct responsibility for public communications and have a mission focused on real-time environmental data collection. Such groups typically have elaborate private communication networks for system maintenance and infrastructure management, but serve mainly to provide accurate and timely meteorological data to other agencies charged with data analysis and modeling, such as the National Weather Service (NWS). Such groups can be thought of as being on the “front-line” of meteorological data collection, and have minimal operational focus on networks outside their direct control. Their focus is commonly on radio transmissions, and dependence on the public internet is seen as an unnecessary risk to their core mission.

Meanwhile, other agencies have an explicit mission of broadcasting public notifications during significant weather events. Many groups that operate flood-warning systems act as control centers during extreme events, coordinating information between a variety of sources such as the National Weather Service (NWS), local police and transportation departments, and local media. Hydroelectric power generators have Federally-mandated requirements for timely public communications. Some operators interact with large recreational communities and frequently communicate about river levels and other weather observations including predictions and warnings. These types of agencies expressed strong interest in using Twitter to broadcast public safety notifications.

What are some example broadcast use-cases?

From our discussions with early-warning system operators, some general themes emerged. Early-warning system operators work closely with other departments and agencies, and are interested in social networks for generating and sharing data and information. Another general theme was the recognition that these networks are uniquely suited for reaching a mobile audience.

Social media networks provide a channel for efficiently sharing information from a wide variety of sources. A common goal is to broadcast information such as:

  • Transportation information about road closures and traffic hazards.

  • Real-time meteorological data, such as current water levels and rain time-series data.

Even when a significant weather event is not happening, there are other common use-cases for social networks:

  • Scheduled reservoir releases for recreation/boating communities.

  • Water conservation and safety education.

[Below] is a great example from the Clark County Regional Flood Control District of using Twitter to broadcast real-time conditions. The Tweet contains location metadata, a promoted hashtag to target an interested audience, and links to more information.

— Regional Flood (@RegionalFlood) September 8, 2013

So, we tweet about the severe weather and its aftermath, now what?

We also asked about significant rain events since 2008. (That year was our starting point since the first tweet was posted in 2006, and in 2008 Twitter was in its relative infancy. By 2009 there were approximately 15 million Tweets per day, while today there are approximately 400 million per day.) With this information we looked for a Twitter ‘signal’ around a single rain gauge. Part 2 presents the correlations we saw between hourly rain accumulations and hourly Twitter traffic during ten events.

These results suggest that there is an active public using Twitter to comment and share information about weather events as they happen. This provides the foundation to make Twitter a two-way communication platform during weather events. Accordingly, we also asked survey participants if there was interest in monitoring communications coming in from the public. In general, there was interest in this along with a recognition that this piece of the puzzle was more difficult to implement. Efficiently listening to the public during extreme events requires significant effort in promoting Twitter accounts and hashtags. The [tweet to the left] is an example from the Las Vegas area, a region where it does not require a lot of rain to cause flash floods. The Clark County Regional Flood Control District detected this Tweet and retweeted it within a few minutes.

Any agency or department that sets out to integrate social networks into their early-warning system will find a variety of challenges. Some of these challenges are more technical in nature, while others are more policy-related and protocol-driven.

Many weather-event monitoring systems and infrastructures are operated on an ad hoc, or as-needed, basis. When severe weather occurs, many county and city agencies deploy temporary “emergency operations centers.” During significant events personnel are often already “maxed out” operating other data and infrastructure networks. There are also concerns over data privacy, that the public will misinterpret meteorological data, and that there is little ability to “curate” the public reactions to shared event information. Yet another challenge cited was that some agencies have policies that require special permissions to even access social networks.

There are also technical challenges when integrating social data. From automating the broadcasting of meteorological data to collecting data from social networks, there are many software and hardware details to implement. In order to identify Tweets of local interest, there are also many challenges in geo-referencing incoming data.  (Challenges made a lot easier by the new Profile Location enrichments.)

Indeed, effectively integrating social networks requires effort and dedicated resources. The most successful agencies are likely to have personnel dedicated to public outreach via social media. While the Twitter signal we detected seems to have grown naturally without much ‘coaching’ from agencies, promotion of agency accounts and hashtags is critical. The public needs to know what Twitter accounts are available for public safety communications, and hashtags enable the public to find the information they need. Effective campaigns will likely attract followers using newsletters, utility bills, Public Service Announcements, and advertising. The Clark County Regional Flood Control District even mails a newsletter to new residents highlighting local flash flood areas while promoting specific hashtags and accounts used in the region.

The Twitter response to the hydrological events we examined was substantial. Agencies need to decide how to best use social networks to augment their public outreach programs. Through education and promotion, it is likely that social media users could be encouraged to communicate important public safety observations in real time, particularly if there is an understanding that their activities are being monitored during such events. Although there are considerable challenges, there is significant potential for effective two-way communication between a mobile public and agencies charged with public safety.

Special thanks to Mike Zucosky, Manager of Field Services, OneRain, Inc., my co-presenter at the 2013 National Hydrologic Warning Council Conference.

Tweeting in the Rain, Part 2

Searching for rainy tweets

To help assess the potential of using social media for early-warning and public safety communications, we wanted to explore whether there was a Twitter ‘signal’ from local rain events. Key to this challenge was seeing if there was enough geographic metadata in the data to detect it. As described in Part 1 of this series, we interviewed managers of early-warning systems across the United States, and with their help identified ten rain events of local significance. In our previous post we presented data from two events in Las Vegas that showed promise in finding a correlation between a local rain gauge and Twitter data.

We continue our discussion by looking at an extreme rain and flood event that occurred in Louisville, KY on August 4-5, 2009. During this storm rainfall rates of more than 8 inches per hour occurred, producing widespread flooding. In hydrologic terms, this event has been characterized as having a 1000-year return period.

During this 48-hour period in 2009, there were approximately 30 million tweets posted from around the world. (While that may seem like a lot of tweets, keep in mind that there are now more than 400 million tweets per day.) Using “filtering” methods based on weather-related keywords and geographic metadata, we set off to find a local Twitter response to this particular rain event.

Domain-based Searching – Developing your business logic

Our first round of filtering focused on developing a set of “business logic” keywords around our domain of interest, in this case rain events. Developing how you filter data from any social media firehose is an iterative process involving analyzing collected data and applying new insights. Since we were focusing on rain events, words with the substring “rain” were searched for, along with other weather-related words. Accordingly, we first searched with this set of keywords and substrings:

  • Keywords: weather, hail, lightning, pouring
  • Substrings: rain, storm, flood, precip

Applying these filters to the 30 million tweets resulted in approximately 630,000 matches. We soon found out that there are many, many tweets about training programs, brain dumps, and hundreds of other words containing the substring ‘rain.’ So, we made adjustments to our filters, including focusing on the specific keywords of interest: rain, raining, rainfall, and rained. By using these domain-specific words we were able to reduce the amount of non-rain ‘noise’ by over 28% and ended up with approximately 450,000 rain- and weather-related tweets from around the world. But how many were from the Louisville area?
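The difference between substring and whole-word matching is easy to see in a few lines of Python. This is a sketch of the idea only, not the filtering rules we actually ran; the regular expression and function names are illustrative assumptions.

```python
import re

# First-round substrings and the refined rain keywords named in the post.
SUBSTRINGS = ["rain", "storm", "flood", "precip"]
RAIN_KEYWORDS = ["rain", "raining", "rainfall", "rained"]

# \b anchors each keyword at word boundaries, so 'training' does not match.
KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, RAIN_KEYWORDS)) + r")\b",
    re.IGNORECASE,
)


def matches_substring(text: str) -> bool:
    """First-round filter: any substring hit counts as a match."""
    lowered = text.lower()
    return any(s in lowered for s in SUBSTRINGS)


def matches_keyword(text: str) -> bool:
    """Refined filter: only whole-word rain keywords count."""
    return KEYWORD_RE.search(text) is not None
```

A tweet such as "Heading to a training session" passes the substring filter but not the keyword filter, which is exactly the kind of noise the refinement removes.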

Finding Tweets at the County and City Level – Finding the needle in the haystack

The second step was mining this Twitter data for geographic metadata that would allow us to geo-reference these weather-related tweets to the Louisville, KY area. There are generally three methods for geo-referencing Twitter data:

  • Activity Location: tweets that are geo-tagged by the user.
  • Profile Location: parsing the Twitter Account Profile location provided by the user.
    • “I live in Louisville, home of the Derby!”
  • Mentioned Location: parsing the tweet message for geographic location.
    • “I’m in Louisville and it is raining cats and dogs”

Having a tweet explicitly tied to a specific location or a Twitter Place is extremely useful for any geographic analysis. However, the percentage of tweets with an Activity Location is less than 2%, and these were not available for this 2009 event. Given that, what chance was there to be able to correlate tweet activity with local rain events?

For this event we searched for any tweet that used one of our weather-related items, and either mentioned “Louisville” in the tweet, or came from a Twitter account with a Profile Location setting including “Louisville.” It’s worth noting that since we live near Louisville, CO, we explicitly excluded account locations that mentioned “CO” or “Colorado.” (By the way, the Twitter Profile Geo Enrichments announced yesterday would have really helped our efforts.)
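As a rough sketch, the geographic half of that rule could look like the Python below. It makes two assumptions that go beyond the description above: the Colorado exclusion is applied to every tweet from such an account, including ones that mention Louisville in the text, and profile locations are split into tokens so that “CO” is matched as a whole word. The function name is hypothetical, and the weather-term filter would be applied separately.

```python
import re


def is_louisville_ky(tweet_text, profile_location):
    """Geographic filter: Louisville mentioned in the tweet or the profile,
    excluding accounts whose profile location points to Colorado."""
    text = tweet_text.lower()
    profile = (profile_location or "").lower()

    # Split the profile into alphabetic tokens so 'co' matches only as a word.
    tokens = re.split(r"[^a-z]+", profile)
    if "co" in tokens or "colorado" in tokens:
        return False  # assumption: exclusion applies to the whole account

    return "louisville" in text or "louisville" in profile
```

Tokenizing the profile avoids false exclusions for locations that merely contain the letters "co", such as "Cook County".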

After applying these geographic filters, the number of tweets went from 457,000 to 4,085. So, based on these tweets, did we have any success in finding a Twitter response to this extreme rain event in Louisville?

Did Louisville Tweet about this event?

Figure 1 compares tweets per hour with hourly rainfall from a gauge located just west of downtown Louisville on the Ohio River. As with the Las Vegas data presented previously, the tweets occurring during the rain event display a clear response, especially when compared to the “baseline” level of tweets before the event occurred. Tweets around this event spiked as the storm entered the Louisville area. The number of tweets per hour peaked as the heaviest rain hit central Louisville and remained elevated as the flooding aftermath unfolded.

Louisville Rain Event

Figure 1 – Louisville, KY, August 4-5, 2009. Event had 4,085 activities, baseline had 178.

Other examples of Twitter signal compared with local rain gauges

Based on the ten events we analyzed it is clear that social media is a popular method of public communication during significant rain and flood events.

In Part 3, we’ll discuss the opportunities and challenges social media communication brings to government agencies charged with public safety and operating early-warning systems.

Tweeting in the Rain, Part 1

If you had told me a few years ago that one day I’d be comparing precipitation and social media time-series data, I would have assumed you were joking. For 13 years at OneRain I helped develop software and monitoring networks for early-warning systems. A common theme of the systems I worked with was real-time monitoring of rainfall and water levels and providing alarm and notification services.

Until just a few years ago I viewed social media platforms primarily as a source for entertainment and keeping in touch with family and friends.  That is until the Arab Spring started in 2010…. And the 2011 earthquake and tsunami that struck Japan… And Hurricane Sandy in 2012…

It became obvious that social media networks were a natural and vital communication channel for any public safety agency. After joining Gnip last winter, and wading into the ‘flood-waters’ of real-time social media firehose data, I decided to investigate Twitter responses to rain events. Would there be a regional Twitter ‘signal’ that correlates to the rainfall measured by a single rain gauge?

So I approached a former colleague at OneRain, Mike Zucosky, about putting together a conference presentation with a general goal of assessing the current and potential use of social media by public safety agencies.

Here are the fundamental questions we wanted to start looking into:

  • How are agencies currently using social media?

  • Is there a correlation between rainfall measurements and local Twitter activities?

    • What keywords are important for hydrology- and weather-based filtering?

    • What strategies are there for geographic-based filtering?  How can you verify your results?

  • Based on this information, what would we recommend to the early-warning system community?

We presented our analysis at the 2013 National Hydrologic Warning Council Conference last June in Jacksonville, FL. This Conference Highlights Newsletter features our 30-minute presentation and provides a good overview of it.

This series of posts will focus on what we learned about these topics.

Part 1 – Do local rain events produce a Twitter signal?

A fundamental question was whether we would see a temporal correlation between a single rain gauge and the number of tweets per hour from a region experiencing a significant rain event. While intuitively there is a huge social component to weather, these two “measurement” systems could not be more dissimilar. One is assumed to be relatively mobile and distributed across a region of many square miles while the other collects its data at a single location. One is driven by human behavior while the other is a simple mechanical device. Clearly, this exercise was going to be a great test of methods to filter Twitter data by geographic metadata.

We started our data collection by surveying several agencies charged with real-time monitoring of hydrological events. As part of the survey we asked about recent rain events of regional significance. We ended up with a list of ten events for the 2009-2013 period. We then searched for a Twitter response to these events using weather-based keywords and geographic filters.

“It’s raining on my vacation”

Two of the events were from the Las Vegas, NV, area. The first event occurred on December 17-23, 2010. The second event occurred on August 22-23, 2012.  Both events caused flash floods, and one resulted in loss of life.

Using weather and geographic keywords and metadata we collected tweets about these events. In an attempt to establish a “baseline” for the amount of background “chatter”, we also applied these ‘filters’ to a period of equal duration shortly before the event hit. Figures 1 & 2 present these data, comparing tweets made before (baseline) and during the event.

A couple of observations from the results below:

  • There appears to be a Twitter signal that correlates to a single rain gauge.

  • If this is true, it is possible to efficiently geo-reference weather-related tweets.

  • The maximum hourly rate of weather-event tweets increased significantly over the 2010-2012 period. In 2010 there was a maximum of 480 tweets/hour and in 2012 the peak was 3,800 tweets per hour.

Las Vegas activity

Figure 1 – Las Vegas event 1, December 17-23, 2010. Event had 21,184 activities, baseline had 9,152.

Las Vegas Event 2

Figure 2 – Las Vegas event 2, August 22-23, 2012. Event had 23,953 activities, baseline had 8,704.
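One quick way to summarize the two figures is the ratio of event activity to baseline activity. The short Python sketch below does that arithmetic with the counts from the captions; the ratio is our own summary measure, not something reported in the original presentation.

```python
# (event count, baseline count) taken from the Figure 1 and Figure 2 captions.
EVENTS = {
    "Las Vegas, December 17-23, 2010": (21184, 9152),
    "Las Vegas, August 22-23, 2012": (23953, 8704),
}


def event_to_baseline_ratio(event_count, baseline_count):
    """How many times larger the event-period count is than the baseline."""
    if baseline_count <= 0:
        raise ValueError("baseline count must be positive")
    return event_count / baseline_count


ratios = {
    name: round(event_to_baseline_ratio(*pair), 2)
    for name, pair in EVENTS.items()
}
```

Both events come out at a little over twice their baseline: roughly 2.3 times for 2010 and 2.75 times for 2012.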

Find out how we made this work in the upcoming post, Tweeting in the Rain: Part 2.
