Data Stories: Rumi Chunara on Identifying Epidemics With Social Data

Data Stories is Gnip’s opportunity to tell the cool stories about the data scientists, data journalists and other people working in social data. This week we’re interviewing Rumi Chunara, Instructor at Harvard Medical School and HealthMap and Big Boulder speaker, about her work using social data to identify epidemics. Rumi has a background in building biological sensors, and that has translated into an interest in using social media data and other informal sources of data to identify epidemics. In addition to her work with HealthMap, Rumi was part of a study showing how Twitter could help identify cholera outbreaks in Haiti. You can follow her on Twitter at @rumichunara.

Rumi Chunara of HealthMap on using social data to identify epidemics

1. You’re currently trying to use social data to identify health epidemics. How did you get started doing this, and what was your career path to get there?

During school I studied engineering, and my research involved building portable bio-sensors. One idea behind this was to make measurements typically done in laboratories possible in places where that infrastructure is not available. I thought the concept was really neat and useful, but realized that the types of tools and technologies we can build will always be changing and getting better. So I decided to start working on how we can bring all of these novel information sources together, and how we can use them to improve health for populations.

2. What has been the success of HealthMap in identifying epidemics? What do you see as the future of systems like HealthMap?

It’s neat to see how HealthMap has become a pioneer in demonstrating the value of informally collected health information. This has meant demonstrating what types of information can be used and how it can be aggregated and analyzed. Informal sources have been shown to add value, whether that is in giving an earlier signal, allowing more detailed understanding of a disease outbreak, or reaching more people. Because of that, I think that in the future we will accept and look to other new sources of health information that, in aggregate, can become extraordinarily valuable.

3. What are some of the different methods you’ve used to help understand epidemics?

Our research group began using informal data such as Internet search queries, news media and data from mobile phones for monitoring and understanding disease outbreaks. Later, at HealthMap, we expanded to also use other sources such as data from social media. Beyond harnessing existing data and looking for health information, we are also building other systems to specifically ask people about their health, via the Internet. The beauty of this type of surveillance is that it works and has value wherever people can access the Internet, which can be in many parts of the world. As well, social media is suited for understanding the spread of disease because it helps identify where you are and whom you are connected to, and can reach a lot of people, which are all important in the spread of disease!

4. Your research work has found that Twitter could have helped identify outbreaks of cholera in Haiti. What were the takeaways on using social data to identify an outbreak after a natural disaster?

We have shown, for a particular outbreak situation, how Twitter could be used to identify the outbreak early on, something that has also been shown before for other disease outbreak situations by other groups. Our work also went beyond this, to demonstrate that the Tweets could further be used to get a sense of how an outbreak is progressing. One lesson from our study was learning the reasons why social data can vary during an outbreak (for example, because of an emerging or ongoing public health event, or due to media coverage or other environmental events happening at the same time). We also learned that each situation will be different depending on the context around the event. The biggest lesson we hope comes out of our study is that there is a potential use for these novel types of data, which should be explored more.

5. What do you see as the future of social data in the health world?

Because of all of the benefits we have demonstrated from social data, my view is that it will be useful to harness it as a complement to the other existing data sources we have, such as traditional case and hospitalization information, not as a replacement for our medical and public health infrastructure. Informal social data can fill gaps in traditionally used sources of data in healthcare, and importantly it will hopefully empower individuals to become more involved with and proactive about their own health!

We appreciate Rumi taking the time to speak with us! Let us know in the comments if you have a suggestion for another interviewee for Data Stories.

Using Social Data to Identify Epidemics


Big Boulder: Social Data in Public Service

Panel about the increasing use of social data by government and organizations for public service. Participants included Ian Cairns, Principal of Watershed Strategy; Moeed Ahmad, Head of New Media at Al Jazeera Media Network; Katie Baucom, Geospatial Analyst at National Geospatial-Intelligence Agency; and Rumi Chunara, Instructor at Harvard Medical School and HealthMap.

Panel on Social Data in Public Service

Social Data at Al Jazeera

Moeed Ahmad kicked off the panel by talking about the state of the Al Jazeera networks when he started in 2005; essentially, it was the army channel. Al Jazeera was launched in 1996 with an Arabic channel, English channel, 20 sports channels and a documentary channel. When Moeed started, there was a major shift happening in media, and overlooking the serious impact that social media had would have hurt the channel. Al Jazeera was the single voice of the Arab region at a time when most news stations were state-run. Al Jazeera was the outlet for people on the streets to share how they were feeling. As with traditional news outlets across the world, young people had stopped watching the news and started getting it online. Al Jazeera recognized they needed to shift and needed a dedicated social media team.

Twitter came of age during 2008 and 2009. The Iranian elections were happening during this period, and the use of Twitter around them raised new concerns about verification. Other stations were just running the Twitter stream about the election on their networks but weren’t providing context. In addition, they now needed a new way of verifying this type of content. It was hard to tell signal from noise since so many people changed their location to Tehran.

One of the interesting ways Al Jazeera was able to collect people’s stories came during the famine in Somalia, when there was a UN Conference to address what was happening. Al Jazeera took a lo-fi approach to getting people to tell their stories in a way that was verifiable: they did a simple SMS blast asking people to tell Al Jazeera their stories and had thousands of people respond. This allowed them to create their site Somalia Speaks.

Moeed told a really interesting story about the Arab Spring, and how they were missing what was happening in Tunisia at the time. Al Jazeera prides itself on reporting news, and they don’t just give readers what they want. You won’t find Britney Spears a frequent topic on Al Jazeera. During the beginning of the protests in Tunisia, Al Jazeera was busy telling a story about the Palestine Papers, which they had been working on for the previous several months.

One of the important jobs Al Jazeera charges themselves with is to subtract the noise out of social media and add context to the stories being told. They consider verification to be important, especially when others are trying to discredit their reporting. With both Syria and Libya, they’ve seen false reports claiming that people were being bombed. Reporting on them would have discredited Al Jazeera. They asked their Twitter followers to verify one video they were sent and learned that the video was three years old and from Iraq. Yet social media has still been a formidable source for them, and most of their coverage from Syria has come from YouTube or Facebook.

Social Data at National Geospatial-Intelligence Agency

Katie Baucom talked about how the National Geospatial-Intelligence Agency works on disaster crisis response and helps create damage assessments of natural disasters using satellite imagery. About a year and a half ago they started using social data to help fill in the gaps. Their team is able to start using Twitter text and imagery immediately, while satellite images can take half a day or a day. With the recent tornadoes that were hitting Texas, her team could see a lot of imagery from Twitter. An important aspect of this is that they were able to learn about new cities that were struck by the tornadoes based on Twitter.

One aspect that everyone is trying to figure out is information sharing vs emergency response. When you call 911, it is illegal to make a false emergency call; however, the laws haven’t caught up to Twitter. There is no law against falsely asking for help or making a false claim on Twitter.

Her agency is also looking at ways for search and rescue teams to verify information, possibly by having a social media feed on their phones that lets them check individual reports and take a look at aggregate areas.

Social Data to Predict Epidemics

Rumi Chunara talked about her work to predict epidemics using social data and other new types of information. Rumi got her start working with biosensors but switched her focus to exploring social data and other alternative types of information. She wanted to incorporate as many sources as possible.

To verify her research, one of her methods is to work with trusted users or to compare information to what physicians on the ground are observing. In a research project about Twitter and cholera in Haiti, her team found that Twitter was a quicker way of detecting cholera as it was spreading through Haiti after the 2010 earthquake. As part of their research, they compared findings with an infectious disease specialist on the ground. Her team is working on creating improved techniques for predicting new outbreaks. They’re still looking at ways to communicate with users directly.

One way for Rumi and her team to verify information is to use math to identify “false positives.” Another is having people use their iPhone app, “Outbreaks Near Me”, to verify information.

Big Boulder is the world’s first social data conference. Follow along at #BigBoulder, on the blog under Big Boulder, Big Boulder on Storify and on Gnip’s Facebook page.