Data Stories is Gnip’s opportunity to tell the cool stories about the data scientists, data journalists and other people working in social data. This week we’re interviewing Rumi Chunara, Instructor at Harvard Medical School and HealthMap and a Big Boulder speaker, about her work using social data to identify epidemics. Rumi has a background in building biological sensors, and that has translated into an interest in using social media data and other informal sources of data to identify epidemics. In addition to her work with HealthMap, Rumi was part of a study showing how Twitter could help identify cholera outbreaks in Haiti. You can follow her on Twitter at @rumichunara.
1. You’re currently trying to use social data to identify health epidemics. How did you get started doing this, and what was your career path to get there?
During school I studied engineering, and my research involved building portable bio-sensors. One idea behind this was to make measurements typically done in laboratories possible in places where that infrastructure is not available. I thought the concept was really neat and useful, but realized that the types of tools and technologies we can build will always be changing and getting better. So I decided to start working on how we can bring all of these novel information sources together, and how we can use them to improve health for populations.
2. What has been the success of HealthMap in identifying epidemics? What do you see as the future of systems like HealthMap?
It’s neat to see how HealthMap has become a pioneer in demonstrating the value of informally collected health information. This has meant demonstrating what types of information can be used and how they can be aggregated and analyzed. Because it has been demonstrated that informal sources add value, whether that is in giving an earlier signal, allowing a more detailed understanding of a disease outbreak, or reaching more people, I think that in the future we will accept and look to other new sources of health information that, in aggregate, can become extraordinarily valuable.
3. What are some of the different methods you’ve used to help understand epidemics?
Our research group began using informal data such as Internet search queries, news media and data from mobile phones for monitoring and understanding disease outbreaks. Later, at HealthMap, we expanded to also use other sources such as data from social media. Beyond harnessing existing data and looking for health information, we are also building other systems to specifically ask people about their health via the Internet. The beauty of this type of surveillance is that it works and has value wherever people can access the Internet, which can be in many parts of the world. Social media is also well suited to understanding the spread of disease because it helps identify where you are and whom you are connected to, and it can reach a lot of people, which are all important in the spread of disease!
4. Your research work has found that Twitter could have helped identify outbreaks of cholera in Haiti. What were the takeaways on using social data to identify an outbreak after a natural disaster?
We have shown, for a particular outbreak situation, how Twitter could be used to identify the outbreak early on, something that other groups have also shown for other disease outbreak situations. Our work also went beyond this, to demonstrate that the Tweets could further be used to get a sense of how an outbreak is progressing. One lesson from our study was learning about the reasons why social data can vary during an outbreak (for example, because of an emerging or ongoing public health event, or due to media coverage or other environmental events happening at the same time). We also learned that each situation will be different depending on the context around the event. The biggest lesson we hope comes out of our study is that there is a potential use for these novel types of data, which should be explored more.
5. What do you see as the future of social data in the health world?
Because of all of the benefits we have demonstrated from social data, my view is that it will be useful to harness it as a complement to the other existing data sources we have, such as traditional case and hospitalization information, not as a replacement for our medical and public health infrastructure. Informal social data can fill gaps in the data sources traditionally used in healthcare, and importantly, it will hopefully empower individuals to become more involved with and proactive about their own health!
We appreciate Rumi taking the time to speak with us! Let us know in the comments if you have a suggestion for another interviewee for Data Stories.