Data Story: How Microsoft Research is Using Social Data to Understand Depression

Sometimes the use cases for social data go far beyond what you would expect to be possible. Such is the case with Microsoft Research, which has done groundbreaking work on using social data to study depression and on whether someone’s activity on Twitter can indicate that they are depressed. We interviewed Dr. Munmun de Choudhury of Microsoft Research about that research, its privacy implications, and how social data can improve mental health. Dr. De Choudhury will be joining Georgia Tech’s School of Interactive Computing as an assistant professor this spring.

Munmun de Choudhury of Microsoft Research

1. What are the high-level takeaways you found on using Twitter to research depression?

This research direction has revealed, for the first time, how social media activity such as on Twitter can provide valuable indicators of mental health conditions, e.g., depression. Twitter has a lot of noise, but it is promising to see that there are signals hidden in there too, signals that can tell us about important issues such as health and lifestyle, both at the level of an individual and at the scale of larger populations. The most prominent signals of depression lie in people’s social activity (i.e., to what extent they post, what kind of posts they share, and when they mostly post), their social network structure (e.g., how they are connected to their friends and friends of friends), and the linguistic style of the content they share. That these rather implicit signals can indicate people’s mental and behavioral issues (a person may never explicitly mention that they are “depressed”) was rather a surprise to us; though when we consulted with psychologists (in fact, one of the collaborators in this project was a psychologist), we learned that mental health may manifest itself via various nuances in people’s everyday behavior. This gives us hope that observing people’s social media use over time, something which is increasingly gaining popularity, can be used to build tools, forecasting algorithms, interventions, and prevention strategies, both for individuals themselves and for policymakers, to help them deal with and manage this medical condition in a better way.

2. What are the privacy concerns of studying mental health with social data?

Studying mental health with online social data is extremely attractive and can have widespread implications for enabling better healthcare; however, it comes with its own set of privacy- and ethics-related challenges, which cannot be ignored. A number of questions arise. Can we design effective interventions for people whom we have inferred to be vulnerable to a certain mental illness, in a way that is private, while raising awareness of this vulnerability among the individuals themselves and trusted others (doctors, family, friends)? In extreme situations, when an individual’s inferred vulnerability to a mental illness is alarmingly high (e.g., if the individual is suicide-prone), what should be our responsibility as a research community? For instance, should there be other kinds of special interventions where appropriate counseling communities or organizations are engaged? That is, finding the right types of interventions, ones that can actually make a positive impact on people’s behavioral state while abiding by adequate privacy and ethical norms, is a research question in its own right. I hope this line of work triggers conversations and involvement with the ethics and medical communities to investigate opportunities and cautions in this regard.

Additionally, as a community, we need to be aware of the limits within which such inferences about illness or disability can be deemed safe for an individual’s professional and societal identity. In a sense, we need to ensure that such measurements do not introduce new means of discrimination or inequality, given that we now have a mechanism to infer traditionally stigmatized conditions that people otherwise choose to keep private. These and other potential consequences, such as revealing nuanced aspects of behavior and mental health conditions to insurance companies or employers, make resolving these ethical questions critical to the successful use of these new data sources and of this research direction.

3. If you’re able to identify depressed individuals with social media, what do prevention and intervention look like?

In terms of prevention, the ability to automatically and privately infer concerns about people’s mental health can enable health professionals to be more proactive and to make arrangements that improve the access of at-risk individuals to appropriate medical help. At the same time, it can help policymakers better understand the incidence of different diseases, such as depression, which is extremely underreported and socially stigmatized, so that people can benefit from better healthcare practices. Further, population-scale trends of depression over geography, time, or gender may be a mechanism to trigger public health inquiry programs to take appropriate measures or allocate resources in a timely manner.

In terms of intervention, since our estimates of depression can be made considerably more frequently than conventional surveys, such as those by the Centers for Disease Control and Prevention (CDC), the estimates can be used from time to time to enable early detection and rapid treatment of depression in sufferers. At the individual level, a variety of personalized and private tools may be developed that help individuals better manage depression and seek social and emotional support easily.

4. What made you interested in studying “collective human behavior as manifested via our online footprints”?

I have always been fascinated by how new and emergent online technologies (e.g., Facebook and Twitter) are increasingly entering the mainstream of our lives. We know and realize that our actions on these platforms reflect characteristics of the physical world (e.g., several of our Facebook friends are actually friends in real life). I was curious as to whether the reverse is true, that is, whether the increasing use of these platforms is impacting our behavior in some way. For example, does it affect the way we emote, interact, or build social ties with others? That is one reason I became very interested in exploring and understanding aspects of our behavior based on what we say and what we do online.

The other motivation lies in my inherent penchant for studying people. The web, and particularly social networks and social media, provides us with a very powerful tool to do so, in a way that the behavioral findings are derived non-intrusively from people’s day-to-day activities and, because of the scale of the data, are mostly generalizable. Lastly, on a humorous note, computer scientists are often labeled as socially awkward; so perhaps you can assume that this particular computer scientist intends to show that “hey, even we can be socially cool too, and even make sense of your social actions on the web!”

5. Where do you see the future of health research and social data going?

People are increasingly joining social media sites with the goal of remaining connected and learning about what is going on around them, and some people have been using these sites for years now. As Twitter and Facebook’s penetration increases, it will give us a rich source through which we can observe individual-centric behavior over time, and consequently use those trends to understand when and where unexpected or anomalous behavior, e.g., concerning health issues, may emerge. At the population level, large-scale naturalistic data obtained from the web may provide rich insights into health concerns and health outcomes that may not be possible with traditional survey methods. This is because surveys are often retrospective, and hence lack the immediacy of the context in which policies may be changed or influenced, or interventions made for enabling better healthcare.

Even more so, I hope that in the future, social media use can be leveraged to identify health issues in hard-to-reach populations, or populations who would otherwise not reveal a condition due to social stigma. For instance, one of the many challenges hindering the global response to some extremely deadly diseases like AIDS is the difficulty of collecting reliable information about the populations who are most at risk for the disease. Since social media use is consistently gaining more and more ground, it might be the new platform wherein activity traces may be used to identify, with appropriate privacy policies enforced, particular vulnerable populations, and enable them to receive better healthcare and help as the need arises.
