Tweeting and more for public health: Q&A with a digital disease detective

Screenshot of H7N9 patient record posted to Weibo
A photo of the record of a Chinese patient with H7N9 flu. Posted to Chinese social media site Weibo, the photo has spread like wildfire over Twitter and other social media. (Weibo user @phoenix via Twitter user @Laurie_Garret)
From the flu to cholera, obesity to vaccine concerns, data from Twitter, Facebook, mobile phones, search engine queries and other web-based sources are changing the nature of epidemiology, public health surveillance and outbreak preparation and response.

John Brownstein, PhD, director of the Computational Epidemiology Group in Boston Children’s Informatics Program and co-founder of HealthMap, recently co-authored an opinion piece in the New England Journal of Medicine (NEJM) highlighting the roles of social media and other Internet data sources in what he calls “digital epidemiology” or “digital disease detection.” He and his collaborators argue:

“Since the [2003] SARS outbreak, the world has seen substantial progress in transparency and rapid reporting. The extent of these advancements varies, but overall, digital disease surveillance is providing the global health community with tools supporting faster response and deeper understanding of emerging public health threats.”

Vector sat down with Brownstein to discuss digital epidemiology’s evolution over the 10 years since SARS, especially in light of the rise and spread of avian H7N9 influenza in China and Middle East respiratory syndrome coronavirus (MERS-CoV) in Jordan and the Arabian Peninsula.

Q: In your NEJM opinion piece, you focus on the benefits digital data bring to public health. Can you summarize them?

HealthMap map of MERS-CoV reports.
A map of recent MERS-CoV news on HealthMap as of July 22, 2013.
A: Data from a variety of digital sources can help epidemiologists and public health authorities better understand and study the dynamics of infectious diseases, how they emerge and spread within a population and beyond. In particular, these data can help provide:

  1. early detection of outbreaks
  2. continuous monitoring of disease
  3. assessments of relevant behaviors and sentiments regarding disease control, such as vaccination
  4. a means for retrospectively examining events prior to an outbreak’s emergence that could have contributed to the outbreak or given hints as to whether the infection actually arose earlier than thought

Q: How would you compare digital surveillance in the SARS era with that for H7N9 flu today?

A: Digital epidemiology has come a long way. The kinds of data available have grown exponentially, as has the access to those data. With SARS, there were very limited data available, and you really had to dig deep into the Web to find out what was going on.

With H7N9, we’re drinking from a fire hose. We’re seeing amazing spread and volume of data over social channels coming out of China, especially through new sources like Weibo, a Chinese social media site analogous to Twitter. Remember, it was a hospital worker who broke the news to the world that H7N9 was causing human infections by sharing a photo of a patient’s medical record on Weibo. At HealthMap, we now have five curators working around the clock just on Chinese data feeds.

Q: How about MERS-CoV?

A: The MERS-CoV situation is very different and is closer to that with SARS. Internet penetration is high in the Middle East, but there isn’t the same capability to post and share ground-level medical and health information. The data stream is there, and epidemiologists are benefitting from it, but the depth and breadth aren’t what we’d like to see. As with SARS, it’s not as complete, and we’re finding things out after the fact.

Q: We have Web-based news outlets, mobile text data, social media. Are other data sources coming around the bend?

A: There’s room for all kinds of data. We’re particularly interested in crowdsourced data because it’s more structured, lets you answer questions more directly and has great potential given the right kind of incentive structure. For instance, HealthMap lets users directly submit disease reports via the Web or the Outbreaks Near Me app. The Flu Near You project asks volunteers to submit weekly reports on their flu vaccine status and any flu-like symptoms they experience.

We’re not the first to look at crowdsourcing. Projects like Crisis Mappers and Ushahidi, for example, rely on text messages and other user-submitted data to track and map human conflicts and natural disasters that have a significant public health impact—think post-election violence in Kenya in 2008 or the 2010 Haiti earthquake.

There are other data sources out there as well. Kamran Khan, MD, MPH, and his Bio.Diaspora team are mining airline itineraries for signals that could alert them to disease spread via international travel. We’re working with the Wildlife Conservation Society to track reports of international wild animal trafficking, which can promote the spread of important zoonotic and veterinary infections. Google search query data have helped us examine whether the S-CHIP tobacco tax affected smoking rates, and Facebook user interests and “likes” are teaching us about localized obesity rates.

Q: Do you see public health agencies like the Centers for Disease Control and Prevention (CDC) or World Health Organization (WHO) starting to formally incorporate digital data into their surveillance programs?

Twitter screenshot tweets tagged #H7N9
Tweets hashtagged #H7N9 as of 4:45 in the afternoon on July 22, 2013. Within 30 seconds of taking this picture, another nine tweets had been posted.
A: These data are definitely becoming more accepted in public health practice. We know the CDC, for instance, is actively working on ways to feed social and digital data into their surveillance operations.

I don’t see digital techniques ever replacing traditional surveillance methods. There are still too many barriers to data access, especially in developing countries.

But those barriers existed in China 10 years ago, and now it’s a different game there. Access to mobile, social and Internet technologies is on the rise globally and at a rapid pace. Events like the Arab Spring are driving people to want more Internet freedom. More freedom makes digital epidemiology more effective.

Q: How do you think digital epidemiology will evolve over the next decade?

A: There is a lot of interest in developing diagnostic systems that could collect data from the field, like blood test results or body temperature, via mobile phones. And the “quantified self” movement is growing by leaps and bounds. We need to find out how to tap those data for health surveillance.

We also need to move from event detection to behavior detection. There are lots of things people disclose online about their attitudes toward different subjects, like concerns about vaccinations, something Marcel Salathé, PhD, and his group at Penn State are working on.

Q: You recently co-authored a paper in PLOS Currents Disasters about Twitter and the Boston Marathon bombings, where you and your collaborators argue that social media has “…a role…in the early recognition and characterization of emergency events.” Do you think digital data can play similar roles in digital epidemiology and disaster response?

A: You’re basically utilizing the same resource—the Web—during an acute situation. Whether it’s an outbreak or a disaster, people will turn to the Web to search for information or to report the situation. These data and behaviors could fuel incredible situational awareness systems and provide intelligence unlike anything that traditional sources could achieve.

On September 18, 19 and 20, HealthMap will co-host the 2nd International Conference on Digital Disease Detection in San Francisco. To learn more, visit the conference website or search Twitter for the hashtag #digdisdet.