Public health researchers are already using social media to track the rise and spread of infectious diseases in real time. Flu season, for instance, now announces itself each winter with a spike in Google searches for the term and its related symptoms. Using similar data, we can also trace the geography of dengue fever. But chronic disease has been a different matter. There's no mapping the spread of obesity on Twitter, or identifying an outbreak of high Body Mass Index through Google search queries.
There is, however, one place on the Internet where people everywhere – in really fine-grained detail – chronicle the ongoing conditions of their daily lives: Facebook. Your Facebook page is in many ways a reflection of you (or the you that you want your friends to see). In theory, it should also reflect whether you're out running marathons and bookmarking Cooking Light recipes, or whether you spent the last five weekends on the couch with the remote.
"We know there’s some relationship between activity in the real world and obesity and being sedentary," says Rumi Chunara, an instructor with the Boston Children's Hospital Informatics Program and the Harvard Medical School. She's the lead author on a new study, just published online at PLOS ONE, that attempts to draw yet another relationship between obesity and real-world activity and what we say about it on the Internet. "We went in thinking that perhaps there’s some parallel relationship about the online environment and real world health outcomes."
The researchers were able to look at data on more than 57 million Facebook users in the United States, and more than 8 million people claiming some connection on Facebook to New York City, unheard-of sample sizes for a public health study. Using aggregated, anonymized data that Facebook packages for its advertisers, they then studied the interests people express – through status updates, likes, profile categories – in outdoor fitness and health and wellness on the one hand, and television on the other. Facebook prepackages this data, smashing together your status update about your daily run with the fact that you "liked" a boxing gym. Similarly, all your commentary on Oblivion gets categorized with the eight cable drama series you've favorited.
(Of note: The study did not look at "sports," since most of us don't actually play baseball; we watch it while drinking beer.)
The researchers then cross-referenced this information with geo-tagged survey data on Body Mass Index from the Centers for Disease Control and Prevention (nationwide) and from the EpiQuery Community Health Survey in New York City. The result? Neighborhoods in New York, and cities nationwide, with a higher percentage of active interests on Facebook in fact had lower obesity rates. The opposite was true of people who seemed on Facebook to be really into TV.
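For readers curious what that kind of cross-referencing looks like in practice, here is a minimal sketch in Python. It is not the study's method: the area names and every number below are made up for illustration, and a plain Pearson correlation stands in for whatever statistical model the researchers actually used. It simply joins two datasets on a shared area key and checks whether more active interest goes with less obesity.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL data: share of Facebook users in each area with "active"
# interests, in percent. These are invented numbers, not study results.
facebook_active_pct = {"area_a": 10.0, "area_b": 20.0, "area_c": 30.0, "area_d": 40.0}

# HYPOTHETICAL data: surveyed obesity rate in each area, in percent.
survey_obesity_pct = {
    "area_a": 32.0,
    "area_b": 28.0,
    "area_c": 27.0,
    "area_d": 21.0,
    "area_e": 25.0,  # no Facebook data for this area, so it drops out of the join
}

# Cross-reference: keep only the areas present in both datasets.
shared_areas = sorted(set(facebook_active_pct) & set(survey_obesity_pct))

r = pearson(
    [facebook_active_pct[a] for a in shared_areas],
    [survey_obesity_pct[a] for a in shared_areas],
)
print(f"areas compared: {len(shared_areas)}, correlation: {r:.3f}")
```

A value of r near -1 would mean areas with more active interests tend to have lower obesity rates. It says nothing about cause, which is the same caveat that applies to the real data.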
Facebook aggregates this data down to the zip code level, meaning that analyses like these might be used by public health officials to figure out where to target interventions.
"It’s a new information source for understanding where obesity is," Chunara says. The idea isn't that we should replace traditional health surveys with social media data. "But how can we use these to augment what we already know about populations?"
In New York, the Northeast Bronx neighborhood had the highest rate of TV-related interests. And the obesity rate there is 27.5 percent higher than in Greenpoint, the New York neighborhood with seemingly the least interest in TV. These maps from the study show the full results for New York:
Similar patterns emerge from national data collected in April and May of last year: Obesity rates were 12 percent lower in Coeur d'Alene, Idaho – where the highest percentage of people on Facebook expressed active interests – than in Kansas City, the city with the lowest interest in doing something other than watching TV. The Myrtle Beach area of South Carolina had the highest rates of TV interest, and Eugene, Ore., the lowest. The obesity rate in Myrtle Beach is almost 4 percent higher than in Eugene.
Of course, TV interest and mentions of active pursuits are rough proxies for healthy behavior (it's also possible that people who say they run on Facebook don't actually run anywhere in real life). But these findings suggest that a more nuanced survey of Facebook data might be used to help understand the health of different communities. Layer data about the built environment atop what we might learn from these digital communities, and the picture may get even clearer. And it's at least cheaper to study community health this way than to call up 57 million people.