The results of a study of 82 million tweets from 1,300 counties.
Governments from the local to the national are increasingly interested in "wellbeing," a subjective notion that is harder to measure than per capita income or GDP but comes closer to capturing what we more vaguely think of as happiness. We'd all like to have it: quality of life, life satisfaction, fulfillment.
As researchers from the University of Pennsylvania and Michigan State University put it in a recent study on the topic, with a technological twist:
Happiness matters. For example, when a sample of Britons were asked what the prime objective of their government should be – “greatest happiness” or “greatest wealth”, 81% answered with happiness (Easton 2006). In a set of other studies conducted around the world, 69% of people on average rate well-being as their most important life outcome (Diener 2000). Psychologists still argue about how happiness should be defined, but few would deny that people desire it.
We typically gauge happiness, among individuals and whole communities or demographics, with survey questions like "how satisfied are you with your life?" But surveys cost money and contain their own biases. And so these academics, led by Johannes Eichstaedt and Andrew Schwartz, began to wonder if they could glean some sense of a community's wellbeing from the firehose of daily updates many of us voluntarily communicate about ourselves on Twitter.
Alexis Madrigal wrote several months ago about an earlier research project that tried something like this, manually coding the "happiness content" of tweets coming from different parts of the country to find the happiest cities in America. This latest study, also described by the authors on the Follow the Crowd research blog, takes a slightly different approach and also dissects some of the correlates of "wellbeing" embedded in the language of our tweets.
The study examined 82 million tweets, mapped from nearly 1,300 U.S. counties and collected between June 2009 and March 2010 (each county had at least 30,000 Twitter words geotagged to it). As the researchers found, Twitter can reveal a lot about wellbeing, not just among individuals (that's not such an impressive feat), but at the level of whole communities.
The researchers built a model of language drawn from these tweets that could significantly predict community-level wellbeing, as measured against more traditional results from surveys. Socioeconomic information about a place is often considered a rough proxy for wellbeing (people tend to be happier when they're not broke). But these researchers found that by combining socioeconomic data with this model of Twitter language, they could build a particularly powerful tool for predicting wellbeing, without the use of any formal surveys at all.
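For readers curious what "combining" the two kinds of data can look like in practice, here is a minimal sketch in Python. It is not the researchers' actual model: the word list borrows a few terms the article mentions, while the county figures, the income scaling, the function names, and the use of plain least-squares regression are all assumptions made up for illustration.

```python
# Hypothetical lexicon: a few outdoor/exercise words mentioned in the article.
OUTDOOR_WORDS = {"training", "gym", "waves", "mountains", "camping"}


def topic_rate(tweets, lexicon):
    """Return the fraction of a county's tweeted words that appear in the lexicon."""
    words = [w.strip('.,!?"#').lower() for t in tweets for w in t.split()]
    if not words:
        return 0.0
    return sum(w in lexicon for w in words) / len(words)


def fit_least_squares(rows, targets):
    """Fit intercept + weights by ordinary least squares (normal equations)."""
    X = [[1.0] + list(r) for r in rows]  # prepend a constant term
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(weights, row):
    """Apply the fitted intercept and weights to one county's features."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], row))


# Language feature for one invented county: 2 of 10 words match the lexicon.
rate = topic_rate(
    ["Camping in the mountains this weekend", "Stuck at work again"],
    OUTDOOR_WORDS,
)

# Invented counties: (outdoor-word rate, median income in tens of thousands).
features = [(0.02, 4.0), (0.05, 5.0), (0.01, 6.0), (0.04, 3.0)]
# Invented survey life-satisfaction scores for the same counties.
survey_scores = [6.6, 7.5, 7.0, 6.7]

weights = fit_least_squares(features, survey_scores)
estimate = predict(weights, (0.03, 5.0))  # a county with no survey data
print(weights, estimate)
```

The point of the sketch is the shape of the idea: survey scores are needed only to fit the weights, after which a county's tweets and socioeconomic figures alone produce an estimate.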
This map from the paper shows a measure of life satisfaction using more traditional survey results, in these same 1,300 counties:
And here is a map from the researchers' own predictive model combining socioeconomic factors and Twitter language:
Certain topics encoded in our tweets correlate particularly well with the counties that have high and low wellbeing. Tweets relating to exercise and the outdoors ("training," "gym," "waves," "mountains," "camping") rated positively, which the researchers suggest may tie back to evidence that exercise reduces the risk of depression.
Also on the high-wellbeing list: a cluster of words related to "ideas," "suggestions," and "advice," signs of people tapping their social networks to problem-solve. Tweets about "meetings" and "conferences" similarly suggest engagement. And tweets mentioning "support" and "donate" hint at pro-social activities that have also been linked to higher life satisfaction. These are some of the high-wellbeing (in green and blue) and low-wellbeing (red) word clusters that emerged from the study:
What's most compelling about the whole paper is that tweets from individual people seem to tell us something not just about their own wellbeing, but about the wellbeing of the places where they live. And this pattern holds even though we know that Twitter is its own self-selecting ecosystem, with an over-representation of young and technologically savvy users. As the researchers explain it (bold emphasis is ours):
The fundamental result of this paper is perhaps surprising: we can predict (on average) the happiness of one set of people (those who answered the [life satisfaction] questionnaires) from the tweets of other people (people in the same county). This is, however, consistent with findings from other methodologies. People in the same county tend to share the same culture and environmental affordances (e.g., hiking, music, or good employment), and attitudes towards them (being excited or bored).
Happiness is asserted to be contagious (Fowler and Christakis 2008) and it has been suggested that although educated people are happier, on average, than less educated ones, there is an even stronger benefit to living in a community of educated people with arts, culture and entertainment (Lawless and Lucas 2011). Thus, the tweets of other people can indicate what it’s like to live around them, influencing one’s own happiness.