London's Raucous Babble of Languages

Data engineers examined more than 3 million tweets to create this sprawling linguistic cartography.

Attention, London residents: If your Malay is feeling rusty and in need of conversational oil, try heading to the neighborhood just north of Kensington Gardens. That's where Austronesians are chatting up a storm, according to this fascinating map of London's languages.

The clamorous cartography is the result of nifty computer analysis by Ed Manley and James Cheshire, both students at University College London. (You might recall Cheshire from his map of London last names.) They used a tweaked version of a Google Chrome algorithm to examine more than 3 million tweets sent by London inhabitants this summer. By the end of their dogged data-sifting, they had detected more than 60 languages, including Tamil, Maltese, Tibetan, Urdu and Afrikaans.

With the help of geolocation, they then plotted the 10 most frequently tweeted languages to create the colorful and informative metropolis you see below (interactive version here):

(Color code: Spanish-Gray, French-Red, Turkish-Dark Blue, Arabic-Green, Portuguese-Purple, German-Orange, Italian-Yellow, Malay-Turquoise, Russian-Pink.)
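For readers curious how a "top 10 languages" layer might be assembled, here is a minimal sketch in Python. It is not Manley and Cheshire's code: the sample records, the language codes and the function names `top_languages` and `points_for` are all hypothetical, and it assumes each tweet has already been tagged with a detected language and a latitude/longitude pair.

```python
from collections import Counter

# Hypothetical sample: (detected_language, latitude, longitude) per tweet.
tweets = [
    ("en", 51.5074, -0.1278),
    ("fr", 51.5090, -0.1960),
    ("en", 51.5155, -0.0922),
    ("ru", 51.5080, -0.1281),
    ("fr", 51.5101, -0.1975),
    ("en", 51.5033, -0.1195),
]

def top_languages(records, n=10):
    """Return the n most common language codes, most frequent first."""
    counts = Counter(lang for lang, _lat, _lon in records)
    return [lang for lang, _count in counts.most_common(n)]

def points_for(records, languages):
    """Group coordinates by language, keeping only the selected languages."""
    keep = set(languages)
    grouped = {lang: [] for lang in languages}
    for lang, lat, lon in records:
        if lang in keep:
            grouped[lang].append((lat, lon))
    return grouped

# With this tiny sample, ask for the top 2 instead of the top 10.
top = top_languages(tweets, n=2)
layers = points_for(tweets, top)
```

Each entry in `layers` is a list of coordinates that could be drawn in its own color, one color per language.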

Don't be fooled by the apparent diversity of tongues, however. London may be a bubbling crockpot of cultures, but on Twitter most everyone is yammering in English (92.5 percent of 3.3 million tweets, shown as dark-gray dots). Other things worth noting: The people who parler en français hang around hip Notting Hill and the Institut Français. Russian speakers congregate in central London. The blob of multi-language communication northeast of downtown marks the Olympic Stadium while the Games were taking place. And 1.4 million tweets did not get logged by Manley and Cheshire, presumably because they were too short or their authors had piss-poor spelling.

In a humorous side note, the map's creators decided to exclude Tagalog from their analysis, not because of its disuse but because of its similarity to popular Internet utterances. Says Manley:

One issue with this approach that I did note was the surprising popularity of Tagalog, a language of the Philippines, which initially was identified as the 7th most tweeted language. On further investigation, I found that many of these classifications included just uses of English terms such as 'hahahahaha', 'ahhhhhhh' and 'lololololol'. I don't know much about Tagalog but it sounds like a fun language.
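Manley and Cheshire simply dropped Tagalog from the map. A different way to handle the same problem, shown here only as an illustrative sketch and not as anything the researchers did, would be to strip laughter-style tokens before handing a tweet to a language detector. The pattern and the helper names `is_filler` and `strip_filler` are assumptions made for this example.

```python
import re

# A short unit (1 to 3 letters) repeated at least three times in a row,
# as in "hahahahaha", "ahhhhhhh" or "lololololol".
REPEATED_UNIT = re.compile(r"([a-z]{1,3}?)\1{2,}")

def is_filler(token):
    """True if the token contains a short unit repeated three or more times."""
    return REPEATED_UNIT.search(token.lower()) is not None

def strip_filler(text):
    """Drop filler tokens so only ordinary words reach the language detector."""
    return " ".join(word for word in text.split() if not is_filler(word))

cleaned = strip_filler("hahahahaha see you in London")
only_laughter = strip_filler("lololololol")
```

A tweet that comes back empty, like `only_laughter` here, could be skipped rather than classified. The heuristic is crude and would also discard legitimate words that happen to contain a tripled letter or syllable.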

Map courtesy of Ed Manley and James Cheshire.

About the Author

  • John Metcalfe
    John Metcalfe is CityLab’s Bay Area bureau chief, based in Oakland. His coverage focuses on climate change and the science of cities.