This time, based on Twitter data.
The map above is the latest play on the old "pop" vs. "soda" map of the United States. Edwin Chen, a data scientist at Twitter who conducted math and linguistics research at MIT, compiled tweets that used "coke," "soda," or "pop" to describe a soft drink. Red on the map marks places where "coke" is tweeted most often, green indicates "pop," and blue is for "soda." Chen describes his method for organizing, cleaning, and aggregating the data:
Here, I bucketed all tweets within a 0.333 latitude/longitude radius, calculated the term distribution within each bucket, and colored each bucket with the word furthest from its overall mean. I also sized each point according to the (log-transformed) number of tweets in the bucket.
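A rough sketch of the aggregation Chen describes might look like the following. It is not his code. It assumes tweets arrive as hypothetical `(lat, lon, term)` tuples, snaps them to a square 0.333-degree grid rather than a true radius, and reads "furthest from its overall mean" as the term whose share in a bucket exceeds its nationwide share by the largest margin. The function names `bucket_key` and `summarize` are made up for illustration.

```python
import math
from collections import Counter, defaultdict

TERMS = ("coke", "pop", "soda")
CELL = 0.333  # bucket size in degrees, taken from Chen's description


def bucket_key(lat, lon, cell=CELL):
    # Assumption: snap each coordinate to a square grid cell.
    return (math.floor(lat / cell), math.floor(lon / cell))


def summarize(tweets):
    """tweets: iterable of (lat, lon, term) tuples, with term in TERMS."""
    buckets = defaultdict(Counter)
    overall = Counter()
    for lat, lon, term in tweets:
        buckets[bucket_key(lat, lon)][term] += 1
        overall[term] += 1

    total = sum(overall.values())
    overall_share = {t: overall[t] / total for t in TERMS}

    result = {}
    for key, counts in buckets.items():
        n = sum(counts.values())
        # Assumption: "furthest from the mean" = largest positive gap
        # between the bucket's share and the overall share of a term.
        deviation = {t: counts[t] / n - overall_share[t] for t in TERMS}
        top_term = max(TERMS, key=lambda t: deviation[t])
        # Point size grows with the log of the tweet count in the bucket.
        result[key] = (top_term, math.log(n + 1))
    return result
```

Calling `summarize` on a list of tuples returns, for each grid cell, the term to color it with and a log-scaled size for the point. Comparing each bucket against the overall distribution, rather than simply picking the most common word, keeps a nationally dominant term from painting the whole map one color.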
Chen's findings reflect linguistic boundaries similar to those seen in prior maps, such as this one by Samuel Arbesman. The East Coast and much of the West (California) are blue, indicating a prevalence of the term "soda." The Midwest tends to use "pop." And the South (and many places in between) prefers "coke."