A data scientist visualizes the sprawling site so you can find that weird, niche-interest group you never even knew to look for.
Randy Olson was a “complete amateur” when he began haunting the data visualization corners of the social networking and content-sharing site reddit.
“I started listing my [visualizations] in these subreddits,” he says, and quickly began getting feedback. “That’s how reddit is really useful. Techies, professionals are on there who will help you as a community to help build up your things.”
That was a few years ago. This month, Olson graduated from Michigan State University with a Ph.D. in computer science; now he’s a postdoctoral researcher at the University of Pennsylvania, where he’s studying artificial intelligence. And he’s much, much better at data visualization. Olson is, in fact, a community leader at the data visualization subreddit /r/DataIsBeautiful. He’s also created this beautiful, sprawling map of the entire reddit universe:
In the map above, interests with many related subreddits are in the deepest red, while those on the fringe are closer to blue. Each subreddit’s proximity to others is determined by the activity of over 850,000 active reddit users.
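For readers curious how user activity could translate into distance on a map, here is a minimal sketch. It assumes, purely for illustration, that two subreddits are "close" when a large share of their active users overlap; the article does not spell out the exact weighting Olson and Neal use, and the function name, usernames, and toy data below are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def subreddit_overlap(activity):
    """Score each pair of subreddits by how many users they share.

    `activity` is an iterable of (user, subreddit) pairs. The score is the
    Jaccard overlap of the two user sets: shared users / all users in either.
    This measure is an assumption for illustration, not the paper's method.
    """
    users_by_sub = defaultdict(set)
    for user, sub in activity:
        users_by_sub[sub].add(user)

    overlap = {}
    for a, b in combinations(sorted(users_by_sub), 2):
        shared = users_by_sub[a] & users_by_sub[b]
        if shared:  # keep only pairs with at least one user in common
            union = users_by_sub[a] | users_by_sub[b]
            overlap[(a, b)] = len(shared) / len(union)
    return overlap


# Made-up toy data: who posted where.
activity = [
    ("ann", "r/transit"), ("ann", "r/TrainPorn"),
    ("bob", "r/transit"), ("bob", "r/TrainPorn"),
    ("cat", "r/transit"), ("cat", "r/politics"),
    ("dan", "r/politics"),
]

# Print the pairs from strongest overlap to weakest.
ranked = sorted(subreddit_overlap(activity).items(), key=lambda kv: -kv[1])
for pair, score in ranked:
    print(pair, round(score, 2))
```

In this toy example, r/transit and r/TrainPorn share most of their users and would sit near each other, while r/TrainPorn and r/politics share none and get no link at all.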
Why do we need a map of a place that doesn’t actually exist? As Olson explains in a paper published this week in the journal PeerJ Computer Science, most social networking sites are organized haphazardly, their architecture built by enthusiastic users with niche interests that differ, perhaps, from those of the Japanese organizing maven Marie Kondo. It can be difficult, then, for other enthusiasts to find them. On reddit, many users start at the site’s famous front page but don’t get much further than that.
To guide users beyond their well-trodden web paths, Olson and his co-author, Zachary P. Neal, scrape real-time data from reddit to create a constantly updating, visualized network of the site. (You can play with it here.) Click on one niche, say the subreddit “r/transit,” and the interactive map will lead you to other subreddits on the topic. If you like r/transit, you’ll love r/TrainPorn.
Let’s get a little more whimsical. Here’s the My Little Pony cluster:
And the folks who are into politics (and also snacks):
Olson says he’s experimenting with bringing the same approach to other social networking hubs, like Facebook and Twitter. Those sites already have a similar “tagging” architecture that should make for some relatively simple mapping—if the companies would be willing to hand over their data. “There is a sort of hidden structure in these sites that this [mapping] brings to light,” Olson says.
A tip from an expert: “If you have an interest, no matter how dark it is, there’s a subreddit for it,” Olson says. Now it’s a little easier to find.