When open data is too open.

The map below should concern you. This visualization, made by James Siddle, shows a single commuter's journeys using London's public bicycles in a six-month period between 2012 and 2013. Purple lines indicate round trips while orange lines represent one-way journeys.

[Map: one commuter's cycle-hire journeys. Credit: James Siddle]

Even without an intimate knowledge of London’s geography, it is hard not to reach a few obvious conclusions. This commuter appears to live in the Limehouse neighborhood, at the southeast corner of the map, and works at King's Cross, toward the northwest. She probably has close friends, family, or a partner in Bow, at the eastern edge of the map. Control for time, and that theory gets stronger:

[Map: the same commuter's morning journeys. Credit: James Siddle]

Those are journeys made between 4 a.m. and 10 a.m. They head in one direction: towards King’s Cross (in fact, to the only cycle docking station near the Guardian’s headquarters). And they come from two places, suggesting this person spends the night at a location that is not home.

Siddle says he had no desire to dig deeper, but a determined individual with just a little more information—a geocoded photograph, a tweet complaining about full docking stations—could probably identify this supposedly anonymous individual. "All that’s needed to work out who this profile belongs to is one bit of connecting information," writes Siddle on his blog.


Siddle obtained this information through datasets made publicly available by Transport for London (TfL), the authority that controls all transport in the British capital. He says he was shocked when he downloaded the data in February. The documentation that accompanied it did not indicate that the data would include customer IDs. (TfL says it has now plugged the hole.)

"It's not something you should have in that dataset," Siddle says. "Because there is no direct way to tie it to people, it's kind of in a grey area. But because of the nature of the data, all it takes is a little other data to know who that person is. For prolific bike users that's their life."

Another cyclist's movements show just how rich Transport for London’s dataset is. (James Siddle)

An interactive version of some of his findings, which allows you to filter by time of day and number of journeys per route, shows just how revealing the information can be. Pick morning or afternoon for "random_profile_2" and you can see where the cyclist probably works and lives. Click on “evening” and you know where he socializes.
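To see why a per-trip customer ID matters, consider a minimal sketch of that kind of filtering. It is not Siddle's code or TfL's data format: the record layout, the rider IDs, the station names and the `routes_in_window` helper are all invented for illustration. The point is only that once every trip carries the same ID, a few lines are enough to pull out one rider's habitual routes for a given time of day.

```python
from collections import Counter
from datetime import datetime

# Hypothetical trip records: (rider_id, start_station, end_station, start_time).
# The layout and every value here are invented for illustration; this is
# not TfL's schema and not real journey data.
TRIPS = [
    ("rider_1", "Station A", "Station C", "2012-09-03 07:45"),
    ("rider_1", "Station A", "Station C", "2012-09-04 07:50"),
    ("rider_1", "Station B", "Station C", "2012-09-05 08:10"),
    ("rider_1", "Station C", "Station A", "2012-09-05 18:30"),
    ("rider_2", "Station D", "Station A", "2012-09-03 09:00"),
]


def routes_in_window(trips, rider_id, start_hour, end_hour):
    """Count one rider's (start, end) routes that begin in [start_hour, end_hour)."""
    counts = Counter()
    for rider, start, end, when in trips:
        hour = datetime.strptime(when, "%Y-%m-%d %H:%M").hour
        if rider == rider_id and start_hour <= hour < end_hour:
            counts[(start, end)] += 1
    return counts


# Morning journeys (4 a.m. to 10 a.m.) for a single pseudonymous rider.
morning = routes_in_window(TRIPS, "rider_1", 4, 10)
for (start, end), n in morning.most_common():
    print(f"{start} -> {end}: {n}")
```

In this toy data the morning trips all end at one station and start from two, the same shape of pattern that stands out in the commuter's map. Without the shared ID, the same trips would be indistinguishable from everyone else's.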

Urban authorities, countries and international agencies around the world routinely release datasets to the public in the hope that tinkerers such as Siddle will find creative ways to make use of them, and perhaps even help services improve. In aggregate, such data are harmless. But as Quartz has reported several times over the past year, data linked to individuals can be used to draw detailed pictures of a person’s movements, connections, political beliefs and relationships.

Siddle says he alerted Transport for London before publishing his blog post but didn't hear back. TfL's general manager of cycle hire, Nick Aldworth, said:

We’re committed to improving transparency across all our services and publish a range of data for customers and stakeholders online. Due to an administrative error, anonymised user identification numbers were shown against individual trips made between 22 July 2012 and 2 February 2013. The data, which did not identify any individual customers online, was removed as soon as the matter was brought to our attention.

This post originally appeared on Quartz.
