As digital data expands, anonymity may become a mathematical impossibility.
In 1995, the European Union introduced privacy legislation that defined “personal data” as any information that could identify a person, directly or indirectly. The legislators were apparently thinking of things like documents with an identification number, and they wanted them protected just as if they carried your name.
Today, that definition encompasses far more information than those European legislators could ever have imagined—easily more than all the bits and bytes in the entire world when they wrote their law 18 years ago.
Here’s what happened. First, the amount of data created each year has grown exponentially: it reached 2.8 zettabytes in 2012, a number that’s as gigantic as it sounds, and will double again by 2015, according to the consultancy IDC. Of that, about three-quarters is generated by individuals as they create and move digital files. A typical American office worker produces 1.8 million megabytes of data each year. That is about 5,000 megabytes a day, including downloaded movies, Word files, e-mail, and the bits generated by computers as that information is moved along mobile networks or across the Internet....
The greater the amount of personal data that becomes available, the more informative the data gets. In fact, with enough data, it’s even possible to discover information about a person’s future. Last year Adam Sadilek, a University of Rochester researcher, and John Krumm, an engineer at Microsoft’s research lab, showed they could predict a person’s approximate location up to 80 weeks into the future, with an accuracy above 80 percent. To get there, the pair mined what they described as a “massive data set” collecting 32,000 days of GPS readings taken from 307 people and 396 vehicles.
They then imagined the commercial applications, like ads that say “Need a haircut? In four days, you will be within 100 meters of a salon that will have a $5 special at that time.”
Sadilek and Krumm called their system “Far Out.” That’s a pretty good description of where personal data is taking us.