The other day, I was doing research on invisibility for an upcoming novel. As improbable as it may seem, I found more information than I needed in a matter of seconds, about the same amount of time it must have taken the Boston bombers to learn how to convert a pressure cooker into a lethal weapon. The web is not only a superhighway that brings us limitless information; the sheer volume of that information is also altering the way we look at the world. (“The Rise of Big Data” by Kenneth Cukier and Viktor Mayer-Schoenberger, Foreign Affairs, May/June 2013, pg. 29.)
In the past, researchers made predictions based on a few targeted samples. Exit polls during elections are an example, and the statistics derived from them are always accompanied by a margin of error to account for the uncertainty of sampling. In this manner, researchers identified a link between high cholesterol and heart problems.
Today, the World Wide Web makes it possible to collect so much data (“big data,” or “data in the wild”) that correlations can be discovered without the use of targeted samples. (Ibid., pg. 31.) In 2009, Google published a paper in Nature based on 50 million search hits on the subject of influenza. The searches were gathered between 2003 and 2008, without concern for what motivated them. A student might have been doing a school paper or someone might have been worried about his or her symptoms. It didn’t matter.
Once the Google folks obtained their data, they compared their results with the data from the Centers for Disease Control and Prevention. What they found was a relationship between flu epidemics and the volume of influenza queries. (Ibid., pg. 33.)
Google’s correlation may seem intuitive, but not all correlations are. New York City, for example, did data mining on fires and applications for building permits. Without worrying about the cause behind the correlations, the city found that “buildings obtaining permits for exterior brickwork correlated with lower risks of severe fire.” (Ibid., pg. 36.) The city used this information to set priorities for deploying its limited number of safety inspectors. The result was a reduction in the number of fires.
Of course, collecting “big data” has its dark side. In the film Minority Report, citizens were arrested on the probability that they would commit a crime. Tom Cruise saved humanity by beating probability. Nonetheless, this shift in our thinking from searching for facts, which can be painstaking and expensive, to searching for probabilities will not only be cost-effective but may also uncover some startling relationships. What it will do for dating sites, I can only guess.