Here is a small article on Big Data I wrote as the opening shot in the Business Technology supplement published yesterday in the Sunday Telegraph.
Big Data is certainly a big buzzword, but there are those out there who say Big Data is nothing really new. As a rule I find these people have careers based on what we can now call small data (or perhaps that should be Small Data). Big Data certainly is something new, and there are two reasons why it is aptly named.
First, Big Data is really big. It is not just a bit larger than the data we had before, nor is it just lots more of small data. Big Data is defined by the fact that it is so large that it cannot be handled by the tools or techniques conventionally associated with data analysis (one of the reasons it rubs small data people up the wrong way). This also means we can use it to do things which were not possible when all we had was small data.
Data is now big because our ability to capture and store it has exploded. We now have sensors in everything, from cars to kettles, and while in the past their output was used for real-time diagnostics, we can now upload and store this information, a process that often happens automatically as objects themselves become connected to the internet: the so-called Internet of Things. Social networking tools have also encouraged individuals to generate vast amounts of information about themselves: Facebook recently revealed that it sits on top of a data well that gushes 500 terabytes of data every day. This has created a perfect environment for the growth of the algorithm (The Algorithm). In fact, when we talk about Big Data we are really talking about algorithms, which are the tools that feed on data in order to produce useful information. Algorithms don’t worry about being swamped by data in the way that small data analysts do. As far as they are concerned, the more data they can crunch the better.
This brings us to the other reason why Big Data is such an apt name. Explain to anyone what it is that algorithms can do in a world of Big Data and what inevitably springs to mind is George Orwell, 1984 and Big Brother. The alliterative association between Big Brother and Big Data is no coincidence. Back in March I was talking at an event organised by the EU Council and advanced the claim to the assembled Eurocrats that the algorithm is the most powerful instrument for social control invented since the sword, and that currently governments and society have no methods available to control what algorithms can get up to. Algorithms are masters of using data in ways for which it was not originally intended. What you thought of as inconsequential tweets or Facebook updates could be used to refuse you a mortgage, set your insurance premium or decide if you are a potential criminal. The hunger of the algorithm for data means that no data is inconsequential. In the world of Big Data all data has consequences. Algorithms also eat for lunch all of the methods we currently have in place to provide data protection and privacy: something I also pointed out to the Euro regulators. In a world where even a kettle can spy on your life, the right to anonymity is simply not sustainable: you may as well log out of life (and coffee) itself.
To end on a slightly lighter note, consider the sage words of @BigDataBorat (Learnings of Big Data for Make Nation of Kazakhstan #1 Leading Data Scientist Nation) who tweeted “When you monitor database is #SmallData. When database monitor you is #BigData”.