What is big data?
Money doesn’t make the world go around, data does
But what is Big Data? The definition from Gartner, the American research and advisory firm, is:
Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.
Essentially, a vague term gets a vague definition. So let’s apply a little focus.
Everything you do has data associated with it. The problem is that the quantity of that data is so vast that in terms of gaining useful information, you often can’t see the wood for the trees.
Our capability to digest data has increased by orders of magnitude. It would take an original Bell 103 modem from 1962 over 18 hours to deliver what a modern internet connection can do in just one second.
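As a rough check on that comparison, the sum can be worked through in a few lines. This is only a sketch: it assumes the Bell 103's nominal rate of 300 bits per second and ignores any protocol overhead, and the variable names are purely illustrative.

```python
# Back-of-the-envelope check of the modem comparison.
# Assumption: the Bell 103 runs at its nominal 300 bits per second
# for the whole period, with no overhead or retransmission.
BELL_103_BITS_PER_SECOND = 300
HOURS = 18

seconds = HOURS * 60 * 60
bits_delivered = BELL_103_BITS_PER_SECOND * seconds
megabits = bits_delivered / 1_000_000

print(f"{megabits:.2f} megabits in {HOURS} hours")  # prints "19.44 megabits in 18 hours"
```

In other words, on these assumptions the 'modern internet connection' in the comparison is one running at roughly 20 megabits per second.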
Everything everyone else does has data associated with it too. With this data, however, there is a variety issue as well as a volume issue. That is, two data sets may not be compatible, and it may be difficult to work out exactly how they correlate (or even whether there is a correlation at all).
So why is it useful?
This incomprehensible volume of data, whirling constantly around us at blinding velocity from an innumerable variety of sources, is at first glance confounding. But for many businesses it’s the answer to a question first asked by Douglas Adams nearly forty years ago:
What is the secret to life, the universe, and everything?
The answer isn’t 42, it’s Big Data.
Big Data is digital Zen: everything is connected. Want to know why some supermarkets put beer next to nappies? Big Data. Why is your keyboard not alphabetical? Big Data. How do we know e-mails are more likely to be opened on a Tuesday morning? Big Data. Want to know why you need a ‘black box’ in your car to reduce insurance? You guessed it… Big Data. The list goes on. (And on, and on…)
Just how ‘big’ does Big Data have to be?
It really depends on what you want to find out. If you’re not sure, then lots. But in that case you need to be very careful about the suppositions you make and the correlations you draw, as Tyler Vigen’s fabulous work on spurious correlations shows. There is, however, an increasing movement away from Big Data and towards Lean Data. Lean Data is the antithesis of Big Data, in that you ask the question first, then gather the data. This approach lets you be much more focused in modelling your data, and helps you get a more informative and less biased answer from it.
In my opinion it boils down to the truism:
‘work smarter, not harder.’
How does it work?
As I alluded to above, there are two main approaches to using Big Data:
- Keep all data and compare it all to see what correlates.
- Ask appropriate questions, then gather and contrast the data you think is relevant.
Option one works fine if you have lots of time and money, and you know there is a lot you don’t yet know that would benefit you. But if you have constraints, or if there are just a few things whose relationship you want to measure, then the second option, Lean Data, may work better than Big Data.
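To see why trawling everything needs care, here is a small sketch of the two options side by side. The data is entirely made up: 200 unrelated random 'metrics' with 10 readings each (both numbers are arbitrary assumptions), so any correlation it finds is spurious by construction. The helper pearson is written out by hand so the example needs nothing beyond the standard library.

```python
import itertools
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: 200 unrelated "metrics", each with 10 readings.
series = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(200)]

# Option one: compare everything with everything and keep the best match.
all_pairs = list(itertools.combinations(range(len(series)), 2))
strongest = max(abs(pearson(series[i], series[j])) for i, j in all_pairs)

# Option two: pick one pair up front (the "question") and test only that.
chosen = abs(pearson(series[0], series[1]))

print(f"pairs compared: {len(all_pairs)}")
print(f"strongest correlation found by trawling: {strongest:.2f}")
print(f"correlation for the one pre-chosen pair: {chosen:.2f}")
```

With nearly twenty thousand pairs to choose from, the trawl will almost certainly turn up a pair that looks strongly related purely by chance, which is the trap the spurious-correlation examples illustrate. Asking the question first means there is only one comparison to be fooled by.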
The actual analysis of the data is a little outside what I wanted to cover here, but there is a great sequence of articles on how to analyse data on the ‘Skills you Need’ website, ranging from beginner to advanced.
Why should I get involved?
Data is one of the world’s most valuable commodities, but its value is subjective: information that I find valuable may have no value to anyone else. If you have data, and it is of value, even if only to yourself, then you should at least look into what it can do for you, the reward you will get from it, and the effort you will need to put into getting it into a usable form.
How can Big Data bring commercial advantage?
Even if you can’t get value from the data you have access to, it’s also worth thinking about how you can contribute to someone else’s data. What data is associated with the products and services that you provide, and what value might it have to your customers? How could you enhance your products and services so that they capture that data, and how can you commercialise it?
Will the ROI be worthwhile?
Without knowing more about what you want to achieve, it’s difficult to say, but Big Data innovations can be surprisingly low cost. The keys, as always, are a tight, unwavering specification and control over deadlines, particularly with regard to testing and roll-out.