Big Data: gold mine or fool’s gold?

(This was published in the print edition of Digital Age in Turkey earlier this month.  It also appeared a few days later as a Digital Age blog post – if you want to read it in Turkish!)

There is a lot of buzz about the concept of Big Data.  But is it really the potential gold mine that some are suggesting?

Back in July I was at the Marketing Week Live show in London participating in an event organised by IBM.  We were looking at data and consumer relationships within fashion retailing, using high-end women’s shoes as the example.  The big issue fashion retailers face is that everyone walking into a store is a stranger.  The sales assistants know nothing about them, other than what they can deduce from their appearance and any conversation they can then strike up.  We therefore asked ourselves the question: how might it be possible to use data from the digital environment so that potential customers were no longer strangers?  How might we be able to create a digital relationship so that when a potential consumer walks through the door the sales assistant would be able to call up this relationship history and pull this on-line contact into an off-line sales conversation?  One of the IBM analysts put it thus: “we need to be able to identify the exact moment a potential consumer starts to think about buying a new pair of shoes, identified from conversations they have with their friends in social networks, and be able to then join those conversations”.

Welcome to the world of Big Data.  In the world of Big Data it is theoretically possible to know as much about your consumers as they know about themselves: to be able to anticipate their every thought and desire and be there with an appropriate product or response.  It is a world of ultimate targeting and profiling and this world is tantalisingly within reach because of the huge amount of real-time, personal information consumers are giving away about themselves via their usage of social media tools.  Facebook is currently sitting on top of a data pipeline that is pumping 500 terabytes of behavioural data every single day.

The ‘actionability’ problem

But this brings us to the first of the big problems with Big Data: what the IBM people called ‘actionability’.  What can we actually do with this vast quantity of information?  How do we process it and sift out all of the possible moments when an individual becomes a potential consumer and when we have done this, what do we do next?

A large part of the practice of marketing to date has been about targeting, especially in relation to media and channel planning and the practice of Customer Relationship Management (CRM).  The problem is that no matter how good we think we have become at this, we haven’t actually been that good.  Even the most sophisticated CRM programmes have only got us to the point where we can find approximately the right time and place to put a slightly personalised advertising message in front of a potential consumer.  As a result, even if we can now use Big Data to find exactly the right time to talk to exactly the right person with exactly the right message, organisations are just not set up to handle this situation appropriately.

The world of social media is the world of the individual, whereas the world of traditional marketing is the world of the audience.  We have become very good at speaking to audiences with single messages, but we have no experience as to how to talk to individuals, where our behaviour has to be social and the information we give has to be highly specific and relevant and where we also have to recognise that the task is not simply to target conversations, but to create the permission to enter a conversation.

In reality, there are very few consumer conversations that it is possible for a brand to enter.  A consumer may be having a conversation about shoes with her friends, but this doesn’t mean that she has given permission to have this conversation interrupted by a shoe manufacturer or retailer.  So no matter how precisely we may be able to use Big Data to identify the conversations we would like to join, it is likely that the majority of our attempts to enter these conversations and strike up relationships will be rejected.

The permission problem

So ‘actionability’ is the first big problem with Big Data – relevancy of response and the creation of permission to enter a conversation or create a relationship.  And this question of permission brings us onto the next problem.  As we have seen, brands need to create permission from consumers to enter their conversations – but brands also need to create permission to have the data about these conversations in the first instance.  It is very easy for a consumer to see the collection of Big Data, once they know it is going on, as a form of digital spying.  In fact it is very hard to portray Big Data as anything else.  Its very name betrays its ultimate purpose in that it references the concept of Big Brother – the all-seeing, all-powerful, sinister force at the heart of George Orwell’s disturbing futuristic novel, Nineteen Eighty-Four (as well, of course, as the rather more light-hearted usage in the all-seeing reality TV series).

At the moment most users of tools such as Facebook and Twitter believe that they have some control over the sharing of their data, via the usage of various forms of privacy and access control.  They do not fully appreciate that Facebook, Twitter and Google themselves can see, and thus use, all of their data.  And perhaps most importantly they have no idea of the implications of what happens to this data once it falls into the hands of the Wizard of the World of Big Data – the Algorithm.

It is easy to imagine that the information we share on Facebook or Twitter is inconsequential – trivial stuff about ourselves which we don’t have a problem with other people seeing.  However, to an algorithm, such data is far from trivial.  Algorithms love large, broad-spectrum behavioural data sets, and of course there is no larger such set than that owned by Facebook.  Algorithms work by establishing patterns of behaviour and then trawling through oceans of data to find occurrences of similar patterns.  Thus if we know the digital pattern associated with a particular type of person or behaviour we want to identify, algorithms can pull these people out from the crowd – even if the data they mine to do this has nothing to do with the characteristics we wish to identify.  And they do this invisibly; the person who has been selected has no knowledge that they have been identified and also no way of making a connection between any action that happens as a result and the behaviour (or data) that triggered this.  To put it another way, we can easily design an algorithm to identify people who might, for example, be good (or bad) credit risks even if there is nothing within the data being mined that has anything to do with financial behaviour, money or attitude to risk.  Or to put it yet another way, algorithms can determine whether a bank will give you a loan based on tweets about (amongst other things) what you have for lunch.
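To make the pattern-matching idea concrete, here is a deliberately simplified sketch in Python.  Everything in it is an assumption made for illustration: the three behavioural features, the user names, the numbers and the similarity threshold are all invented, and real profiling systems are far more elaborate.  It simply shows how a known "digital pattern" can be compared against other people's seemingly trivial data to pull lookalikes out of the crowd.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero profile resembles nothing
    return dot / (norm_a * norm_b)


def find_matches(pattern, profiles, threshold=0.9):
    """Return the names of profiles whose behaviour resembles the pattern."""
    return [
        name
        for name, features in profiles.items()
        if cosine_similarity(pattern, features) >= threshold
    ]


# Hypothetical features: weekly counts of posts about
# [lunch, travel, late-night activity]. None of these is financial data.
known_pattern = [5, 1, 0]

# Hypothetical users and their (invented) behavioural counts.
profiles = {
    "user_a": [10, 2, 0],
    "user_b": [0, 1, 8],
    "user_c": [4, 1, 1],
}

matches = find_matches(known_pattern, profiles)
print(matches)
```

Notice that the people selected never see the comparison taking place, and nothing in the data describes the characteristic actually being inferred.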

This is where things really start to get interesting with Big Data.  As we have seen, Big Data may be limited in its ability to give us opportunities to create effective relationships with consumers.  However, it can be very useful in its ability to put people into very specific categories.  These may be categories of people we want to deal with (such as people with a higher propensity to buy our product) or they may be people we don’t want to deal with (such as people likely to default on a loan).

The ability to do this provides a tremendously powerful tool to brands.  But the problem is that this behaviour fundamentally violates one of the core principles of data protection and privacy – which is that people must be able to give informed consent to making their personal data available and also that this data be used only for a specific purpose, i.e. people need to be able to know precisely how their data will be used before they give consent for it to be shared.  Algorithms, however, are masters of using data in ways for which it was not originally intended – using tweets about eating habits to determine credit worthiness, for example.

At the moment, this isn’t a problem.  Facebook users don’t understand algorithms and they have no idea as to the consequences of sharing their inconsequential data.  But the algorithms cannot remain invisible indefinitely; they leave traces of their activities through the actions they then generate, and the more frequent these actions are, the more likely it is that consumers will start to question just how it is that brands seem to know so much about them.  And once they start finding out the answers to this question, their reaction is unlikely to be positive.  Their reaction is far more likely to be a demand for brands to stop spying on them and a demand that their social networks stop selling their data.  So even in this area of Big Data, where the opportunities are already obvious and available, one would have to question just how sustainable these opportunities are going to be in the long term.

So, despite all the hype and promise around Big Data, when you start to strip this concept back to its core, some significant problems are revealed.  These are the difficulties in using this data in a way which is relevant and acceptable (and thus ‘actionable’), combined with the lack of permission to have the data in the first place, which may ultimately severely restrict the amount of data brands will have at their disposal as well as placing tight constraints on how they can use it.  As the saying goes, “All that glitters is not gold”.
