In a datafied world, algorithms become the genes of society

Here is an interesting and slightly scary thought.  What is currently going on in the world of Big Data is a process of datafication (as distinct from digitisation).  The secret to using Big Data is first constructing a datafied map of the world you operate within.  A datafied map is a bit like a geological map, in that it consists of many layers, each of which is a relevant dataset.  Algorithms are what you then use to create the connections between the layers of this map and thus understand, or shape, the topography of your world.  (This is basically Big Data in a nutshell.)
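To make the metaphor concrete, here is a minimal toy sketch in Python. All the layer names and data are invented for illustration: each layer is a dataset keyed by the same entity, and the "algorithm" is simply a function that cross-references the layers to produce a picture of one point on the map.

```python
# Hypothetical layers of a datafied map (purely invented data).
purchase_layer = {"alice": ["coffee", "books"], "bob": ["games"]}
location_layer = {"alice": "london", "bob": "berlin"}
social_layer = {"alice": ["bob"], "bob": ["alice"]}

def profile(person):
    """A trivial 'algorithm': connect the layers for one person."""
    return {
        "buys": purchase_layer.get(person, []),
        "lives_in": location_layer.get(person),
        "connected_to": social_layer.get(person, []),
    }

print(profile("alice"))
# e.g. {'buys': ['coffee', 'books'], 'lives_in': 'london', 'connected_to': ['bob']}
```

The point of the sketch is that none of the individual layers is especially revealing on its own; it is the connecting function — the algorithm — that shapes the resulting picture.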

In this respect, algorithms are a bit like genes.  They are the little, hidden bits of code which nonetheless play a fundamental role in shaping the overall organism – be that organism ‘brand world’, ‘consumer world’, ‘citizen world’ or ‘The Actual World’ (i.e. society) – whatever world it is that has been datafied in the first place.  This is slightly scary, given that we are engaged in a sort of reverse human genome project at the moment: instead of trying to discover and expose these algorithmic genes and highlight their effects, the people making them are doing their best to hide them and cover their traces.  I have a theory that none of the people who really understand Big Data are actually talking about it – because they are afraid that if they did, someone would tell them to stop.  The only people giving presentations on Big Data at the moment are small data people sensing a Big Business Opportunity.

But it gets more scary if you marry this analogy (OK, it is only an analogy) to the work of Richard Dawkins.  It would be a secular marriage, obviously.  Dawkins’ most important work in the field of evolutionary biology was defining the concept of the selfish gene.  This idea proposed (in fact, I believe, demonstrated) that Darwin (or Darwinism) was not quite right in focusing on the concept of survival of the fittest, in that the real battle for survival was not really occurring between individual organisms, but between the genes contained within those organisms.  The fate of the organism was largely a secondary consequence of this conflict.

Apply this idea to a datafied society and you end up in a place where everything that happens in our world becomes a secondary consequence of a hidden struggle for survival between algorithms.  Cue Hollywood movie.

On a more immediate and practical level, this is a further reason why algorithmic transparency – the exposure of algorithms to public scrutiny – must become a critical component of any regulatory framework for the world of Big Data (the world of the algorithm).



  1. Pingback: Cambridge Analytica, Facebook and data: was it illegal, does that matter? - Richard Stacy
  2. Marius

    The assumption that “the fate of the organism was largely a secondary consequence of this conflict” is too reductionist. You are suggesting biological determinism. See the “Human Behavioral Biology” Stanford lectures on YouTube about this.

    The same thing applies to algorithms. An algorithm in isolation can be a complicated thing. An algorithm as a component of a subsystem of a larger system is a complex thing. Studying the algo and making it transparent doesn’t mean you will make the system better.

    In that sense, even if the algorithms are genes, you cannot understand the system by studying the individual algorithms. You need to see the mechanisms through which the algos interact, how they are rewarded, etc. Even if you make the algos public and transparent, you might not get anywhere. Systems theory suggests that well-intentioned agents in a system with the wrong incentives will end up behaving maliciously. It’s not that algos are complicated, it’s that their interaction is complex. It’s good to understand the individual components, but better to focus on the incentive system.

    • RichardStacy

      The point I was trying to make is that we are, to a large extent, a product of our genes, and the more we understand about genetics (for understanding you could read making transparent), the more we are appreciating the extent of this. My understanding is also that the point Richard Dawkins was making in ‘The Selfish Gene’ is that genes exist within a collaborative system that is an organism, but the genes themselves are nonetheless motivated to behave in a selfish way. And that this could be a good way of thinking about algorithms, so that people start to recognise their potential significance and the importance of ensuring that algorithms, and the systems within which they operate, are exposed to public scrutiny.

      • Marius

        I agree with your point. Both approaches (the focus on the system as a whole and the focus on the algorithms individually) yield results. If anything goes wrong, trying to fix the problem using only one approach can result in fixing the symptom and not the cause.

        Nevertheless, we’re arguing over details.

        Algos that serve, or will serve, as public goods or institutions need transparency and scrutiny. That is, for example, the case with encryption algos: unless one is open source and peer reviewed, nobody will use it. This logic should be applied across everything.

        The problem arises when an algo like this is just a neural network, and what really matters is the data used to train it. I doubt that sort of data will ever be exposed.
