Fooled By Data Visualisation

Please take a look at the first graph below. It could represent anything: sales by week number, a stock price, the temperature outside or a declining market share percentage. If a graph like this appeared in your business, remedial action would almost certainly be taken. But here is the rub. This graph was produced in Excel by simulating 20 fair coin tosses and counting how many heads appeared in those 20 tosses. This process was then repeated 30 times, and each point on the graph represents the number of heads from one set of 20 tosses. It is random.
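For anyone who wants to reproduce the experiment outside Excel, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the spreadsheet used for the graph: the function name `simulate_heads`, its parameters and the seed value are all made up for this example, and a different seed will give a different (equally meaningless) shape.

```python
import random


def simulate_heads(n_sets=30, tosses_per_set=20, seed=None):
    """Return the number of heads in each of n_sets sets of fair coin tosses.

    Hypothetical helper for illustration only; the original graph was
    produced in Excel.
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() < 0.5 for _ in range(tosses_per_set))
        for _ in range(n_sets)
    ]


# One run of the experiment: 30 sets of 20 tosses.
# The seed is arbitrary; it only makes the run repeatable.
counts = simulate_heads(seed=1)

# Crude text chart: one row per set, one '#' per head.
for week, heads in enumerate(counts, start=1):
    print(f"{week:2d} {'#' * heads} {heads}")
```

Run it a few times with different seeds and it is easy to find stretches that look like a convincing decline or recovery, even though every point comes from the same fair coin.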


Imagine you have 20 customers who on average each place an order once every two weeks. This graph could quite easily represent the number of orders taken each week for the first 17 weeks of the year. Alarm bells would be ringing, heads would roll and a hero would be born. Why a hero? Because things tend to revert to the mean, and it is fairly unlikely that three heads out of twenty coin tosses will repeat itself. And so our new Sales Director hero will be the unwitting recipient of a return to the mean. Give her or him a pay rise.
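How unlikely is a repeat of three heads in twenty tosses? The figure can be checked with the binomial distribution. The sketch below assumes a fair coin and independent tosses, as in the simulation; the helper name `prob_heads_at_most` is made up for this example.

```python
from math import comb  # available in Python 3.8+


def prob_heads_at_most(k, n=20):
    """Probability of k or fewer heads in n tosses of a fair coin.

    Hypothetical helper for illustration only.
    """
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n


# Exactly three heads in twenty tosses: C(20, 3) / 2^20.
p_exact = comb(20, 3) / 2 ** 20

# Three heads or fewer in twenty tosses.
p_at_most = prob_heads_at_most(3)

print(f"P(exactly 3 heads)   = {p_exact:.4%}")
print(f"P(3 heads or fewer)  = {p_at_most:.4%}")
```

Both probabilities are a little over one in a thousand, so whatever the new Sales Director does, the following week is overwhelmingly likely to look better.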

The diagram below shows how this random sequence (which should average 10) progresses. Notice how the heroic efforts of the new head of sales immediately recover the situation. Not.


Blindness to the random is one of the greatest dangers when using data visualisation tools. We will find trends when none really exist, and we will find other patterns (cyclical behaviour, for example) when none really exist. We are hard-wired to find patterns, and until recently this has served us quite well. But now we have to deal with probabilities, because in an uncertain world this is the only tool we really have. Our brains, however, have a hard time dealing with the uncertain. If you want more on this, read Thinking, Fast and Slow by Daniel Kahneman, a psychologist who won the Nobel prize in economics.

So let’s be categorical. Many of the patterns you will find using the latest generation of eye-candy rich visualisation tools will be nothing more than random accidents with no meaning whatsoever. It is highly likely that your business will suffer because of this willingness to see patterns where none exist. In fact the whole thing is becoming slightly silly, with phrases such as ‘the two second advantage’ implying that data visualisation will speed up recognition of patterns to such an extent that within two seconds you will be able to make a decision. I hope your résumé is up to date.

The solution to all of this has nothing to do with visualisation. It depends on a modest understanding of randomness and probability and, above all, on understanding your business. But of course no one is selling an expensive, career-enhancing software package to do these unappealing things.

There is an excellent example of big business decisions being based on nothing more than random variation in the book The Drunkard's Walk, by Leonard Mlodinow. A Hollywood film boss was sacked because of the diminishing success of the movies the company produced. The thing with movies is that they take years to get funded and produced, and so the pipeline this sacked boss had put in place continued for several years after he was gone. It was hugely successful, and he was sacked for nothing other than random variation. And everyone has heard of the experiment with monkeys throwing darts at the financial pages to generate an investment portfolio. It outperformed most of the professionally managed funds.

There is a lot of ego at stake here, and so data visualisation can certainly be used to glorify a rising trend and cast into hell those responsible for a falling trend – which of course is exactly how it will be used.

