Beware the wiles of data scientists, and hail the naiveté of the ordinary businessperson …
As the proponents of big data analytics relentlessly beat the drum of "becoming data-driven," it's advisable to step back and ask two questions. First, are businesspeople sufficiently adept at understanding data in the context of making specific decisions, and at presenting data to management in a way that is useful and actionable? Second, should all decisions be automated on the basis of gathering "all data"?
In his book Risk Savvy: How to Make Good Decisions, Gerd Gigerenzer, managing director of the Max Planck Institute for Human Development in Berlin, explores risk measurement and decision-making issues in the context of the general public, but his thinking is certainly applicable in business too.
First, let's look at a couple of Gigerenzer's examples that show just how weak a grasp most people have of statistical data, and how easily we can be manipulated by its incorrect use or blatant misuse.
In the 12 months after 9/11, thousands of Americans abandoned air travel and took to long-distance driving, fearful of the possibility of another attack. Highway miles driven jumped by as much as 5%, and road deaths rose correspondingly, exceeding the average of the previous five years in every month of that year. An estimated 1,600 additional people lost their lives on the roads as a result, compared to the 256 airline passengers and crew members included among those who died on 9/11.
In the emotional reaction to the trauma of the day, the American public completely lost sight of the statistically valid risk measurement that flying is substantially safer than driving.
Measuring risk the wrong way
In 1995, research findings issued by the U.K. Committee on Safety of Medicines warned that third-generation oral contraceptive pills doubled the risk of thrombosis. The press was mobilized to spread the word. Individual doctors and pharmacists passed the warnings on to women with predictable results: unwanted pregnancies and terminations soared. An estimated 13,000 additional abortions took place in England and Wales the following year.
Despite the scientific and medical training of the experts involved, they widely overlooked, or ignored, that the absolute risk had increased from one to two in 7,000 -- ironically, still much less than the risk of thrombosis associated with pregnancy and abortion. So there were two ways of describing the very same data: a relative increase in risk of 100% or an absolute increase of one in 7,000. The former makes for great front-page headlines and bite-sized advice. The latter might have taken a little more time to convey but could have averted much grief.
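The arithmetic behind those two descriptions can be sketched in a few lines of Python. This is only an illustration using the figures quoted above (one in 7,000 rising to two in 7,000); the function name describe_risk_change and the dictionary keys are hypothetical, chosen for this sketch rather than taken from Gigerenzer's book.

```python
from fractions import Fraction


def describe_risk_change(baseline: Fraction, new: Fraction) -> dict:
    """Express one change in risk in both relative and absolute terms.

    Hypothetical helper for illustration only.
    """
    absolute_increase = new - baseline                 # difference in probability
    relative_increase = absolute_increase / baseline   # change as a share of baseline
    return {
        "relative_increase_pct": float(relative_increase * 100),
        "absolute_increase": absolute_increase,
        # Extra cases expected in a group of 7,000 women
        "extra_cases_per_7000": absolute_increase * 7000,
    }


# Figures from the article: risk rose from 1 in 7,000 to 2 in 7,000
result = describe_risk_change(Fraction(1, 7000), Fraction(2, 7000))

print(f"Relative increase: {result['relative_increase_pct']:.0f}%")
print(f"Absolute increase: {result['absolute_increase']}")
print(f"Extra cases per 7,000 women: {result['extra_cases_per_7000']}")
```

Both outputs come from the same pair of inputs. The relative figure divides by a very small baseline, which is why it sounds so much more alarming than the absolute one.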
The above stories, and more in Gigerenzer's book, make compelling reading for anyone interested in how people interpret numerical data and use it (or don't use it) as a basis for making decisions. The truth is that very few of us, even those with scientific training, have been well educated in this field. Thus, we lack the skills to distinguish between the various ways of expressing risk and uncertainty, as well as the training to understand the nuances of the results presented to us. We can easily fall prey to our biases or preconceived notions about how the world is, or ought to be.
And as we move from "little data," where basic arithmetical training suffices, to the statistically described world of big data, the threat of misinterpretation grows proportionately. Self-service business intelligence, however desirable, doesn't easily scale to self-service business analytics. Business users -- and probably many data scientists -- will need significant up-skilling in the understanding and presentation of statistical data.
Biggest decision driver: Unconscious thought
Beyond the skills problem, there is a more fundamental concern, seen particularly in the first example about air travel post-9/11, and echoed in my choice of the term Business unIntelligence in my recent book of the same name. In our Western-oriented business thinking, intelligence is almost exclusively equated with rationality and conscious thought, especially in decision making. This ignores the reality of the brain and its mental processes, where perhaps 90% of what goes on happens unconsciously. Decisions, especially those with substantial personal implications or that require rapid response, are seldom data-driven.
Daniel Kahneman, a psychologist and winner of the Nobel Prize in economic sciences, deals with that topic in his book, Thinking, Fast and Slow, but he falls into the rationalist trap of assuming that unconscious thought processes must be inferior to conscious ones. This leads to conclusions that we are deeply fallible and highly suggestible in decision making and must constantly be on conscious guard against our inner selves. Or, worse still, that Big Brother governments can and should "nudge" us toward decisions that are for our own good.
Self-awareness is, of course, important. However, to suggest that our small and recently developed frontal lobes can or should completely override the long-evolved but unconscious knowing of the majority of the brain is short-sighted in the extreme. This knowing -- in the form of gut feel, hunches, informed guesses and heuristics, where the vast majority of incoming data is ignored -- has a lot to offer in real-world decision making. We risk losing that as we focus exclusively on the collection and crunching of ever-increasing amounts of data.
In an uncertain world, one where some events cannot be foreseen, data-based probabilities can only take decision making so far. As the financial world discovered in 2008, an over-reliance on predictive risk models is disastrous when something outside the parameters of the model happens. Gigerenzer points out, "The problem is improper risk measurement: methods that wrongly assume known risks in a world of uncertainty. Because these calculations generate precise numbers for an uncertain risk, they produce an illusory certainty."
That is the danger inherent in becoming wholly data-driven or depending completely on analytics tools in decision-making processes. The value brought by human decision makers is the ability to see context and to understand the business environment. These insights emerge not as fully explicable arguments but as strong hunches or intuitions. They're based on information, of course: on an old memory or an emergent pattern that the mind perceives. But most of all, they're based on patterns of thought processing that computer science is far from understanding, never mind emulating. And that, as they say, is a very good thing.
About the author
Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. His current interest is in the wider field of a fully integrated business, covering informational, operational and collaborative environments. He is the founder and principal of 9sight Consulting; email him at email@example.com.