There is an increasingly strong ethical dimension to technology design and use. Now that technical advancements have eased so many barriers for users, the question of how to accomplish something with technology is being replaced by this: “What do we do with all that power?” One area that’s particularly susceptible to ethical issues is analytics.
Let’s explore some examples. Imagine you are an insurance company analyst, and you read that sitting in front of a computer too long can lead to neck and back problems. You've also been part of initiating a very successful customer community on your company's website. After registering on the community site, customers can discuss and rate medical specialists or peruse a large collection of information on health and diet best practices.
In a moment of creativity, you decide to write a software program to track which customers spend the most time on the website, and you correlate that information with the claims data in the company’s data warehouse. Indeed, you find a correlation: Those who spend a lot of time on the site are clearly at higher risk for back and neck problems, which should be reflected in their insurance premiums.
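To make the analysis concrete, here is a minimal sketch of what that correlation step might look like. All names and figures are invented for illustration; a real insurer would join session logs against claims records in a data warehouse rather than use hard-coded lists.

```python
# Hypothetical sketch: correlating community-site usage with neck/back
# claims. The data below is invented purely for illustration.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hours per month each customer spends on the community site (made up)
site_hours = [2, 5, 11, 14, 20, 25]
# Neck/back-related claims filed by those same customers (made up)
claims = [0, 0, 1, 1, 2, 3]

r = pearson(site_hours, claims)
print(f"correlation: {r:.2f}")  # a strongly positive r flags "higher risk"
```

With these invented numbers the correlation comes out strongly positive, which is exactly the kind of result that would tempt an analyst to recommend premium adjustments. The ethical question, of course, is not whether the computation works.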
Is it ethical to mine for this correlation in the data? It's data, it can be analyzed, and doing so leads to more knowledge. What can be wrong with that? Yet I would guess there is a good chance you'd view this analysis as unethical.
Consider another example at the same insurance company. You post a survey on the community site as part of a preventive health program. You ask people questions about time spent on their computers and any neck and back problems they may have. As a thank-you for participating, once the data is analyzed, customers who indicate they have neck and back problems receive a free software application that reminds them to do stretching exercises after 45 minutes of online time.
Would this be ethical? My guess is that most people would intuitively feel there are no big issues with this initiative. How is this different from the previous example?
Lack of consent raises red flags
One difference is immediately obvious. In the first example there is no consent, but there is in the second example. In the first, customers visit and use the community website for various purposes, but they aren’t told that data about their activities will be used by the company for other purposes. In the second, people consent by willingly completing a questionnaire. It is the lack of consent to use data for a specific analytical purpose that may be an ethical issue.
Now consider a third example. Mining the data warehouse, you find that within two months of upgrading their dental insurance, a sizable percentage of customers claim dental expenses that weren’t covered before the upgrade. You recommend that the policy terms and conditions be changed so that reimbursement of newly covered expenses starts after four or six months.
In this case, there is no consent, but I have found that not many people would see ethical issues in what was done. What is different? In this example, customers know that dental work is needed and intentionally upgrade their insurance so the work will be covered. The insurer is using analytics to protect its legitimate interests. Alongside consent, intent plays an important role in considerations of analytics ethics.
What makes the first example problematic is not only the lack of consent but also the insurance company’s intent. Customers may face higher premiums because the tracking of their activities on the community website marked them as a higher risk. It’s a trap set by the company and clearly not ethical. In the second case, not only is there consent, but the insurer’s intent is different: to help customers avoid health problems.
Using data to plan speed traps: Not so fast
Here’s a real-world example. In April 2011, European newspapers reported that police in the Netherlands were using data collected from TomTom Global Positioning System navigation devices to plan speed traps. There was a public outcry, and TomTom quickly responded by changing its contract language to prevent the data it sells to governments from being used for such purposes. But how did this happen?
One of TomTom’s innovations was to make its navigation devices bidirectional. It collects driver data in real time and uses the information to notify subscribers of traffic jams. To maximize the profitability of this service and to price it competitively, TomTom states in the terms and conditions for its service that it is also allowed to sell the collected data in an aggregated and anonymous form.
The authorities in charge of highways and roads in various locales have found good uses for the TomTom data. It helps them see where road improvements are needed to eliminate recurring traffic jams or minimize the ones caused by ongoing roadwork. So far, no problem. But then the data landed in the hands of Dutch police departments that used it to calculate average driving speeds and plan the placement of cameras to catch speeders.
Was the use of the data appropriate or not? It can be argued both ways.
The data was legally bought and contained no identifiable information, so citizens did not have to give consent for the police to use it. Furthermore, the data wasn’t used to find and punish speeders after the fact – it was used to catch people at the actual moment of speeding. In fact, this type of data-based decision making is an example of more efficient use of taxpayers’ money. It replaces a more elaborate process of physically searching for places where drivers tend to speed. Why can’t the police use technology to improve the effectiveness and efficiency of speed traps if citizens can use technologies such as Twitter to avoid such traps?
Yet TomTom’s immediate reaction was to stop the practices. Faced with negative feedback from customers, the company decided that the police use of its data was bad for business. Customers pay extra for premium services such as dynamic traffic-jam monitoring, and they enable those services by supplying the required data. They are supposed to benefit from that, not be punished as a result of it.
Divergent views muddy ethical waters
In ethics, there are two main schools of thought. There are the consequentialists, who feel actions can be judged based on their outcome. If the outcome is good, the action was good. If the outcome is bad, the action was bad. The universalists, by contrast, feel that there should be some rules up front. There are simply things that you should or shouldn’t do because you believe they are right or wrong.
Both approaches have their limitations for assessing the ethics of analytics. It will be hard for consequentialists to maintain that you can freely explore everything and just ignore certain new insights. You can’t undo knowledge, and you can be held responsible for not using information as much as for using it.
In fact, there is a new rule emerging from the examples I detailed: The more a certain use of data is removed from the original goal and the original measurement instrument, the bigger the chance that issues will arise. The insurer’s community website can be used for research but wasn’t meant for tracking length of use. TomTom data can be used for analyzing the effects of roadwork, but people may object to using it to plan speed traps. In both cases, the problematic uses were one step too far removed from the original purpose.
Yet, the universalists shouldn’t cheer so soon. Even with my new rule, their position is hard to maintain as well. Analytics today are interactive and iterative. Analyzing data is not just about answering questions; it’s explorative in nature. When you begin, you don’t necessarily know where your exercise will end. Also, modern data mining tools crawl through data automatically and answer questions that weren’t even asked!
There are no easy answers to ethical dilemmas. With so much development going on in analytics technology, and analytics having so much impact on business models and strategies, we need to have a debate – in businesses and in public – on what is the right thing to do. I hope this debate comes before any damage is done. But I am afraid a more likely prediction will come true first: Some large enterprises will suffer major public-relations damage by making mistakes in their analytics programs and upsetting the general public. It is not inconceivable that regulators will step in and restrict the use of data for analytics. It is not impossible that some businesses will even have to fold after failing to recover from legal actions filed against them.
I’ll end with a consequentialist view. Something good may come out of this analytics conundrum: new analytics best practices that protect not only companies but also their customers.
About the author:
Frank Buytendijk's professional background in strategy, performance management and organizational behavior gives him a strong perspective across many domains in business and IT. He is an entertaining speaker at conferences all over the world, and was recently called an “intellectual provocateur” and described as “having an unusual warm tone of voice.” His work is frequently labeled as provocative, deep, truly original, and out of the box. More down to earth, his daughter once described it as “My daddy sits in airplanes, stands on stages, and tells jokes.” Frank is a former Gartner Research VP, and a seasoned IT executive. Frank is also a visiting fellow at Cranfield University School of Management, and author of various books, including Performance Leadership (McGraw-Hill, September 2008), and Dealing with Dilemmas (Wiley & Sons, August 2010). Frank's newest book, Socrates Reloaded, is now available.