Have you ever attended a rock concert and wondered why the band had so many guitar players on stage? Is it really necessary to have three people playing electric guitars and a fourth with an acoustic? Perhaps. But it also might be that the band was simply trying to blend the playing of many musicians together so that the collective could cover up and compensate for the mistakes of individuals.
Businesses are starting to take a similar approach to predictive modeling. Known as ensemble modeling, the process involves developing several analytical models that look at different factors or weight common ones in different ways. The results of the various models are then fed into a new model, which aggregates the findings to generate a single analysis score. The idea is that no one model is likely to be perfect but that incorporating different models into analytics applications compensates for any imperfections in the individual ones.
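To make the aggregation step concrete, here is a minimal sketch in Python. It is not taken from any project described in this article: the three component models, the customer field names and the scoring rules are all hypothetical, and a plain weighted average stands in for the aggregating model, which in practice is often a trained model in its own right.

```python
from statistics import mean

# Hypothetical customer record; the field names are illustrative assumptions.
customer = {"complaints": 3, "late_payments": 1, "dropped_call_rate": 0.08}

def complaints_model(c):
    """Score churn risk (0 to 1) from service complaints alone."""
    return min(c["complaints"] / 5, 1.0)

def payment_model(c):
    """Score churn risk (0 to 1) from late payments alone."""
    return min(c["late_payments"] / 4, 1.0)

def call_quality_model(c):
    """Score churn risk (0 to 1) from the dropped-call rate alone."""
    return min(c["dropped_call_rate"] / 0.10, 1.0)

BASE_MODELS = [complaints_model, payment_model, call_quality_model]

def ensemble_score(c, weights=None):
    """Combine the component scores into a single score.

    With no weights this is a simple average; with weights it is a
    weighted average, the simplest possible aggregating model.
    """
    scores = [model(c) for model in BASE_MODELS]
    if weights is None:
        return mean(scores)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

score = ensemble_score(customer)
print(f"ensemble churn score: {score:.2f}")
```

Each component looks at one factor and can be badly wrong on its own; the combined score is less sensitive to any single model's blind spot.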
"You should always consider ensembles if they're an option for you," John Ainsworth, a senior data scientist at advanced analytics consulting firm Elder Research Inc., said in a presentation at the 2014 Predictive Analytics World conference in Boston.
Ainsworth described a project he worked on for wireless phone service company nTelos Inc., a regional carrier that operates primarily in Virginia and West Virginia. The goal was to predict which customers would leave the Waynesboro, Va.-based company for other service providers. He knew that factors like customer service complaints, payment history, call quality and even general demographics could help predict customer churn. But he didn't know which of those factors would be the most predictive.
The solution was to build an ensemble model -- basically, a collection of separate decision tree models that factored in the different data elements. Combined with an initiative to intervene and better serve customers who were deemed likely to leave, the modeling effort helped nTelos achieve an overall 6% reduction in churn, Ainsworth said.
An idea whose time has come?
The idea behind ensemble modeling isn't entirely new: Examples have been described going back to at least 1995. But early on, the approach was mainly an academic pursuit. Now more businesses have the computational power and statistical modeling software needed to run ensemble models at a scale and speed that make the approach viable. Most advanced analytics software available today supports some form of ensemble modeling, conference attendees said.
"This used to be so complex to do, but over the last decade you see a lot more examples," said Viswanath Srikanth, a program manager at Cisco, who presented research at the conference on predicting the effect of promotions on attendance at Major League Baseball games.
Srikanth said using ensemble models allowed him to accurately model the effect of promotions such as fireworks displays and bobblehead and T-shirt giveaways on attendance at all major league parks for the entire season. Taking all those factors into consideration, and accurately predicting their effects, would have been impossible in a single model, he said.
"These decisions are so complex and no one model gets all the factors right the first time," said Madhav Chinta, director of data science product development at virtualization and mobile software company Citrix Systems Inc.
Chinta uses a cloud-based machine learning platform from analytics vendor Wise.io to run ensemble models that predict which customers will leave Citrix for competitors. He employs a common type of ensemble model called a random forest model, which collects the outputs of several decision tree models and produces a consensus set of findings. Since there are so many factors that go into customers' decisions to take their business elsewhere and modeling them is so complex, the ensemble approach is ideal, Chinta said.
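The consensus idea behind a random forest can be sketched in a few lines of plain Python. This is an illustration only, not the Wise.io platform or Chinta's model: the data is made up, the two features (complaints and late payments) are assumptions, and each "tree" is a one-split stump trained on a random resample of the data using one randomly chosen feature. Production random forests grow much deeper trees and consider a random subset of features at every split.

```python
import random
from collections import Counter

def train_stump(rows, labels, feature):
    """Fit a one-split decision tree (a stump) on a single feature."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        if not left or not right:
            continue  # not a real split
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:
        # No usable split in this sample: always predict the majority label.
        majority = Counter(labels).most_common(1)[0][0]
        return (feature, float("inf"), majority, majority)
    _, threshold, left_label, right_label = best
    return (feature, threshold, left_label, right_label)

def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Majority vote across all trees in the forest."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# Made-up training data: (complaints, late_payments) -> 1 = churned, 0 = stayed.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (4, 3), (5, 4), (4, 4), (6, 3)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = train_forest(rows, labels)
print(predict_forest(forest, (5, 4)), predict_forest(forest, (0, 0)))
```

Because every tree sees a slightly different slice of the data, an odd split in one tree is usually outvoted by the rest.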
Don't make it too complicated
One of the main problems with ensemble models is that what they deliver in increased accuracy may get lost in translation. Ainsworth said the process is complex and can be difficult to explain to less analytics-savvy business executives, who are still the ones who ultimately have to make decisions based on the findings. As a result, he tries to limit the number of different models in an ensemble to four. He said that lets him trace a recommendation back through the individual models and show anyone who's interested how the ensemble arrived at it.
Also, modeling projects need to start with a specific business problem in mind. Otherwise, they can devolve into academic data exploration exercises, Chinta said. All of his analytics initiatives involve the business side from the start, and he builds a visual dashboard in Tableau for business users to explore the results.
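Keeping the ensemble small is what makes that kind of tracing practical. The sketch below shows one way it could look in Python; it is not Ainsworth's method. The four component models, their names, the customer fields and the 0.5 flagging threshold are all hypothetical.

```python
# Four hypothetical component models, keyed by a human-readable name.
MODELS = {
    "complaints": lambda c: min(c["complaints"] / 5, 1.0),
    "payment history": lambda c: min(c["late_payments"] / 4, 1.0),
    "call quality": lambda c: min(c["dropped_call_rate"] / 0.10, 1.0),
    "tenure": lambda c: 1.0 if c["months_as_customer"] < 12 else 0.2,
}

def trace(customer, threshold=0.5):
    """Return the ensemble score along with each model's own score."""
    scores = {name: model(customer) for name, model in MODELS.items()}
    ensemble = sum(scores.values()) / len(scores)
    # Rank the component models from most to least alarmed.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return {"ensemble": ensemble, "flagged": ensemble >= threshold, "ranked": ranked}

report = trace({
    "complaints": 4,
    "late_payments": 0,
    "dropped_call_rate": 0.02,
    "months_as_customer": 30,
})
for name, model_score in report["ranked"]:
    print(f"{name:16s} {model_score:.2f}")
print(f"{'ensemble':16s} {report['ensemble']:.2f} flagged={report['flagged']}")
```

With only four rows in the breakdown, an executive can see at a glance which model is driving a recommendation.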
"From a business perspective," he said, "if you only get the score, the question becomes, so what?"