CAMBRIDGE, Mass. -- When looking at machine learning for data science, the important question to ask of the data is the same one 2-year-olds persistently ask their parents: Why?
Although it is a simple question, it is not asked or answered often enough as industry goes full tilt toward machine learning and artificial intelligence (AI), according to Milind Kamkolkar, chief data officer at French pharmaceutical company Sanofi, speaking at last week's MIT Chief Data Officer and Information Quality Symposium.
"There's a lot of stuff missing in data science today," Kamkolkar said, suggesting one of the main things missing in machine learning for data science is communication. Teams must be able to convey what predictive results mean and why they matter, he said.
Now, more than ever, data analytics groups must get closer to the users of their products, Kamkolkar told attendees at the event's session on machine learning and advanced analytics.
"If you haven't been able to effectively communicate what the meaning of the insight is, it's almost like having run a 100-meter sprint, and then falling down at the last 9 meters," Kamkolkar said in an interview at the symposium. One way to improve communications is to include capable storytellers in teams, he said, urging CDOs to "hire a journalist" into their data science teams.
Every model tells a story
Machine learning needs such data journalists, or storytellers, on the data science team, agreed Nikhil Aggarwal, an experienced manager in financial fraud analytics in banking who is currently entrepreneur in residence for the iValley Innovation Center in San Ramon, Calif.
"We need storytellers that can simplify the takeaways, so we can put them into action," Aggarwal told attendees at the symposium. They must also be able to recommend actions that decision-makers can take based on predictive analytics, he added.
Chief data officers should keep in mind that success is less about quantitative models and sophisticated techniques, and more about analytics that help to solve a problem, he said in an interview.
"What I find happening all too often is there is too much focus on the new innovative techniques available -- be it supervised learning, unsupervised learning and so forth -- without understanding what exactly the business challenge is," Aggarwal said.
Data scientists must remember that an overly elegant, complicated model is not needed. Instead, "something that can be embedded to address a specific business challenge" is required, he said.
To quantify and communicate
CDOs' interest in machine learning is evident in a recent survey undertaken by consultancy NewVantage Partners. The research found that 88.5% of data management leaders in organizations said artificial intelligence and machine learning have disruptive capabilities that could affect their firms.
Interest in AI-driven forecasting is also influencing software purchasing plans. Gartner has estimated that predictive and prescriptive analytics will attract 40% of enterprises' new BI and analytics tool investments by 2020.
Still, the application of machine learning for data science in organizations needs tuning if there is to be true payoff. And while communications and storytelling are important, they must be combined with sound analytics, emphasized Sid Dalal, chief data scientist and senior vice president at New York-based AIG Inc., who discussed challenges in decision-making based on advanced analytics at the symposium.
"People need to have the ability to communicate, yes, but they actually need to both quantify and communicate," he said in an interview.
Corporations will make decisions that are not simply based on the data scientist's word, so they must present justification that helps business leaders make decisions, Dalal said.
"If you don't have any way to quantify, then it is very hard," he said. "Data scientists need to combine both of those things. Or it is like just saying, 'Believe me.'" Work with machine learning will evolve, he cautioned. "It is research in the making -- it's not mature. There is much more to do."