The need to prove the business benefits of big data applications and platforms has taken center stage in a growing number of mainstream organizations, and it isn't always an easy task for IT and analytics managers.
For example, a big data deployment wasn't a slam-dunk decision for Blue Cross Blue Shield of Michigan.
"For a lot of organizations like ours, big data has not yet become a core foundation of running the business," said Beata Puncevic, director of analytics, data engineering and data management at the medical insurer. "When you go in and talk to a lot of [executives] about investing in a big data platform, it completely does not resonate with the challenges of the day."
At Blue Cross and other healthcare businesses, those challenges include low profit margins that leave little money for technology innovation, resource and skill-set constraints, and a relatively conservative culture, according to Puncevic. As a result, she and her colleagues had to put in some extra effort to get approval and funding for a Hadoop data lake that went into use in May.
Puncevic set up a team to develop an ROI framework for the data lake project, with metrics on the projected big data benefits based on before-and-after calculations. In building the business case, she also focused on three IT-related improvements: reducing data processing and management costs, enabling more insightful analytics, and creating a more agile and adaptable technology architecture.
In addition, Puncevic said she worked to obtain corporate-level funding for the initial rollout and subsequent project phases, "so we don't have to worry about getting funding from individual business units" for different aspects of the big data initiative.
The strategy worked, and the Detroit-based insurer is on a path to fully constructing the big data platform over the next three to five years. The benefits of big data are "potentially tremendous" for the healthcare industry as a whole, Puncevic said at Hadoop Summit 2016 in San Jose, Calif., last week. Besides lower IT expenses, she cited an opportunity to reduce healthcare costs, while also improving the quality of patient care and boosting preventive medicine efforts -- all through better analytics.
On the road to big data benefits
The value of big data is definitely real for Progressive Casualty Insurance Co. and its auto policy customers, said Brian Durkin, an innovation strategist in the company's enterprise architecture group. Progressive uses a Hadoop cluster partly to power its Snapshot program, which awards discounts to safe drivers based on operational data collected from their vehicles. The insurer has handed out more than $560 million worth of discounts since launching the program in 2008, Durkin said in another conference session.
"It's not some little science experiment that we're running," he said. "We're fully invested in it, and it means a lot to our customers."
To track participating drivers and calculate discounts, huge volumes of data get processed and analyzed in the cluster, which, like the one at Blue Cross, is based on the Hortonworks Hadoop distribution. Progressive has collected data on 2.4 billion trips, and it retains all of the information. For analyzing driving patterns to identify bad habits drivers can be alerted to, "it's the older data that's more valuable," Durkin said. "So, we have to keep everything and analyze everything."
Crunching the data requires a lot of processing resources, and Progressive has deployed various advanced analytics tools for its data scientists to use, including SAS, the R programming language and H2O. But business executives have been willing to foot the bill, said Pawan Divakarla, data and analytics business leader at the Mayfield Village, Ohio, insurer.
"It's a very data-driven company," he said. "We want people to have intuition and ideas, but they need to prove them out with data."
Hadoop's higher-value proposition
Retailer Macy's Inc. runs a mix of BI and analytics applications on a Hortonworks-based Hadoop system to support marketing, merchandising, product management and other business operations. On a daily basis, thousands of business users access hundreds of BI dashboards fed by the cluster -- making it a key component in decision-making, said Seetha Chakrapany, director of marketing analytics and customer relationship management systems at Cincinnati-based Macy's.
"You don't want to just see Hadoop as a cheap storage solution," Chakrapany said. "Its value is much higher than that."
Hadoop is still maturing and has "a lot of rough edges," he cautioned, saying new users should expect some instability and missing IT management functionality. "If you come in with the typical IT mindset that this has to be rock-solid, it's not going to be the right [technology]." Nonetheless, he said he thinks Hadoop "could truly be an enterprise data analytics platform" for Macy's.
But Chakrapany isn't taking the benefits of big data analytics and Hadoop-based BI applications for granted. Last year, he set up a team of evangelists to sell the merits of the big data environment internally and lobby more business units to use it. His group also tracks the business benefits generated by the Hadoop platform, in both qualitative and quantitative ways.
"We don't want to just be counting the number of users, the number of queries, how much data [is analyzed] -- those are just numbers," Chakrapany said. "The key piece is, what has this done for the business?"