The lure of “big data” is partly driven by the promise of prediction it brings to the data analytics process. Coupling the wealth of information in an organization’s systems -- from data sources old and new, and structured and unstructured -- with predictive modeling tools increases the potential to improve business strategies and gain competitive advantages by, for example, segmenting customers for targeted marketing based on forecasts of their purchasing behavior.
Thirty years ago, when predictive analytics software first arrived on the scene, it may have been out of reach for many businesses. Today, access to low-cost storage, high-performance computing power and better analytics tools is changing that. But as a growing number of companies combine those components and expand the kinds of data they’re tapping into, they should first put in place effective procedures for doing predictive analytics on large, varied and complex data sets, according to practitioners and consultants who specialize in analytics and business intelligence (BI).
Toward that end, they recommend that businesses evaluate their internal skills, analytics strategy and collaborative processes before diving into analyzing more data, more quickly, for predictive purposes. What follows is a detailed look at those three key focus areas.
New skills may be required. When Tagged Inc., which operates the social networking site Tagged.com, looked to ramp up its staffing earlier this year due to business growth, a vital part of the expansion plan was finding individuals with a machine-learning background to work in a new analytics group, programming systems to uncover predictive findings in data.
“It took us a while to find the right people, and it took some targeted searches,” said Johann Schleier-Smith, chief technology officer at the San Francisco-based company, which uses predictive analytics to connect people who don’t know each other, rather than people who already do, as Facebook does.
Tagged gets roughly 5 billion page views and generates 10 TB of data each month. When interviewing candidates for the machine-learning group, the company emphasized its ability to provide potential employees with complex data analysis problems and the technology infrastructure to solve them. “We made it clear the commitment was there,” Schleier-Smith said.
Kathleen Kane, principal decision scientist in marketing analytics at Fidelity Investments Inc. in Boston, suggested a different route: hiring someone with consulting experience who can build a predictive analytics team for an organization. A consultant with statistical analysis or predictive modeling experience should have “a lot of good ideas they can bring over,” she said.
Finding talent is a major challenge for businesses edging into advanced analytics. And according to a May 2011 McKinsey & Co. report on big data, the talent shortage is only likely to get worse. “By 2018,” the report said, “the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.”
Rita Sallam, a research director at Gartner Inc., said businesses can try to avoid the talent gap by building up more data scientist skills in-house or banking on external options. Organizations choosing the latter course “are going to have to rely on systems integration partners who can sell those skills as a service when companies want to build predictive analytics applications,” she added.
Build a big data strategy for predictive analytics -- and think about the ramifications. “Aim for the right target,” said John Elder, CEO of consulting firm Elder Research Inc. in Charlottesville, Va. “What is it you’re trying to predict?”
For example, that could mean determining how fine-grained a customer segmentation analysis needs to be, said James Kobielus, a senior analyst at Forrester Research Inc. “When you have the entire population of data, you can catch micro-niches,” he said, adding that massively parallel processing systems can help make such detailed analysis possible on large data sets.
John Lucker, who leads the advanced analytics and modeling practice at Deloitte Consulting LLP, said that honing the aim of predictive analytics tools may mean first paring down the data targeted for analysis into more manageable sets, to create what he calls “prototype data” for testing out the analytics process. “Once you get to the point where things are working well, set things loose on the full range of data,” he advised.
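The “prototype data” idea Lucker describes can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names, the 5% sampling fraction, the fixed seed and the made-up customer records are hypothetical and do not come from Deloitte or any company quoted here. The point is simply that the analytics step is tried on a small, reproducible sample first and then run unchanged on the full data set.

```python
import random


def make_prototype(records, fraction=0.05, seed=42):
    """Return a reproducible random subset of `records` for trial runs.

    `fraction` and `seed` are illustrative defaults, not recommendations.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so every trial run sees the same sample
    k = max(1, round(len(records) * fraction))
    return rng.sample(records, k)


def average_spend(records):
    """Stand-in for the real analytics step: mean spend per customer."""
    return sum(r["spend"] for r in records) / len(records)


# Hypothetical customer records; real data would come from a company's own systems.
customers = [{"id": i, "spend": float(i % 50)} for i in range(10_000)]

prototype = make_prototype(customers)
trial_result = average_spend(prototype)  # work out the kinks on the small set
full_result = average_spend(customers)   # then run the same step on everything
```

Fixing the seed is a deliberate choice in this sketch: when a trial run fails, the same sample can be drawn again while the process is being debugged.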
In addition, Lucker said businesses should review their data management and stewardship approaches. A tough but necessary task, especially in a big data environment, is determining what data should be retained for analysis and what can be discarded. “What’s your likely ROI from having that information?” he asked, adding that the answer might not be clear until a company starts working with the data.
Lucker also suggested examining privacy, regulatory and legal constraints that might require a revamping of data policies to include how information should be used for analytical purposes -- an issue that’s becoming increasingly important as larger amounts of data and new types of information are captured and stored for data mining and predictive analytics uses.
Collaborate and communicate. Forming a dedicated machine-learning group, as Tagged did, might not be the right strategy for every business. However, building an effective organizational structure for managing the predictive analytics process is a vital step, even more so in big data environments. And a collaborative approach is best, according to analysts.
“It takes a team,” Elder said. “The best success is when you have people who understand the business problem and the data reflecting the business problem working closely with the analytics experts.”
Sallam agreed and said the analytics team must also include representatives from IT. “In any BI initiative, there has to be a tight partnership between the person who has the domain expertise, the person who has the analytic expertise and IT,” she said. “IT has to be an enabler for a lot of these tools.”
A common goal with predictive analytics is to embed the models directly into business processes so that predictive insights can be translated into business actions. Ideally, doing so involves close collaboration between groups that might not always communicate well with one another.
For example, Kane said that an employee on Fidelity’s analytics team was asked to develop a lifetime customer value model, a complicated process requiring several months of work. When the model was finished, the business managers who had requested it weren’t satisfied at first -- until each detail of the model was explained in business rather than technical terms.
To help bridge such communication gaps, Kane said her team tries to calculate the expected business value for every predictive model it builds. “We can’t always figure that out, what the business value will be, until we actually use [a model],” she said. “But we try to predict, and that helps sell it.”
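Kane did not say how her team arrives at its estimates, so the following Python sketch is not Fidelity’s method. It shows one common back-of-the-envelope approach, assuming a model’s value is the extra responses it produces over a baseline campaign, multiplied by the value of each response, minus the cost of building the model. The function name, parameters and figures are all hypothetical.

```python
def expected_model_value(customers_targeted, baseline_rate, model_rate,
                         value_per_response, model_cost):
    """Estimate the net business value of a predictive model.

    Assumes value comes only from the lift in response rate over a
    baseline campaign; every input here is an illustrative guess.
    """
    extra_responses = customers_targeted * (model_rate - baseline_rate)
    return extra_responses * value_per_response - model_cost


# Hypothetical inputs: 100,000 customers targeted, response rate rising
# from 2% to 3%, $150 of value per response, $80,000 to build the model.
estimate = expected_model_value(
    customers_targeted=100_000,
    baseline_rate=0.02,
    model_rate=0.03,
    value_per_response=150.0,
    model_cost=80_000.0,
)
print(f"Estimated net value: ${estimate:,.0f}")
```

With these made-up inputs the estimate comes to about $70,000. As Kane noted, a figure like this is a prediction, and the real value may not be known until the model is in use.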
As predictive analytics tools become easier to use, Sallam thinks business analysts will begin to take on more of the task of building predictive models, for big data analytics applications as well as more straightforward ones.
“The business analyst will build [higher-level] skills more and more into their job title, like the data scientist,” she said. “They’re going to have to become more sophisticated in terms of their analytic capability than they are today, and we might even see some merging of the two [positions].”