The recent recession has sparked a growing interest in making the most of “big data,” especially for businesses experiencing new forms of customer churn or those looking to explore more detailed risk assessment and fraud detection. New research by The Data Warehousing Institute (TDWI) reveals that, to do so, more and more organizations are seeking out new tools and new platforms to perform what it calls “big-data analytics,” and vendors are...
“We’ve seen a real growth spurt over the last three years,” said Philip Russom, the research director for data management at TDWI, adding that big data originally was a storage issue. When CPU and storage technology became more accessible, some businesses realized they now had the ability to perform big-data analytics -- a phrase Russom defines as accessing large data sets by employing a combination of advanced analytics tools like predictive analytics, data mining, statistics, artificial intelligence and natural language processing.
TDWI hasn’t been the only research firm to take notice of big data’s impact on the industry. Earlier this summer, big data, coupled with extreme information processing and management, made its first appearance on Gartner Inc.’s annual hype cycle of emerging technologies. The cycle, which tracks how surfacing technologies progress from hype into high-growth adoption, situates big data on the initial rising slope, heading toward “the peak of inflated expectations,” an indication that it still has a long journey in front of it.
Overall, the TDWI survey results provide a snapshot of big-data analytics, revealing that even the term itself isn’t yet used consistently among the survey’s 325 participants, a mix of corporate IT professionals, business users and consultants from small, midsized and large organizations.
While only 18% of respondents actually call this kind of work “big-data analytics,” 34% of organizations reported they are tapping into large data sets using a collection of advanced analytics tools.
A majority of the survey’s participants reported interest in applying big-data analytics to customer relations -- from identifying new sales and marketing opportunities to understanding customer behavior. Forty-five percent of respondents also hope diving into large data sets will provide better business insight.
Results showed that about 75% of businesses have deployed advanced analytics, though 40% indicated they are not yet using those tools to tap into big data. The survey also held a few surprises.
Advanced data visualization, for example, topped the list of the hottest growing technologies. The tools enable users to produce sophisticated charts, maps and graphs -- layered with information or capturing more ephemeral data, like relationships -- and are capable of including a large number of data points.
“I expected to see a lot of the more nerdy stuff,” Russom said. “Advanced data visualization is not really an analytics tool and it’s not necessarily for big data either.”
Another surprise for Russom: the lack of real-time data collection and analytics.
“It’s been a journey for all kinds of organizations to figure out how to store data, how to do advanced analytics and how to put the two together,” Russom said. “Beyond that is real time, and there weren’t nearly as many users doing real-time data collection or analysis as I thought.”
In fact, one survey respondent wrote: “We survived the scalability crisis by buying cheap infrastructure, such that big data is now a good thing instead of a problem. Now, if we could just solve the real-time data processing crisis, we’d be all set!”
Crisis or no, Russom said real-time collection and analytics tools are maturing. He ties real time to the “velocity” in the common three-V definition of big data: volume, velocity and variety.
“In the past, real-time technology involved lots of pieces and tools from different vendors that didn’t always play well together,” he said. “Now vendors are better at interoperability, even in real time.”
Get into the big data game
For smaller businesses yet to step into the big data game, Russom recommends finding an affordable tool. If business intelligence (BI) skills are scarce, seek out tools that are easy to use -- such as products from Tableau and QlikTech.
“Both have lots of capabilities for advanced data visualization,” he said, adding that either could provide a BI environment even without a data warehouse.
While both can work with large data sets, he warned that Tableau requires a 64-bit server, which may mean an additional investment. Plus, without a data warehouse, the tools won’t store historical data.
For businesses with enterprise data warehouses, Russom said a fundamental question for big data analytics will come down to architecture. A majority of businesses reported their enterprise data warehouse is the preferred analytics platform today. Russom, though, noted that not all data warehouse designs can handle advanced analytics.
“This is not a failure of vendor products,” he said. “Users make decisions and build warehouses that are well-suited to perform the jobs they have to deliver.”
Businesses will need to decide “whether to manage analytic big data in a shared, centralized EDW [enterprise data warehouse] or in a separate-but-related database,” the report reads.
Hosting a separate but nearby database for data storage, modeling and queries is not new, Russom said. Additionally, on-the-side data platforms, such as Hadoop, can help with workloads businesses may not want to run on their enterprise data warehouse. While Russom hasn’t seen a lot of Hadoop adoption, he has observed businesses experimenting with the open source software.
Another one-third of survey respondents anticipated changing their analytics platform within three years, with some considering a brand switch. For those businesses, Russom said new vendors such as Netezza, XtremeData and EMC Greenplum may suit their appliance needs, while Kognitio, ParAccel, Vertica and 1010data also provide options specifically for big-data analytics.