This is a two-part series on in-database analytics
- Understanding in-database analytics technology: Benefits, uses and ROI
- Is in-database analytics an emerging business intelligence (BI) trend?
In-database analytics is an emerging practice that experts say can significantly cut the cost and
time it takes to do complex and data-intensive analytic processes.
The first part of this two-part series on in-database analytics will cover how the practice works and
how it differs from more traditional analytical methodologies. We will also look at recent and
developing market trends, such as vendor support and the emergence of open standards, as well as
in-database analytics' potential for revolutionizing advanced analytics and business intelligence
(BI). The second part of the series will discuss the paybacks of in-database analytics and how to
realize them, as well as potential deployment challenges.
Breaking down in-database analytics
According to a November 2009 Forrester Research Inc. report, titled "In-Database Analytics: Heart
of the Predictive Enterprise," the practice is far from bleeding edge. In fact, in-database
analytics is the latest instance of a longstanding approach in which developers embed application
logic into data warehouse and database systems.
In a traditional setup, predictive analytics, data mining and other compute-intensive analytic
functions are part of separate applications or data marts, each typically with its own system,
set of data, analytic tools and programmers. As a
result, "a lot of people spend a lot of time shepherding data out of a database, profiling it,
transforming into a format a particular analytic tool can digest, and moving it to where analysts
can use it," says Neil Raden, president of Santa Barbara, Calif.-based consultancy Hired Brains
Inc.
In contrast, with in-database analytics, predictive analysis, data mining and other analytic
functions run inside the same centralized enterprise data warehouse (EDW) that holds the data.
This eliminates I/O-intensive extract, transform and load (ETL)
operations that can consume as much as 75% of cycle time in predictive analytics applications. It
also enables developers to exploit powerful data warehouse platform technologies, such as parallel
processing.
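The difference between the two approaches can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for a data warehouse; the transactions table, its columns and the function names are hypothetical, chosen only for illustration. The first function pulls every row out of the database and computes the average in application code, while the second asks the database engine to do the aggregation and returns only the results.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, made-up transactions table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (customer_id INTEGER, amount REAL)")
    rows = [(1, 20.0), (1, 40.0), (2, 10.0), (2, 30.0), (2, 50.0)]
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
    return conn


def average_spend_extracted(conn):
    """Traditional style: move every row out of the database, then analyze it."""
    totals = {}
    query = "SELECT customer_id, amount FROM transactions"
    for customer_id, amount in conn.execute(query):
        total, count = totals.get(customer_id, (0.0, 0))
        totals[customer_id] = (total + amount, count + 1)
    return {cid: total / count for cid, (total, count) in totals.items()}


def average_spend_in_database(conn):
    """In-database style: the engine aggregates; only the summary leaves it."""
    query = "SELECT customer_id, AVG(amount) FROM transactions GROUP BY customer_id"
    return dict(conn.execute(query).fetchall())


conn = build_demo_db()
extracted = average_spend_extracted(conn)
in_db = average_spend_in_database(conn)
```

Both functions produce the same answer. On a table this small the cost is the same either way, but on a large warehouse the second form avoids moving the raw rows at all, which is the saving the practice is aiming for.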
Empowering the enterprise data warehouse
In-database analytics is one of several recent developments that have made advanced analytics an
increasingly important, and affordable, element of corporate BI initiatives.
First, a precipitous drop in storage, computing and memory prices has helped fuel the emergence of
scaled-down, low-cost database engines, data warehousing platforms and appliances.
Second, many of these platforms support leading-edge computing technologies that enable
computing-intensive applications like advanced analytics to run more efficiently. For example,
64-bit memory addressing lets the large volumes of data needed for predictive analytics reside in main
memory instead of on disk, which avoids time-consuming I/O transfers. Parallel processing
enables multiple analytic processes to run in tandem. Virtualization enables companies to allocate
computing resources to analytic and database querying functions on a prioritized and as-needed
basis.
Another key factor is the recent emergence of two industry standards for advanced analytics. MapReduce,
a vendor-neutral programmability framework for complex information types, is gaining traction among
data warehouse and advanced analytics software vendors, according to Forrester.
The second standard, Hadoop, defines an open analytic processing pushdown workflow model and
distributed analytic object-file store. It has growing support from database, data warehouse and
cloud computing platform vendors, Forrester said.
Once these standards gain broad support from leading players, businesses will have far more
flexibility in choosing (and migrating between) data warehousing platforms.
At least as important, MapReduce and Hadoop can work with unstructured as well as structured data
residing in a database. This will be critical for the next generation of analytic applications,
which will be mining the complex patterns in diverse and distributed information generated by Web
2.0 applications, social networking, clickstream analysis and the like, said Forrester analyst
James Kobielus, lead author of the report.
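The map-and-reduce pattern that gives MapReduce its name can be sketched in a few lines of plain Python. This is a single-process illustration under stated assumptions, not Hadoop's API: the clickstream records, their "user,page" format and all of the function names are hypothetical. A map step emits key-value pairs from each raw record, a shuffle step groups the values by key, and a reduce step collapses each group into a result.

```python
from collections import defaultdict


def map_phase(record):
    """Map: turn one raw clickstream record into (page, 1) pairs."""
    _user, page = record.split(",")
    yield page, 1


def shuffle(pairs):
    """Shuffle: group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence over the input records."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical clickstream records in "user,page" form.
clicks = [
    "alice,/home",
    "bob,/pricing",
    "alice,/pricing",
    "carol,/home",
    "bob,/pricing",
]
page_views = run_job(clicks)
```

In a real deployment the map and reduce steps would run in parallel across many machines, with the framework handling the shuffle between them; the sketch only shows the shape of the computation.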
Who can benefit from in-database analytics?
In-database analytics is potentially useful to many types of organizations pursuing advanced
analytics. It's well-suited for activities such as targeted response marketing, dynamic pricing
analysis and fraud detection and prevention, according to analysts. It can also help executives
who need to know how best to allocate R&D money or funding for security upgrades, or who need to
create a business plan that reacts to projected market changes over the next five years.
Still, it isn't for everybody, Raden warned. He recommended that companies considering whether to
deploy in-database analytics, either as an in-house development platform or to support commercial
applications, ask themselves the following questions:
- Do you have any problems that lend themselves to predictive modeling?
- Is the necessary data readily available, consistent, accurate and supported?
- Above all, do you have the corporate culture and the will to make use of the results you obtain?
In other words: "Are you willing to let math algorithms in a computer lead you to do things you
didn't conceive of yourself, and go against what you're already doing?" Raden asked. "A lot of
companies aren't there yet."
Elisabeth Horwitt is a freelance writer.