The term big data is typically defined as data multiplying in volume, velocity and variety. According to new research by the Milford, Mass.-based Enterprise Strategy Group, big data analytics platforms are mimicking that definition: Vendor product releases are growing in number, product enhancements are multiplying rapidly and a variety of deployment options are now available.
Julie Lockner, a senior analyst at ESG and author of the firm’s Big Data Analytics Platforms report, said businesses are asking how they can integrate big data technology into their architecture -- especially as it becomes more affordable and scalable -- but they don’t always understand what their options are.
Part of the haze may stem from the fluidity of big data technology and terminology, which can create a tangle of market confusion. Lockner, who refers to her research as a “101 market landscape report,” believes the tangle can be undone with internal evaluations and education.
Doing so may mean starting at the very beginning -- with definitions.
Big data, lots of options
According to ESG’s report: “Big data analytics projects are popping up on lists of priorities in advance of a clear understanding of what big data even means.”
The term can expand or shrink based on the person doing the defining. In fact, the definition has become so loose that ESG revamped its own interpretation to mean “data sets that exceed the boundaries and sizes of normal processing capabilities, forcing you to take a nontraditional approach.”
The problem, Lockner said, is that as data stretches into the terabytes, it will begin to introduce “stress fractures on current systems,” and general-purpose technologies won’t remain a cost-effective approach for big data and big data analytics. That’s when businesses should consider augmenting their data centers.
“Before, the types of organizations that invested in analytics were Fortune 100 companies with the money to invest in this kind of strategic component,” Lockner said. “Now there are more affordable options. No matter what the budget, skill set or problem.”
These days, businesses are using an array of big data deployment options: a custom-developed approach, massively parallel processing databases, cloud computing services or some combination of available tools.
Adding to the discussion is the still-growing interest in the open source Apache Hadoop project, which enables distributed processing of large data sets.
“I can’t remember another technology that’s had this much of an impact since HTML,” Lockner said.
Vendors like IBM and EMC are figuring out how to integrate Hadoop into their product offerings. On Jan. 9, for example, Oracle made its Big Data Appliance generally available; the product includes a partnership with Hadoop distributor Cloudera. On the other hand, vendors releasing big data products without a mention of Hadoop are receiving criticism.
Although Lockner sees a lot of promise in Hadoop and believes it will become ubiquitous in most businesses’ data centers down the road, her research notes that it’s still an emerging technology and should be applied to specific use cases.
The big data beginning
For businesses seeking to invest in big data analytics platforms, reviewing a vendor’s definition of big data and how it relates to that vendor’s products is a good place to start.
“When you talk to vendors, figure out what they’re pitching,” Lockner said.
EMC, for example, has multiple big data offerings like Greenplum Database Software, Greenplum Data Computing Appliance and Isilon; all three address different kinds of problems, she said.
“You really have to peel back the layers of the onion and do the homework,” Lockner said.
To get started, Lockner recommends customers rely on vendors they have good relationships with and ask to see a presentation on their big data analytics platforms.
“That’s free information,” she said. “As people in this business try to figure out what they want to do, they should put pressure on the vendors.”
She also recommends that customers ask vendors about use cases specific to the customers’ own industry. This kind of information can help shed light on which vendors are the thought leaders and which vendors aren’t, Lockner said.
Businesses should rely on their internal IT departments and their more technologically savvy employees to help do some of that homework.
“There’s usually some kind of skunkworks project to look at new technology,” Lockner said, “and if businesses can find those gurus and brainstorm with them on how to do this, it’s a great place to start.”
But to really peel back the layers, a business should determine what it needs and how a vendor’s offering will address those needs. That means taking stock of the internal skills available, where the data is coming from, how quickly the analytics need to be churned out and what will need to be integrated with the new platform, according to the report.
“It’s more important to understand the business need and requirement than it is to have this great technology,” Lockner said.