Complexity is considered an enemy of most IT projects. The goal is typically to simplify and tame sprawling hardware and application implementations. But when it comes to big data, this general rule may not apply.
No doubt about it, big data complexity is growing. Today there are separate tools for ingesting, storing, transforming, moving, analyzing and visualizing data. Organizations may even have different systems depending on whether they are working with streaming data or historical data. This has created a tangled web of data storage and analytics systems.
But in a webinar hosted by the International Institute for Analytics, Mike Lampa, partner at Denver-based analytics consulting firm Archipelago Information Strategies LLC, said there is a good reason to have all those systems.
"The analytics space has gotten amorphous. It's been all over the place," Lampa said.
The reason is that traditional data architecture can't keep pace with demands. Lampa said the standard relational database still does a good job of ingesting moderate amounts of mostly structured data, but when volume and variety increase, it struggles. It still has a place, but businesses dealing with true big data problems may need to supplement their relational database with something more advanced, like NoSQL or Hadoop. Similarly, businesses may have multiple needs for conducting analyses and presenting findings, which could require multiple systems.
Lampa laid out a map for an enterprise data warehouse architecture in which data first goes to a staging area where the focus is on quality and usability. For many organizations, this may mean Hadoop. From there, the data is either stored in a database or moved into analytic sandboxes. Some of these sandboxes may be dedicated to retrospective business intelligence functions, while others are focused on mining data for meaningful correlations. The next layer could be an analytic application that puts data mining findings into production on an ongoing basis, such as monitoring sensor data for signs of machine failure. Alternatively, the findings could be sent to a visualization application.
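To make the flow concrete, the layers Lampa described can be sketched as a toy pipeline. This is only an illustration under stated assumptions, not anything presented in the webinar: the function names, the record fields (machine_id, temp_c, source), the routing rule and the 90-degree threshold are all hypothetical.

```python
def stage(raw_records):
    """Staging area: keep only records that pass a basic quality check."""
    return [
        r for r in raw_records
        if r.get("machine_id") and isinstance(r.get("temp_c"), (int, float))
    ]


def route(staged, wanted_for_analysis):
    """Send each staged record either to the database or to an analytic sandbox."""
    database, sandbox = [], []
    for r in staged:
        (sandbox if wanted_for_analysis(r) else database).append(r)
    return database, sandbox


def failure_alerts(sandbox, threshold_c=90.0):
    """Analytic application: flag machines whose readings exceed an assumed threshold."""
    return sorted({r["machine_id"] for r in sandbox if r["temp_c"] > threshold_c})


# Hypothetical sample data; a real staging layer would read from Hadoop or similar.
raw = [
    {"machine_id": "m1", "temp_c": 95.2, "source": "sensor"},
    {"machine_id": "m2", "temp_c": 71.0, "source": "sensor"},
    {"machine_id": None, "temp_c": 88.0, "source": "sensor"},  # fails the quality check
    {"machine_id": "m3", "temp_c": 64.5, "source": "billing"},
]

staged = stage(raw)
database, sandbox = route(staged, lambda r: r["source"] == "sensor")
print(failure_alerts(sandbox))
```

In a real deployment each function would be a separate system, which is the complexity the article describes. The sketch only shows how responsibility passes from one layer to the next.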
This may sound like big data complexity run amok, but Lampa said it is all part of building a modern data architecture that supports innovation and helps businesses become more competitive.
"I have to think about those things in concert," he said. "Why am I doing analytics? It's about meaningful business insights. There is a need for architects to develop patterns that have the potential to be meaningful."
Some commentators have suggested that the current ecosystem of data storage and analysis tools is too crowded and that businesses aren't served by having to implement a unique application for every function. This may not be the case forever, but it is the reality businesses are dealing with today.
This is why, Lampa said, it is important for businesses to look beyond the big data hype and figure out for themselves what tools they need to fit their specific business problems. Otherwise big data technology could consume IT without delivering any return on investment.
"We're inundated with this big data hype, but what does it mean from a practitioner's point of view?" he said.