“Big data” has become one of the most talked-about trends -- and yes, buzzwords -- within the business intelligence (BI), analytics and data management markets. A growing number of organizations are looking to BI and analytics vendors to help them answer business questions in big data environments. Unfortunately, gaining visibility into pools of big data is easier said than done. And with vendors marketing a wide variety of technology offerings aimed at addressing the challenges of big data analytics projects, businesses may be hard-pressed to identify the one that best meets their needs.
So, what is big data -- really? A recent story by the IT publication eWeek offered the following take on it, based partly on Gartner Inc.’s definition of the term: “Big data refers to the volume, variety and velocity of structured and unstructured data pouring through networks into processors and storage devices, along with the conversion of such data into business advice for enterprises.”
That hits the mark in terms of data management and the analytics part of the equation, but it misses an essential aspect of the business challenges surrounding big data: complexity. For instance, big data installations often involve information -- from social media networks, emails, sensors, Web activity logs and other data sources -- that doesn’t fit easily into traditional data warehouse systems.
And in many cases, all of that disparate data needs to be pulled together in order to make sense of it on a broader level. That can have big implications for business rules, table joins and other components of big data analytics systems. The complexity of big data is what really sets it apart from more conventional data when it comes to storage and query management, and it’s the main reason analytical database and data analytics software vendors have had to step up their game to help companies deal with big data.
Understanding big data is the first step in assessing your technology needs and putting a big data analytics plan in place. The second is understanding the market and the current trends that are affecting organizations looking to derive business value, and competitive advantages, from increasingly large and diverse data sets.
Big agendas for big data analytics projects
Many businesses have always had large data sets, of course. But now, more and more companies are storing many terabytes of information, if not petabytes. In addition, they’re looking to analyze key data multiple times daily or even in real time -- a change from traditional BI processes for examining historical data on a weekly or monthly basis. And they want to process increasingly complex queries that involve a variety of data sets. That might include transaction data from enterprise resource planning and customer relationship management systems, plus social media and geospatial data, internal documents and other forms of information. Increasingly, companies also want to give business users self-service BI capabilities and make it easier for them to understand analytical findings.
All of that can play into a big data analytics strategy, and technology vendors are addressing those needs in different ways. Many database and data warehouse vendors are focusing on the ability to process large amounts of complex data in a timely fashion. Some are using columnar data stores in an effort to enable quicker query performance, or providing built-in query optimizers, or adding support for open source technologies such as Hadoop and MapReduce.
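To make the columnar idea concrete, here is a minimal sketch in Python of why storing data by column can speed up analytical queries. It is an illustration only, not how any particular vendor's product works; the sample records and the function names (`to_columns`, `total_row_store`, `total_column_store`) are invented for this example.

```python
# Hypothetical sales records, invented for illustration only.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 75.5},
    {"order_id": 3, "region": "east", "amount": 30.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_row_store(records, field):
    """Row layout: every full record is visited to read one field."""
    return sum(record[field] for record in records)


def total_column_store(columns, field):
    """Column layout: only the single column the query needs is read."""
    return sum(columns[field])


columns = to_columns(rows)
print(total_row_store(rows, "amount"))
print(total_column_store(columns, "amount"))
```

Both functions return the same total. The difference is in how much data each one has to touch: the row-oriented version walks through every record in full, while the column-oriented version reads one list. On a table with hundreds of columns and billions of rows, that difference is a large part of what columnar stores are designed to exploit.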
In-memory analytics tools may help speed up the analysis process by reducing the need to transfer data from disk drives, while data virtualization software and other real-time data integration technologies can be used to assemble information from disparate data sources on the fly. Ready-made analytics applications are being tailored to vertical markets that routinely have to deal with big data -- for instance, the telecommunications, financial services and online gaming industries. Data visualization tools can simplify the process of presenting the results of big data analytics queries to corporate executives and business managers.
Organizations whose data and analytics needs match those described above should begin by considering the following issues and questions, among others, before creating an implementation plan and finalizing their big data infrastructure choices:
- The required timeliness of data, as not all databases support real-time data availability.
- The interrelatedness of data and the complexity of the business rules that will be needed to link various data sources to get a broad view of corporate performance, sales opportunities, customer behavior, risk factors and other business metrics.
- The amount of historical data that needs to be included for analysis purposes. If one data source contains only two years of information but five are required, how will that be handled?
- Which technology vendors have experience with big data analytics in your industry, and what is their track record?
- Who is responsible for the various data entities within an organization, and how will those people be involved in the big data analytics initiative?
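The historical data question in the list above lends itself to a quick check early in planning. The following Python sketch flags data sources that hold less history than an analysis requires. The source names, dates and the `coverage_gaps` function are hypothetical, and the calculation assumes each source is continuous from its earliest record to the review date.

```python
from datetime import date


def coverage_gaps(sources, required_years, as_of):
    """Return the shortfall in years for each source with too little history.

    `sources` maps a source name to the date of its earliest record.
    A year is approximated as 365.25 days.
    """
    gaps = {}
    for name, earliest in sources.items():
        available = (as_of - earliest).days / 365.25
        if available < required_years:
            gaps[name] = round(required_years - available, 1)
    return gaps


# Hypothetical inventory: earliest record available in each source.
sources = {
    "erp_transactions": date(2006, 1, 1),
    "social_media_feed": date(2010, 1, 1),
}

gaps = coverage_gaps(sources, required_years=5, as_of=date(2012, 1, 1))
print(gaps)  # {'social_media_feed': 3.0}
```

A check like this only surfaces the gap. Deciding how to handle it, whether by backfilling, limiting the analysis window or excluding the source, remains a business decision for the people responsible for that data.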
Those considerations don’t constitute in-depth requirements planning, but they can help businesses get started on the road to deploying a big data analytics system and identifying the technology that will best support it.
ABOUT THE AUTHOR
Lyndsay Wise is president and founder of WiseAnalytics, an independent analyst firm based in Toronto that focuses on business intelligence, master data management and unstructured data management.