big data analytics

Contributor(s): Lisa Martinek and Craig Stedman

Big data analytics is the process of examining large data sets to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. The analytical findings can lead to more effective marketing, new revenue opportunities, better customer service, improved operational efficiency, competitive advantages over rival organizations and other business benefits.

The primary goal of big data analytics is to help companies make more informed business decisions by enabling data scientists, predictive modelers and other analytics professionals to analyze large volumes of transaction data, as well as other forms of data that may be untapped by conventional business intelligence (BI) programs. That could include Web server logs and Internet clickstream data, social media content and social network activity reports, text from customer emails and survey responses, mobile-phone call detail records and machine data captured by sensors connected to the Internet of Things.

Semi-structured and unstructured data may not fit well in traditional data warehouses based on relational databases. Furthermore, data warehouses may not be able to handle the processing demands posed by sets of big data that need to be updated frequently or even continually -- for example, real-time data on the performance of mobile applications or of oil and gas pipelines. As a result, many organizations looking to collect, process and analyze big data have turned to a newer class of technologies that includes Hadoop and related tools such as YARN, MapReduce, Spark, Hive and Pig, as well as NoSQL databases. Those technologies form the core of an open source software framework that supports the processing of large and diverse data sets across clustered systems.
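As a rough illustration of the MapReduce processing model mentioned above, the following single-machine Python sketch counts page hits in a handful of Web server log lines. It is not Hadoop code: the log format, the sample lines and the function names (map_phase, shuffle, reduce_phase, count_hits) are assumptions made up for this example, and a real cluster would distribute the map and reduce steps across many nodes.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical Web server log lines (invented sample data, assumed format:
# client IP, HTTP method, requested path, status code).
LOG_LINES = [
    "10.0.0.1 GET /home 200",
    "10.0.0.2 GET /pricing 200",
    "10.0.0.1 GET /pricing 404",
    "10.0.0.3 GET /home 200",
]


def map_phase(line):
    """Map step: emit one (path, 1) pair for each log line."""
    _ip, _method, path, _status = line.split()
    yield (path, 1)


def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one path."""
    return key, sum(values)


def count_hits(lines):
    """Run map, shuffle and reduce in sequence on a single machine."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    print(count_hits(LOG_LINES))
```

The point of the pattern is that the map and reduce functions only ever see one record or one key at a time, which is what lets a cluster run many copies of them in parallel on different slices of the data.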

In some cases, Hadoop clusters and NoSQL systems are being used as landing pads and staging areas for data before it gets loaded into a data warehouse for analysis, often in a summarized form that is more conducive to relational structures. Increasingly, though, big data vendors are pushing the concept of a Hadoop data lake that serves as the central repository for an organization's incoming streams of raw data. In such architectures, subsets of the data can then be filtered for analysis in data warehouses and analytical databases, or the data can be analyzed directly in Hadoop using batch query tools, stream processing software and SQL-on-Hadoop technologies that run interactive, ad hoc queries written in SQL.
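The staging pattern described above, in which raw data is summarized before it is loaded into relational structures, might look something like the following sketch. It uses Python's built-in sqlite3 module as a stand-in for a data warehouse; the table names, columns and sample clickstream events are all hypothetical, and a real deployment would use a Hadoop or SQL-on-Hadoop engine rather than an in-memory database.

```python
import sqlite3

# Hypothetical raw clickstream events: (user_id, page, seconds_on_page).
RAW_EVENTS = [
    ("u1", "/home", 12),
    ("u1", "/pricing", 40),
    ("u2", "/home", 5),
    ("u3", "/pricing", 20),
]


def summarize_events(events):
    """Stage raw events, then build a per-page summary table with SQL."""
    conn = sqlite3.connect(":memory:")
    # Staging table: one row per raw event, as it arrived.
    conn.execute(
        "CREATE TABLE raw_events (user_id TEXT, page TEXT, seconds INTEGER)"
    )
    conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?)", events)
    # Summary table: the aggregated form that would be loaded for analysis.
    conn.execute(
        """
        CREATE TABLE page_summary AS
        SELECT page,
               COUNT(*) AS views,
               COUNT(DISTINCT user_id) AS visitors,
               SUM(seconds) AS total_seconds
        FROM raw_events
        GROUP BY page
        """
    )
    rows = conn.execute(
        "SELECT page, views, visitors, total_seconds "
        "FROM page_summary ORDER BY page"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in summarize_events(RAW_EVENTS):
        print(row)
```

The same GROUP BY query is also representative of the interactive, ad hoc SQL that could be run directly against the raw data instead of against a pre-built summary.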

Big data can be analyzed with the software tools commonly used as part of advanced analytics disciplines such as predictive analytics, data mining, text analytics and statistical analysis. Mainstream BI software and data visualization tools can also play a role in the analysis process.

Potential pitfalls that can trip up organizations on big data analytics initiatives include a lack of internal analytics skills and the high cost of hiring experienced analytics professionals. The amount of information that's typically involved, and its variety, can also cause data management headaches, including data quality and consistency issues. In addition, integrating Hadoop systems and data warehouses can be a challenge, although various vendors now offer software connectors between Hadoop and relational databases, as well as other data integration tools with big data capabilities.

This was last updated in October 2014

Next Steps

IBM SPSS predictive analytics tools for big data may be the best option for your enterprise.

KNIME open source data analytics delivers commercial extensions for big data, cluster operations and collaboration.

Revolution R Open and Revolution R Enterprise are two statistical analytics products provided by Revolution Analytics.

The Teradata Aster Discovery Platform features a number of crucial components, including the Aster database and a version of R.

Join the conversation


How is "big data" different from "data mining"?
I would like to know the role of intelligent software agents in big data analytics.
Can anyone start his or her career in data analytics? What basics does it need?
At a very high level, data mining is looking for data based on specific requests from the client. Big data is analyzing patterns to understand the business and create new analytics.
Thanks. Great piece. However, the competition has changed during the past two years and, as mentioned, Hadoop and especially MapReduce platforms have gotten much more attention and importance. Due to the variety of data sources and the amount of data, players such as Tableau, Splunk and Cloudera are getting more and more attention.
How could big data help with segmenting the needs of different customer groups?
What is the difference between using a traditional data warehouse and a solution on top of it (like Cloudera) or using Hadoop for big data analytics (something like Hunk (Splunk) or Datameer)? Which one is better specifically for a medium-size company?
Having understood what big data is all about, can someone please give a list of all the popular big data software innovators? I have a small list with me that includes companies like Amazon, IBM, etc. What I need is something that is affordable for my company. I've heard of a company called Qburst Technologies that manages to give its customers satisfaction coupled with low pricing.
Big data analytics is becoming a trending topic. One of the biggest benefits comes when you take the technology and use it for the healthcare industry. Companies like Due North Analytics are able to take patients' data to determine how effective treatment is, as well as prescriptions and future cost, all of which helps the healthcare industry become more efficient. Learn more about their company and predictive analytics.
What kind of big data analytics challenges does your organization face? And what are you doing to overcome them?
There are many issues an organization faces if it implements big data.
The main ones are performance issues; if the system architecture allows optimization, then these issues can be resolved.

Another issue is with data accuracy and validation.

Having gone through several writings on big data analytics, I am convinced that its application in certain areas of our operation could increase our market share and ultimately enhance our bottom line as a bank in the retail sector.
Big data is the most important aspect that everyone in the field of business has to be aware of.
If one wants to be in some of the best management companies, one must know about all these aspects.
Please, I need help. I want to obtain a big data analytics certification. My background is in business (banking - treasury management and healthcare). I have a BS in economics. How do I get started?
Any suggestions?

