February 2015

Strategies, tactics and tools for making big data applications count

Big data management and analytics initiatives can give organizations a wealth of insight into their internal operations, market trends, customer behavior and other business factors. But it isn’t easy to plan, implement and manage big data applications involving a variety of data and technologies such as Hadoop clusters and NoSQL databases. If companies aren’t careful, their big data investments could go for naught.

This e-book provides advice on making key aspects of the process work, as well as real-world examples of big data projects in various industries. The first chapter focuses on combining stream processing and big data technologies to support real-time analytics capabilities that can drive faster and more informed decision making. The second offers guidance on managing big data analytics efforts without stifling the work of the data scientists and other analysts who are trying to uncover valuable business information. And the third examines the opportunities for visualizing big data analytics findings to help business users better understand them—and the challenges that analytics teams face in doing so effectively.


  • Data-hungry organizations use stream processing to digest big data

    In the data age, information never stops flowing -- and its breakneck speed never flags. Organizations with a need to harness that speed don’t want to miss the chance to take advantage of even a single 1 or 0 or unstructured bit of information. Some are dealing with so much data -- think terabytes on a daily basis -- they need a wholly novel way of handling it. In this e-book chapter, SearchBusinessAnalytics executive editor Craig Stedman looks at companies experimenting with a blend of real-time stream processing and big data technologies to quickly evaluate and act on a deluge of data. Imaging products and services provider DigitalGlobe is one of them, beta-testing a real-time analytics service that’s powered by a Hadoop cluster. The company rifles through reams of geo-tagged social media posts to help add to data that its satellites are taking in -- and then serves up that information to its customers, which include government agencies, global development organizations, and oil and gas companies. “What we want to do is go from reporting on events to anticipating them, and ultimately changing outcomes,” said Tony Frazier, a senior vice president at DigitalGlobe. “The more we can get ahead of those events, the more we can equip the people who are going to take action.”

    Stream processing hasn’t hit the mainstream yet. According to a Gartner survey, just 22% of 218 respondents with ongoing or planned big data initiatives said they were using stream or complex event processing technologies. That could change, thanks to new technologies like the Apache Spark processing engine, the Storm real-time computing system and the HBase database for algorithm-based analytics. Consultant William McKnight says the increasing popularity of advanced analytics tools and techniques will generate “a more voracious need for timely data” -- for organizations that need that kind of speed and power. Financial institutions, for example, can use stream processing to analyze stocks or detect fraud. Manufacturers might tap into it to gather data from sensors or industrial equipment and spot maintenance problems before there’s a failure.

    But adding stream processing to a big data technology doesn’t just build muscle; it also introduces new challenges. Performance monitoring, for example, is more difficult. “You need to be cognizant that this is going to be running 24 hours a day, seven days a week,” said Russell Cardullo, technical lead on a Spark implementation for online advertising company Sharethrough. Other challenges include building an architecture robust enough to handle the workload and ensuring that analytics and business processes can harness the data.

  • Balance innovation and governance with big data projects

    Analytics opportunities are booming, and surveys show big data projects are growing as well. In the new big data world, that should be exciting news for data scientists and business analysts itching to come up with ways to use all of the information at their fingertips. But the big data frontier isn't all about letting data scientists run wild with the information, and if the public response to Edward Snowden's NSA revelations is any indication, customers won't be happy if organizations let analysts have a free-for-all with their information. Balance is key.

    For some companies, lawyers are getting involved with big data projects. At personal finance software company Intuit, lawyers, analytics managers, data scientists and others teamed up and made rules for accessing and analyzing different sets of customer data. This method may seem like a scary or stifling prospect for some data scientists, but such governance is inherently more collaborative and less controlling, which is a good thing for the analytics and the legal teams. Because the two groups with seemingly opposing objectives worked together instead of reacting to each other, they were better able to meet their objectives.

    But a successful approach to big data projects doesn't have to mean full-on collaboration between lawyers and data scientists. Companies can find the mix of input that works for them. In this e-book chapter, get advice on striking the right balance between maintaining control over the big data analytics process and giving data scientists the freedom they need to do their jobs effectively.

  • Big data picture complete with data visualization projects

    IT directors at companies that are actively pulling in reams of big data may feel like they have arrived. They've put the tools in place to gather information, and they've plundered the data for treasures. But all of the discoveries are useless if they aren't presented in the right way to the right people. Data visualization projects are an essential part of big data and analytics initiatives.

    So what is the key to successful data visualization projects? Paul Bradley, chief data scientist at healthcare software company ZirMed, said visualizations shouldn't overwhelm the reader. They must be tailored to their audience. For some audiences, that means making everything as simple as possible. Other visualizations are for a more technical audience and can contain more complicated material.

    In this e-book chapter, SearchBusinessAnalytics site editor Ed Burns writes about how companies approach data visualization projects. Find out what tools they've used, where they've run into trouble and where they've had success.

