Data-hungry organizations use stream processing to digest big data
Sponsored by SearchBusinessAnalytics
This chapter is included in the Strategies, tactics and tools for making big data applications count E-Book.
In the data age, information never stops flowing -- and its breakneck speed never flags. Organizations with a need to harness that speed don't want to miss the chance to take advantage of even a single 1 or 0 or unstructured bit of information. Some are dealing with so much data -- think terabytes on a daily basis -- that they need a wholly novel way of handling it.

In this e-book chapter, SearchBusinessAnalytics executive editor Craig Stedman looks at companies experimenting with a blend of real-time stream processing and big data technologies to quickly evaluate and act on a deluge of data. Imaging products and services provider DigitalGlobe is one of them, beta-testing a real-time analytics service that's powered by a Hadoop cluster. The company rifles through reams of geo-tagged social media posts to augment the data that its satellites are taking in -- and then serves up that information to its customers, which include government agencies, global development organizations, and oil and gas companies.
“What we want to do is go from reporting on events to anticipating them, and ultimately changing outcomes,” said Tony Frazier, a senior vice president at DigitalGlobe. “The more we can get ahead of those events, the more we can equip the people who are going to take action.”
Stream processing hasn't hit the mainstream yet: in a Gartner survey, just 22% of 218 respondents with ongoing or planned big data initiatives said they were using stream or complex event processing technologies. That could change, thanks to newer technologies such as the Apache Spark processing engine, the Storm real-time computing system and the HBase database for algorithm-based analytics. Consultant William McKnight says the growing popularity of advanced analytics tools and techniques will generate "a more voracious need for timely data" among organizations that need that kind of speed and power. Financial institutions, for example, can use stream processing to analyze stock trades or detect fraud. Manufacturers might tap it to gather data from sensors on industrial equipment and spot maintenance problems before there's a failure.
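The chapter doesn't include code, but the predictive-maintenance idea above can be sketched in a few lines of plain Python. This is a toy illustration, not any vendor's implementation: the function name and thresholds are hypothetical, and a real streaming job (in Spark or Storm) would run the same kind of windowed check continuously over incoming sensor events.

```python
from collections import deque

def flag_spikes(readings, window=5, threshold=1.5):
    """Flag readings that exceed the trailing-window mean by a
    multiplicative threshold -- a stand-in for the kind of check a
    streaming job might run to catch equipment trouble early."""
    recent = deque(maxlen=window)   # sliding window of prior values
    flagged = []
    for i, value in enumerate(readings):
        if len(recent) == window:
            mean = sum(recent) / window
            if value > mean * threshold:
                flagged.append((i, value))
        recent.append(value)
    return flagged

# A vibration reading of 30 against a baseline near 11 gets flagged:
print(flag_spikes([10, 11, 10, 12, 11, 30, 10]))
```

In a true stream processor the `readings` sequence would be an unbounded feed rather than a list, but the windowing logic is the same.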
But adding stream processing to a big data architecture doesn't just build muscle; it also introduces new challenges. Performance monitoring, for example, is more difficult. "You need to be cognizant that this is going to be running 24 hours a day, seven days a week," said Russell Cardullo, technical lead on a Spark implementation for online advertising company Sharethrough. Other challenges include building an architecture robust enough to handle the workload and ensuring that analytics and business processes can harness the data.