In this edition of Talking Data, we recap the Spark Summit, including critiques of the Spark Streaming engine and where the open source data processing ecosystem goes from here.
Each year, for the past few years, the Spark Summit has made news. But with so much happening over the course of the conference, there's always plenty of noteworthy information that falls through the cracks.
In this edition of the Talking Data podcast, we look at some of the underreported happenings at the Spark Summit to get a more complete picture of how enterprises perceive the Spark platform and where it stands in its march toward maturity.
Of particular interest were comments made by Doug Cutting, chief architect at Hadoop vendor Cloudera and one of the original creators of Hadoop. He reiterated his support for Spark as a replacement for the MapReduce execution engine that has been a central component of Hadoop. But he cautioned developers and engineers against falling in love with any one technology, saying that, just as elements of Hadoop have been displaced by Spark, so too will Spark eventually be outmoded.
Several engineers were also on hand to talk about their experiences implementing the streaming analytics functions that come with Spark. Streaming is one of the hottest areas of analytics today, and the fact that Spark comes with packages to help enterprises quickly get up and running with streaming certainly makes it attractive. But some engineers who have been working with Spark Streaming reported that it still has plenty of bugs to work out, and that implementing it may not be as simple as some businesses interested in streaming analytics might hope.
Listen to the podcast for more details on these topics and other developments that came out of this month's Spark Summit.
Details emerge about the forthcoming release of Spark 2.0
Data scientists love Spark's built-in libraries
Spark benefits from growing interest in the internet of things