Mike Lurye has evaluated many of the software tools that have become nearly synonymous with big data. But the Time Warner Cable executive feels that some of the big data opportunities and technologies getting the most attention today have relatively little business value, at least for his organization's current requirements.
Lurye, senior director of business intelligence architecture at Time Warner Cable Enterprises LLC, spoke about his company's big data challenges and business opportunities at the 2014 TDWI BI Executive Summit in Las Vegas. Two years ago, the country's second-largest cable TV provider wanted to create a database of the programming watched by every subscriber so advertisers could view reports and identify the shows watched by their target audiences more easily. Given the large volume of data involved in the project, big data tools like Hadoop or a NoSQL database might seem like the right choice. But Lurye said those options each had significant drawbacks from Time Warner Cable's perspective.
On the whole, Lurye is skeptical of the concept of big data. He sees the term as being too nebulous and would prefer to focus his attention on specific technologies that can help solve specific business problems. "I never met the person who invented the term big data, but that person is brilliant," he said. "It's a brilliant marketing term. It's not a technical term."
Changing the channel on Hadoop
Hadoop is the most-hyped big data technology now, and a new Hadoop 2 release that became available in October 2013 broadens its potential uses beyond MapReduce batch-processing applications. But Lurye said that when he evaluated Hadoop, he didn't see the business value for his company. At the time, he felt it was an immature technology that likely would present new technical challenges. Integrating a Hadoop cluster with existing data sources and getting reports out of it seemed like difficult tasks given the skills that Time Warner Cable had in-house.
Lurye also looked at NoSQL databases for the project, but in the end he decided that they weren't a good fit either. He said most of the NoSQL technologies he reviewed required special programming skills that are hard to find. While NoSQL databases offer some interesting capabilities, the fact that Time Warner Cable would have had to hire new programmers specifically to operate one limited the technology's value.
"It would bring us back to the days where, to retrieve any data, someone would have to write code," Lurye said. "Why would we want to do that?"
Ultimately, Time Warner Cable went a more traditional route for the BI and analytics project. Viewing data gets loaded into a standard relational database. The company then uses an in-memory BI system from MicroStrategy Inc. to get reports out of the database and deliver them to advertisers.
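As a rough illustration of why a relational store suits this kind of advertiser reporting, the sketch below loads a few viewing records into an in-memory SQLite database and runs a single declarative query to list the shows watched by one audience segment. Everything in it is an assumption made for illustration: the table names, column names, segment labels and sample rows are hypothetical, and SQLite merely stands in for whatever relational database and BI layer the company actually uses.

```python
import sqlite3

# Hypothetical schema and sample rows; the article does not describe
# Time Warner Cable's actual tables or data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subscribers (subscriber_id INTEGER PRIMARY KEY, segment TEXT);
    CREATE TABLE viewing (subscriber_id INTEGER, show_title TEXT, minutes INTEGER);
""")
conn.executemany(
    "INSERT INTO subscribers VALUES (?, ?)",
    [(1, "sports fans"), (2, "sports fans"), (3, "families")],
)
conn.executemany(
    "INSERT INTO viewing VALUES (?, ?, ?)",
    [
        (1, "Game Night", 60),
        (2, "Game Night", 45),
        (2, "Cooking Hour", 30),
        (3, "Cooking Hour", 50),
    ],
)


def shows_for_segment(conn, segment):
    """Return (show_title, viewers, total_minutes) for one audience segment,
    ordered from most-watched to least-watched."""
    return conn.execute(
        """
        SELECT v.show_title,
               COUNT(DISTINCT v.subscriber_id) AS viewers,
               SUM(v.minutes) AS total_minutes
        FROM viewing AS v
        JOIN subscribers AS s ON s.subscriber_id = v.subscriber_id
        WHERE s.segment = ?
        GROUP BY v.show_title
        ORDER BY total_minutes DESC
        """,
        (segment,),
    ).fetchall()


report = shows_for_segment(conn, "sports fans")
for show_title, viewers, total_minutes in report:
    print(f"{show_title}: {viewers} viewers, {total_minutes} minutes")
```

The point of the sketch is that the report is one SQL statement that a BI tool can generate on a user's behalf, rather than custom retrieval code written for each question.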
Door still open to big data opportunities
But even though the big data technologies weren't a good fit in this case, that doesn't mean Time Warner Cable -- which in February agreed to be acquired by Comcast, the top cable company in the U.S. -- has written them off for good. Lurye said he thinks Hadoop has matured in the time since he last evaluated it and now looks like it could be a cost-effective option for data integration.
Additionally, the cable provider is looking to start incorporating more real-time data, which might help predict who will watch a particular show. That, in turn, would enable vendors to bid on ads in real or near real time as advertisers try to target specific types of viewers. Lurye said such an application might be a good fit for NoSQL software.
But even as the company considers possible deployments of new technologies, it isn't likely to retire the systems it recently put in place.
"We see [big data technology] as a complement," Lurye said. "It's not going to replace a warehouse with 15 months of data. But it could make data available much sooner than what we have today."