Managing Hadoop projects: What you need to know to succeed
For companies just starting out with big data technologies, acquiring the necessary skills can be the biggest challenge to overcome, according to Shawn Rogers, vice president of research for business intelligence (BI), data warehousing and analytics at consultancy Enterprise Management Associates Inc. (EMA) in Boulder, Colo. Only the biggest companies are likely to have employees who already possess the programming know-how, analytics expertise and other big data skills that are needed to fully take advantage of technologies such as Hadoop clusters and NoSQL databases, Rogers said in a video interview recorded at the 2013 Pacific Northwest BI Summit. "All of them come with a learning curve, and so even if you have people that are good enough to get into this type of work, most companies are investing in some sort of training," he said.
Rogers added that while such learning curves are to be expected with new and advanced technologies, EMA's research shows that the skills gap is the primary roadblock to adoption of big data tools by companies. Demand for those skills is also running high. One online job site that he checked listed about 800 open jobs with "data scientist" or "data science" in their titles. "That's a lot of job openings," he said. And data scientists don't come cheap: All of the jobs Rogers looked at promised six-figure salaries.
A survey jointly conducted by EMA and 9sight Consulting in 2012 also found that despite all of the hype surrounding Hadoop, most companies are using a variety of technologies to support their big data applications. Seventy-two percent of the 255 respondents said their organizations were using more than one of eight technology platforms that EMA mapped into an architectural framework it calls the hybrid data ecosystem. The framework includes Hadoop systems and NoSQL data stores but also more conventional technologies such as enterprise data warehouses and specialized analytical databases. "There's this assumption that in order to be doing big data, you must be utilizing the Hadoop framework, and that's not necessarily true," Rogers said. "Big data is alive and well throughout the enterprise and throughout different systems that are capable of handling those types of workloads."
While assumptions about the pervasiveness of Hadoop may be exaggerated, the potential importance of big data analytics initiatives is not, according to Rogers. "If you're not at least thinking about big data right now, you're starting to fall behind the curve a little bit," he said. "I don't think that you have to have a project under way today in order not to be worried, but I think you ought to be thinking about it and be looking for a good opportunity to do a highly focused, shorter-term project to get your feet wet with it."
Rogers offered further thoughts on issues related to big data skills and technologies in the interview with SearchBusinessAnalytics Executive Editor Craig Stedman at the BI Summit, which took place in Grants Pass, Ore. Viewers of the eight-minute video will:
- Hear more about the findings from the EMA/9sight survey on big data adoption;
- Learn about the big data skills gap and what some vendors are doing to try to close it; and
- Find out about the growing need for data scientists to analyze pools of big data.