Lack of skills remains one of the biggest data science challenges

Many enterprises are struggling with the complexity of today's big data and data science ecosystem, though they recognize the opportunity of emerging practices.

The shortage of trained data scientists remains at the top of the list of data science challenges enterprises face today, according to new research from TDWI.

"We hear constantly that the biggest challenge any organization faces in a data science environment is finding the right skills," said Fern Halper, vice president and research director at TDWI, based in Renton, Wash., in a webcast highlighting the recent findings.

The research surveyed more than 300 enterprises on their experiences with big data and data science. The two topics are increasingly blending into one another, as organizations need workers who can make sense of the massive troves of data they've been collecting over the last several years.

Other common challenges cited by survey respondents included lack of clarity around who owns certain data, lack of understanding of big data tools, lack of enterprise architectures needed to harness big data, security and privacy concerns, and insufficient governance protocols.

The technology piece appeared particularly vexing. Halper said many new tools have emerged within the last few years, including Hadoop, Spark, Python and others, and enterprises are having a hard time staying on top of all these rapid developments.

"Some respondents thought there were too many technologies and a lot of hype out there," she said. "They didn't know what to do. Others thought things are changing so fast, and they're not nimble enough to maintain the best architectures."

For now, enterprises are sticking with the tools they know, in part to address these data science challenges. About 80% of survey respondents said they currently use data warehouse tools as their primary data source. For analysis, simple query and data visualization tools top the list of most-used tools. Over the next two years, data warehouse tools will remain prominent, but the top two technologies enterprises plan to add during that time are Hadoop and open source R.

Halper said the results show clear momentum around unstructured data querying and predictive analytics, including machine learning. At the same time, it doesn't look like these emerging tools and practices are going to completely unseat more tried-and-true tools in the foreseeable future.

"The data warehouse isn't going away, but it's being supplanted by these other types of platforms and creating an ecosystem," she said. "There's lots of momentum around predictive analytics. It's a hot technology, and machine learning is making it hotter."
