Essential Guide

Structuring a big data strategy

A comprehensive collection of articles, videos and more, hand-picked by our editors

Data science team building 101: Cross-functional talent key to success

Two experienced data scientists offer tips on best practices for setting up data science teams and incorporating big data into data science programs.

Forging a data science team process is more important than filling the hard-to-fill position of data scientist, according to a data science team leader. That makes sense, because the data scientist job title, which is still being defined, has expanded to the point where it describes something no one person can do.

By various accounts, the data scientist job description has already come to include R programmer, Scala developer, Hadoop jockey, data quality expert, domain specialist, algorithmic modeler and more. What is needed instead, according to Dan Mallinger, who serves as data science team lead at Mountain View, Calif.-based consultancy Think Big Analytics, is a focus on the overall data science process, rather than an expectation that one person can fill the entire role.

"Data scientist may be the worst defined job role in recent memory," Mallinger told an audience at the recent BigData TechCon event in Boston, Mass. Poorly defined it may be, but that has not held down data scientist salaries. The cost of the data scientist may be a hidden expense in big data efforts.

"Things like Hadoop are cheap from a computer standpoint, but they are really expensive from a human resources point of view," he said. "We need a breadth of people and a breadth of skills" for new styles of high-volume analytics, he added.

In his practice, Mallinger has seen a guide for moving from a lone data scientist rock star to a holistic, team-based process.

He said one of the best data science teams he had seen was "a cross-functional group" that included a business analyst and a data quality engineer, as well as product managers who tied the analytical effort back to business objectives. "They have done more than I have seen most teams do," he said.

What shared interest did they have? "They all had an interest in R," Mallinger said, referring to the popular statistical programming language. In any case, an evolution in team skills is apparently underway as variety, volume and other factors change the face of data analytics.

"Big data introduces big data business cases. They are fundamentally different," Mallinger said. These cases comprise many jobs that, in his words, "people didn't think about before."

The variety of big data was also emphasized by BigData TechCon participant Adam Laiacano, a data scientist and engineer at Tumblr, the social blogging site headquartered in New York City. He described big data as "data that was never generated before … that may have value."

He called this data "exhaust," indicating it is a byproduct of operations. It is, for example, unstructured and semi-structured data generated by users' Web activity. Laiacano likened the job of the data scientist to that of an engine turbocharger, which uses a motor's exhaust to boost the amount of air entering the cylinders, increasing horsepower. The data science process now is about working with that "collateral" data.

Laiacano said data science professionals should work to make sure big data is used, not just gathered. He said that when no one is using the data, it is a telltale sign that a big data project is on the wrong track. The first user could be a data analytics team member.

"You are user number one. If you don't use it yourself, that itself is a tip-off," he said. He advised that, for a big data project, people should first find and study data that is useful to their own work, and then see whether it is also useful to business users in the organization.
