Kiril Tsemekhman earned a doctorate in physics before moving into the online ad industry, eventually taking on his current role as chief data officer at Integral Ad Science Inc. So how does a physicist go on to lead cutting-edge big data analytics projects for a company that has built its business around its data collection and analysis capabilities?
"If I reflect on it, I think certain backgrounds prepare you better for data science," Tsemekhman said.
Increasingly, physics is one of those backgrounds. Tsemekhman said he's seeing a lot of people stepping out of academia and into data science jobs. On his team, his background in genetic physics is complemented by researchers with degrees in chemistry, computational neuroscience and linguistics. Other team members also have degrees in physics.
This is perhaps a necessary evolution of the data science field. At a time when demand for data scientists is rising, driven partly by the growing ranks of analytics services providers and other companies looking to monetize data, the supply isn't keeping pace. Traditional data scientists, with degrees and experience in advanced math, computer science and business disciplines, remain scarce. So, businesses are looking elsewhere -- and researchers from the hard sciences can be a good fit.
In many scientific fields, physics chief among them, statistical analysis of large data sets is common. Tsemekhman said he cut his teeth modeling the interaction of genes based on large sets of observed data. It should be no wonder that one of the most well-known people in data science circles, Kirk Borne, a principal data scientist at management consulting firm Booz Allen Hamilton who has a large social media following, got his start in astrophysics and remains active in the field today.
No strangers to analytics algorithms
The Large Hadron Collider, the world's biggest particle accelerator, operated in Switzerland by CERN, offers a good example of why physicists make good data scientists. The particle accelerator generates about 1 MB of data per collision event, and such events happen at a rate of about 600 million per second. Taken together, those figures work out to roughly 600 TB of raw data every second, far more than can be stored. It's the mother of all big data problems. Physicists write algorithms that sift through the data in real time and save only the potentially interesting events. It's not hard to see how the experience translates to commercial big data projects.
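To make the scale concrete, here is a minimal Python sketch of the arithmetic behind those two figures, along with a toy version of the "keep only what looks interesting" idea. It is an illustration only, not how CERN's systems work: the function names, the simulated events, the `energy` field and the threshold value are all hypothetical.

```python
import random

# Figures quoted in the article.
EVENT_SIZE_MB = 1
EVENTS_PER_SECOND = 600_000_000


def raw_rate_tb_per_second(event_size_mb, events_per_second):
    """Raw data rate in terabytes per second (decimal units, 1 TB = 1,000,000 MB)."""
    return event_size_mb * events_per_second / 1_000_000


def simulated_events(n, seed=0):
    """Yield n made-up events; the 'energy' field is a hypothetical stand-in."""
    rng = random.Random(seed)
    for i in range(n):
        yield {"id": i, "energy": rng.expovariate(1.0)}


def keep_interesting(events, threshold):
    """Stream through events one at a time, yielding only those above the threshold."""
    for event in events:
        if event["energy"] > threshold:
            yield event


if __name__ == "__main__":
    rate = raw_rate_tb_per_second(EVENT_SIZE_MB, EVENTS_PER_SECOND)
    print(f"Raw rate: {rate:.0f} TB per second")

    # The threshold is arbitrary; it only shows that most events get discarded.
    kept = list(keep_interesting(simulated_events(100_000), threshold=8.0))
    print(f"Kept {len(kept)} of 100,000 simulated events")
```

Because the filter is a generator, it looks at one event at a time and never holds the full stream in memory, which is the same constraint, at a vastly smaller scale, that makes real-time selection necessary in the first place.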
In fact, investment-portfolio analytics software vendor Omega Point Research Inc. employs several people who have experience at CERN. "High-energy physics is a great training ground for data science," said Omer Cedar, co-founder and CEO of the New York company, which has built a data science platform that combines an analytics engine, machine learning algorithms and a set of data feeds it has assembled for customers to use.
Not only does the experience translate from one field to the other, but big data technology is building a bridge between the research community and enterprises.
For example, Cedar, whose company uses the Databricks distribution of Spark, said academic researchers have been among the early adopters of the open source data processing engine. He's had success hiring people from academic fields after meeting them in Spark-related discussion forums.
Still, none of this means that physicists, or scientists from any other research field, automatically make good candidates for data science jobs. Working for an enterprise presents many challenges that are distinct from academic research, even if the nuts and bolts of data analysis are similar.
"If you bring in a lot of people who have no clue about the business, it becomes difficult to guide people to practical solutions," Tsemekhman said.
Data science team players wanted
When looking for new employees for Integral Ad Science, which is also based in New York, Tsemekhman tries to assess job candidates' personalities and ability to function as part of a team as much as their data analysis skills. He asks candidates to solve a data science problem and then present their findings to the rest of his team, which gives him a sense of how each person will mesh with the others. And while some academics excel at this type of challenge, others do not, Tsemekhman admitted.
Also, the level of experience most academic researchers have may be more than most businesses need. Speaking at the TDWI Accelerate conference in Boston in July, James Kobielus, a big data evangelist at IBM, said many businesses can get by with building a team of business analysts, data visualization specialists, software developers and systems architects to do the job of data scientists. Enterprises can also send employees back to school or encourage upskilling to fill any leftover skills gaps.
"You don't necessarily need a Ph.D. and five to 10 years of experience to be effective as a data scientist," Kobielus said. "Data science is not rocket science."
So rather than intentionally setting out to find academic researchers to fill data science jobs, corporate enterprises and organizations that are focused on data monetization might do better to cast their nets broadly and be open-minded to possible candidates with a range of experiences. "It's more by accident than design, but we have people from different backgrounds," Tsemekhman said.