Data science as a service provides instant access to analysts

Data scientists are in high demand, but in such scarce supply that some companies outsource their data for analysis. DataScience Inc. CEO Ian Swanson explains how it works.

Companies need to squeeze value out of the vast amounts of data they collect, but many can't seem to find enough data scientists to do it. That's where data science as a service comes in.

Organizations in those straits can outsource their raw data to firms such as DataScience Inc., a fast-growing startup in Culver City, Calif. Its team of analytical minds cleans up the data and uses a mix of complex modeling tools, homegrown software and intellectual curiosity to deliver insights that clients can use to improve products and services; grow their customer base; and, ultimately, increase revenue.

SearchBusinessAnalytics spoke with DataScience CEO and founder Ian Swanson about the burgeoning data science as a service space. He discussed what his team of data scientists provides that data analysis tools alone cannot and the ways in which data science drives revenue. He also explained how his venture-backed company has acquired and developed a strong stable of data scientists and data engineers in less than two years, while massive tech companies often struggle to do the same.

Companies have long used data to improve operational efficiency, but there's growing emphasis on using it to connect with customers in new ways to boost sales. What are some novel ways you are able to use data today?

Ian Swanson: We have subscription e-commerce companies [as clients] that do over a billion in business [annually]. We can identify the [customers] who are at risk of leaving next month, what their lifetime potential is and how to retain them. Lifetime value is a core thing we work on with customers. ... Many companies use crazy Excel math to try to figure this out. We look at [many attributes and features] at a granular level to determine the lifetime value of a customer -- attributes we can act on to grow customers by targeting specific advertising practices. ... "Let's understand your audience at a deep level."
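As a rough illustration of the lifetime value and churn-risk ideas Swanson describes, the Python sketch below uses a common textbook simplification, not DataScience's actual method: with a constant monthly churn probability, a customer is expected to stay for one over that probability months. The function names, the `churn_score` field, the threshold and the sample data are all hypothetical.

```python
def expected_lifetime_value(monthly_margin, monthly_churn):
    """Textbook approximation of customer lifetime value.

    With a constant monthly churn probability p, the expected customer
    lifetime is 1 / p months, so lifetime value is margin / p.
    """
    if not 0 < monthly_churn <= 1:
        raise ValueError("monthly_churn must be in (0, 1]")
    return monthly_margin / monthly_churn


def at_risk(customers, threshold=0.5):
    """Return IDs of customers whose precomputed churn score exceeds threshold.

    The scores are assumed to come from some upstream model; this function
    only applies a cutoff to them.
    """
    return [c["user_id"] for c in customers if c["churn_score"] > threshold]


# Hypothetical data for illustration only.
customers = [
    {"user_id": "u1", "churn_score": 0.82},
    {"user_id": "u2", "churn_score": 0.10},
]

print(expected_lifetime_value(20.0, 0.05))
print(at_risk(customers))
```

A granular approach like the one described in the interview would replace the single churn rate with per-customer estimates built from many attributes; the arithmetic above only shows the basic relationship between churn and lifetime value.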


How important is data monetization to your clients?

Swanson: Some companies look at it as black and white -- how do we sell data? The way we look at it is: how do we use data to grow revenue?

We work with a connected technology company to figure out how their customers actually use their product. Companies might use focus groups to get a sense of how to market themselves, ... but the elements we are able to pull together show why and how customers use their products. We look at all their data and employ multiple data science techniques [to figure out how most people use the product] and determine how to market the product. We make recommendations to help customer support teams convert customers from a detractor into a promoter using the levers our clients can control.

We can also predict supply and demand for a product that hasn't even launched yet, so a company can include it in their financial forecast. This is often done by companies using pie-in-the-sky logic, but we apply science to do it.

The companies that use data science as a service -- do they typically have any data scientists on staff?

Swanson: Yeah, all of our clients have at least one. Fortune 500 companies have great data science teams, but they might [not be focused on] marketing and customer service and human resources. ... Internal data science teams often don't have the expertise or the capacity [to do it all]. We are a team of 75 people, and 70 are data engineers or data scientists.

We are heavy on that side of the house, but we've been building [intellectual property] as well, figuring out the problems people need solved. I told my team, any tool you need to do your job more efficiently, we will buy. In this space there's a smattering of disjointed tools -- for connecting, cleaning, exploring, data wrangling, modeling -- that don't work well together. We have been using [our own] tools in house, in production, and we will package them and provide them to clients to use, so they can be using the same tools that we do. (DataScience plans to roll out its packaged tools shortly. The company declined to provide additional details.)

What type of infrastructure do you use to support all the data crunching you do?

Swanson: We are very heavy Amazon [Web Services] users -- but our technology works across Azure, too, so we aren't stuck there. In terms of data science tools, if you think of the path of a data analyst, they work with R or Python or Scala. We are about 5% R, and heavy Python users, but also on the cutting edge of Scala and Spark. We build real predictive models.

Data science as a service's success depends on companies trusting a third party with their most precious commodity -- their data. How do you overcome that trust issue and data security concerns?

Swanson: Data privacy and security is incredibly important to us, and we don't necessarily need personally identifiable information. I don't need to know that a customer's name is Joe Smith, for instance -- I may only need his user ID. So, we can work with large public companies. ...

We have passed the [data privacy and security] test before, by a team of 90 people who vetted us at American Express. (American Express acquired Swanson's virtual currency company, Sometrics, in 2011.) We haven't had one customer turn us down because of data security concerns.
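Swanson's point about needing only a user ID rather than a name can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a description of how DataScience or its clients handle data: the field names, the choice of email as the stable identifier, and the key handling are all hypothetical. It uses the standard library's `hmac` and `hashlib` modules to derive a keyed, repeatable token, then drops the identifying fields before the records leave the client.

```python
import hashlib
import hmac


def pseudonymize(records, secret_key, pii_fields=("name", "email")):
    """Return copies of records with PII fields removed and a keyed-hash ID added.

    The token is derived from the (hypothetical) email field, so the same
    customer always maps to the same user_id without exposing who it is.
    """
    cleaned_records = []
    for rec in records:
        digest = hmac.new(
            secret_key, rec["email"].encode("utf-8"), hashlib.sha256
        ).hexdigest()
        cleaned = {k: v for k, v in rec.items() if k not in pii_fields}
        cleaned["user_id"] = digest[:16]  # shortened for readability
        cleaned_records.append(cleaned)
    return cleaned_records


# Hypothetical record for illustration only.
raw = [{"name": "Joe Smith", "email": "joe@example.com", "orders": 3}]
shared = pseudonymize(raw, secret_key=b"client-held-secret")
print(shared)
```

Because the key stays with the client in this sketch, the outside analyst can join and count by `user_id` but cannot reverse the token back to a name or email address.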

You are competing with so many companies to hire data scientists, and the talent pool is shallow. How have you been able to get so many on your team?

Swanson: We raised $30 million [in venture funds] over the past year and a half, and when I was talking to VC firms, they all said we were crazy, we'd never be able to hire [the right people]. We proved in three months that we could scale the business and grow. Now we get over 1,000 resumes a month from people who want data science jobs.

We have created an environment that tugs at the heartstrings of people in this space; ... they feel that this is a think tank -- a think tank that makes money by solving problems. And they appreciate that data science is all we do, so they know they won't be among three people in a basement who aren't supported by the engineering team. ... We are growing at a rate of 10 people per month.

We also do things like fly in speakers, we host events ... and we started DS12 -- a 12-week residency data science program [for students]. It's a real curriculum that isn't about entry-level data science, and we don't charge for it; we pay for their housing, they get a small salary. We will open the curriculum to other companies so they can learn from it, too. That's about adding value [to the data science community].

Massive tech companies like IBM and Microsoft emphasize big data analysis, and they are looking to hire data science experts. I could imagine someone like that acquiring your firm, to get instant access to the talent. Have you been approached by a major tech company about an acquisition?

Swanson: Many -- yeah.

I have to be coy how I say this -- the big companies have been knocking on our door, and we have said, no, and the reason why is that there are a lot of logos in this space, but not many have figured out how to add value. [DataScience] could become a massive company. The five-year vision is we want to be the thought leader in the research, the education, the service work and the intellectual property, and that [combination] doesn't exist today.

A Ph.D. isn't a prerequisite to be a data scientist -- you need to be a specialist in certain techniques, certain lines of business, so our mix of talent is unique, and that's of interest to companies.

Meanwhile, there are so many self-service analytics tools to help business people connect the dots and make better business decisions without the aid of data experts. What do companies need data scientists for?

Swanson: Tools like Tableau or Domo are good for visualization and general knowledge, but not for making decisions that impact the future of the company; they are a window back in time, but not a window into the future. They give an overview of the current health, but not the company's future health.

We use models to do things like predict customer churn with 95% accuracy. ... Some customers wonder, is that [percentage] real? But think about how people make decisions now. They do it in a boardroom, looking at Excel sheets. We say, let's apply some science to the process. It's another weapon at the table -- combined with your gut and your experience. 
