How The New York Times uses predictive analytics algorithms

Most print media companies have struggled to make money in the 21st century, but The New York Times is using predictive analytics tools to gain a competitive edge.

In the middle part of the last decade, when the Internet replaced print publications as the primary source of news for many people, revenue at most news organizations plummeted. Advertisers were less willing to pay high rates for space in print newspapers and online ads were less proven. This left news organizations scrambling.

Many still have not adjusted to the new business of news. But The New York Times, for one, is starting to make predictive analytics a major part of its business model in an effort to adjust to the modern realities. From trying to get more people to subscribe to promoting articles on social media, the news organization is letting predictive models guide many of its business decisions, and it's hoping this approach will make it as successful in the 21st century as it was in the last.

In a presentation at the Predictive Analytics World conference in Boston, the Times' chief data scientist, Chris Wiggins, talked about how he and his team use predictive analytics algorithms to do things such as funnel analysis to see how people become subscribers, and how to influence more to do so. They also use natural language processing to understand content topics that generate the most reader engagement, so marketing teams can know what types of articles to promote.
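
As a rough illustration of what funnel analysis involves, the Python sketch below counts how many readers reach each step on the way to subscribing and what share moves on to the next step. The stage names and sample events are invented for this example; the presentation did not detail the Times' actual pipeline, so this is only a sketch under those assumptions.

```python
from collections import defaultdict

# Hypothetical funnel stages, in order (assumed for illustration,
# not taken from the Times' actual systems).
FUNNEL_STAGES = ["visit", "register", "paywall", "subscribe"]


def funnel_counts(events, stages=FUNNEL_STAGES):
    """Count users who reached each stage and every stage before it."""
    seen = defaultdict(set)
    for user_id, stage in events:
        seen[user_id].add(stage)

    counts = []
    for i in range(len(stages)):
        required = set(stages[: i + 1])
        counts.append(sum(1 for reached in seen.values() if required <= reached))
    return counts


def step_conversion(counts):
    """Share of users who moved from each stage to the next one."""
    return [
        (nxt / prev if prev else 0.0)
        for prev, nxt in zip(counts, counts[1:])
    ]


# Made-up sample events as (user_id, stage) pairs.
events = [
    ("u1", "visit"), ("u1", "register"), ("u1", "paywall"), ("u1", "subscribe"),
    ("u2", "visit"), ("u2", "register"),
    ("u3", "visit"),
    ("u4", "visit"), ("u4", "register"), ("u4", "paywall"),
]

counts = funnel_counts(events)
rates = step_conversion(counts)

for stage, count in zip(FUNNEL_STAGES, counts):
    print(f"{stage}: {count} users")
```

The step with the lowest conversion rate is where the most readers drop out, which is the kind of signal a team could use to decide where to try to influence more people to subscribe.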

An outsider steps into the news

Wiggins may seem like a strange choice to lead the data operations at a newspaper. He has a Ph.D. in theoretical physics and has spent most of his career in academia doing biological research. But most of his research has taken advantage of machine learning and other advanced statistical methods. He said applying these types of predictive analytics models to the traditional business of newspapers is not so different from using them in biology, which historically has not been an exceptionally data-driven field of science.

For someone who has worked for years in higher education, Wiggins takes a decidedly unacademic approach to his work at the Times. He said he makes his team avoid projects that have only theoretical business value and instead focus on things that are clearly useful.

"It should be clear to everyone in the company why something we're doing is valuable to the company," Wiggins said. "You should only do things that are actionable."

To get to this point, Wiggins has built a team that leans more toward general computer science skills than toward statistics. He said this tack helps in taking a model from development to production, because people who know programming can build an app or Web portal more easily than some data scientists can. The team uses Python, which is generally more programming-intensive, for most projects rather than a stats-centric predictive analytics tool like R.

"[Python] draws in more people that skew more computer science, but it also ensures when we're done with something, it doesn't die a cold death as a slide deck," Wiggins said.

Data does not make editorial decisions

Even as the Times scores some wins with predictive analytics algorithms, there is one area Wiggins said his team will never infiltrate: the editorial department. He acknowledged that there are lots of other news organizations out there that use analytics to drive editorial decisions, but he said it's important to know when to take a step back. Right now, the quality of the paper's editorial judgment is the primary thing that sets it apart from many competitors. It's hard to see how that aspect could improve by making it more data-driven.

But in most other areas, analytics is helping the Times become a 21st century news organization. Wiggins said the key is leading a change in mind-set that makes people look to data first to answer their questions.

"Data is more and more getting recognized at The New York Times as a first-class citizen," he said.

Ed Burns is site editor of SearchBusinessAnalytics. Follow him on Twitter: @EdBurnsTT.

What industries do you think are ripe for improvements through predictive analytics algorithms?
There have already been developments in healthcare using predictive analytics (Four Use Cases for Healthcare Predictive Analytics, Big Data). However, I believe there is an almost limitless supply of interesting and impactful problems in the healthcare industry that could benefit from predictive analytics work.

There is also a similarity between the editorial freedom mentioned above and the medical field.  Doctors will never be replaced by an algorithm - but there can be tools developed to help diagnose, provide aftercare, reduce costs, research diseases, etc.
B2B and B2C sales lead generation is ripe for improvement through predictive algorithms.
The ability to accurately predict suspects, prospects, leads and influencers dramatically reduces the time and effort associated with prospecting. By augmenting the prospecting data with predictive insights gleaned from weak signals, incidental similarities, buyer intent and propensity to influence, sales teams will gain a competitive advantage with a measurable ROI.
I definitely agree with Sulscott. As much as has been done in healthcare, there still seem to be a number of untapped data sources that could be game changers.

Lead gen is an interesting problem for analytics too. I know some businesses are making progress there, but kevinneary is right that it's still not ubiquitous.
The company's publisher, Arthur Sulzberger Jr., is so far showing some confidence in his own investor-owned company this year. Though he is not buying any of the company's stock, he has not, so far at least, sold any NYT stock this year. In the past, he has treated the stock like his own personal piggy bank. In 2014, for example, he sold almost $2 million of the stock. I don't know what the predictive software says about that.
I agree with kevinneary from the sales point of view. For companies that spend a lot on media campaigns, prospect data can deliver a better ROI, especially in B2C sales.

Big data is not the answer to every problem. However, when there is nothing else to back a decision, it can bring clarity.

From my perspective, a notable factor driving the trend is the nature of the data: it's naturalistic, relatively comprehensive, and much of it is social (broadly defined to include link sharing, traffic patterns, etc. in addition to the more obvious like social network data). These attributes are hugely valuable to a wide variety of scientific and business uses.

There are many (social) scientific disciplines, like sociology, epidemiology, linguistics, and psychology, that previously had to generate their own data or make do with sparse data (both in terms of breadth and depth). Issues like generalizability become almost moot when you can sample such large percentages of the population. In fact, much of inferential/experimental statistics is dedicated to this issue: how much can I trust this to be true?

You're pretty darn certain if you've sampled everybody. (Though the large N places a real emphasis on effect size rather than statistical significance.)

For an article that starts off with "How...", I found it a little lacking in examples of how the NYT actually found value with it. It seems sort of 'meta' to me.

But if you want to know more about the extent to which it's used, this article does at least attest that it is. Hrms... I wonder how others have used predictive analytics to make better decisions.