Published: 06 Oct 2016
Recently, Capital One data scientist Brendan Herger worked on a predictive analytics project aimed at identifying potentially problematic bill totals when diners using a Capital One credit card add a tip to their tabs after the initial swipe of their card.
The goal, he said, is to help cardholders avoid accidentally leaving inappropriately large tips. The analytics application triggers a text message or email to a Capital One customer if it spots a likely discrepancy, such as a math error that the restaurant didn't catch. The cardholder can then dispute the charge with either the restaurant or Capital One itself.
To develop the required analytical functionality, Herger and his colleagues at the banking and service company, based in McLean, Va., had to "train" the predictive model on historical data to define what inappropriate tips look like so they can be flagged during incoming transactions. On this type of analytics project, visualized data helps the data scientists monitor the model training process and make sure the conclusions generated by a model fit real-world scenarios. Herger said that if a data visualization shows a big spike in the analytics data at a given point compared to what came before or after, he knows something may be off in the model's algorithms, requiring additional development work to set it right.
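The kind of spike Herger describes can also be checked for programmatically alongside a chart. The following Python sketch is illustrative only and is not Capital One's method: the function name `find_spikes`, the `window` and `rel_jump` parameters, and the sample loss values are all assumptions made for the example. It flags any point that differs sharply from the median of the points just before and after it.

```python
from statistics import median


def find_spikes(values, window=2, rel_jump=0.5):
    """Return indices of points that stand out from their neighbors.

    Hypothetical helper: a point is flagged when it differs from the
    median of up to `window` points on each side by more than
    `rel_jump` times that median.
    """
    spikes = []
    for i, value in enumerate(values):
        lo = max(0, i - window)
        hi = min(len(values), i + window + 1)
        # Neighbors before and after the point, excluding the point itself
        neighbors = values[lo:i] + values[i + 1:hi]
        if not neighbors:
            continue
        local = median(neighbors)
        # Small floor keeps the comparison meaningful when the median is zero
        if abs(value - local) > rel_jump * max(abs(local), 1e-9):
            spikes.append(i)
    return spikes


# Made-up training-loss values with one obvious spike at index 3
losses = [0.90, 0.85, 0.80, 2.40, 0.75, 0.70, 0.68]
print(find_spikes(losses))
```

A flagged index is only a prompt to look more closely at the plot and the model; it does not say what went wrong.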
For analytics teams that aren't asleep at the wheel, predictive model development is never really done. Once models are put into production, they need to be continually verified and adjusted as needed to ensure that they deliver accurate predictions over time. Data visualization tools and techniques can play an important role in those efforts.
Boris Savkovic, lead data scientist at analytics services provider BuildingIQ in San Mateo, Calif., said he and his team continually plot out data from buildings owned by clients to compare model-based expectations of energy use to actual historical data. The visualized data helps analysts identify outliers in the information, such as spikes in a building's power consumption caused by extreme weather events.
Once outliers are identified, they can be taken into account by the predictive modelers rather than just treating them as valid data points in training algorithms, a misstep that could reduce analytics accuracy over time. "Visualizing model predictions versus actual data helps nail down where events happen," Savkovic said.
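As a rough illustration of comparing predictions with actual readings, the Python sketch below flags outliers using a modified z-score on the residuals (actual minus predicted), then sets those points aside before retraining. This is not BuildingIQ's implementation: the function name `flag_outliers`, the 3.5 threshold, and the energy figures are assumptions chosen for the example.

```python
from statistics import median


def flag_outliers(predicted, actual, threshold=3.5):
    """Return indices where actual values deviate unusually from predictions.

    Hypothetical helper using a modified z-score based on the median
    absolute deviation (MAD) of the residuals.
    """
    residuals = [a - p for p, a in zip(predicted, actual)]
    center = median(residuals)
    mad = median(abs(r - center) for r in residuals)
    if mad == 0:
        # No spread in the residuals, so nothing can be ranked as unusual
        return []
    # 0.6745 scales the MAD so scores are comparable to standard deviations
    return [
        i for i, r in enumerate(residuals)
        if 0.6745 * abs(r - center) / mad > threshold
    ]


# Made-up daily energy use in kWh; the last day stands in for an extreme weather event
predicted = [100, 102, 98, 101, 99, 100]
actual = [101, 101, 99, 100, 100, 160]

outliers = flag_outliers(predicted, actual)
print(outliers)

# Keep only the ordinary days for the next round of model training
training_rows = [
    (p, a) for i, (p, a) in enumerate(zip(predicted, actual))
    if i not in outliers
]
```

The same flagged indices could be highlighted on a chart of predicted versus actual values, so analysts can see where an event happened before deciding how to treat it.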