
Developing AI apps free from bias crucial to avoid analytics errors

Biased data samples or model development practices can derail any company interested in using AI tools and diminish the technology's return on investment.

When you talk about artificial intelligence, you have to talk about biases and how they might affect a model.

Biases can affect an enterprise's use of AI in two distinct ways. The first relates to the effectiveness of models. A data scientist may build a model around his or her assumptions about how the world works, but those assumptions can turn out to be invalid. Developing AI applications around such a model will lead to disappointing results.

"There are all kinds of ways that AI can reflect the biases of those who collected the data, so we need to think critically about how data sets are collected," said Madeleine Clare Elish, a researcher at the Data & Society Research Institute, in a presentation at the recent O'Reilly AI Conference in New York.

Elish said that when AI is applied to areas like targeted marketing or customer service, this kind of bias is essentially an inconvenience. Models won't deliver good results, but at the end of the day, no one gets hurt.

The second type of bias, though, can do real harm to people. Elish talked about how AI is increasingly seeping into areas like insurance, credit scoring and criminal justice. Here, biases, whether they result from unrepresentative data samples or from developers' unconscious partialities, can have much more severe effects.

Using AI to tackle one form of bias

Another area where biased AI systems can hurt people is hiring. But in this realm, AI can also be a tool to fight against biases. Lindsey Zuloaga is a data scientist at HireVue Inc., a company in South Jordan, Utah, that's looking to apply AI to reduce the impact of biases when making hiring decisions. In an interview at the conference, she said AI can help evaluate candidates in a more objective way by reducing the unconscious reliance human interviewers might have on things like tone of voice or appearance.

"I think it's important that people are judged on their merits," Zuloaga said. "You want things to be fair. But in the hiring process, things are really unfair."

The HireVue platform works by recording videos of candidates answering job interview questions on their own time. AI algorithms then evaluate the candidates against predefined criteria. Businesses using the platform are later asked how workers hired through it are performing, and that feedback is used to sharpen the recommendations over time.

Theoretically, bias could creep into this process. For example, hiring managers could rate as positive only those hires who fit in at the organization, which could be another way of expressing racial or gender biases.

But Zuloaga said she and other data scientists at the company try to avoid this type of situation by making the deep learning algorithms underpinning the system interpretable. Most neural networks function as a black box -- the reasons for their recommendations are unclear to users. But by engineering explanations into these models when developing AI applications, Zuloaga and her team can go back and make sure the algorithm is recommending candidates based strictly on their expected job performance.
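Zuloaga didn't detail the team's implementation, but one common technique for auditing which inputs a trained model actually relies on is permutation importance: shuffle one feature at a time and measure how much predictive accuracy drops. A minimal sketch using scikit-learn, with hypothetical interview features:

```python
# A sketch of a permutation-importance audit (not HireVue's actual
# system). The feature names are illustrative: the label depends only
# on answer_score and experience, so zip_code should score near zero.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))  # columns: answer_score, experience, zip_code
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier().fit(X_train, y_train)

# Shuffle each feature 20 times and record the mean accuracy drop.
result = permutation_importance(model, X_test, y_test, n_repeats=20,
                                random_state=0)
for name, imp in zip(["answer_score", "experience", "zip_code"],
                     result.importances_mean):
    print(f"{name}: {imp:.3f}")
```

A near-zero importance for a proxy-like field is the desired outcome here; a large value would flag a feature worth investigating before trusting the model's recommendations.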

"We all have these biases, so that serves as a big source of inspiration," Zuloaga said. "I think there is a lot of power in diversity just for having the power of different opinions."

Developing AI for good takes good data

Often, bias in AI models isn't the result of anything an engineer put in the model. The problem might come from the data itself. AI and deep learning models are very good at inferring subtle relationships between variables, and in some cases, that's undesirable. For example, where a person lives can often serve as a proxy for their race.
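To make the proxy problem concrete, here is a small synthetic check (the field names and proportions are made up): even when race is excluded from a training set, a simple chi-squared test can show that a retained field such as ZIP code still carries much of the same information.

```python
# A sketch of a proxy-variable check on synthetic data. ZIP code
# assignment is skewed by race, so a model trained on ZIP code can
# effectively learn race even though race itself was dropped.
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(1)
n = 5000
race = rng.integers(0, 2, size=n)  # hypothetical protected attribute
zip_code = np.where(rng.random(n) < 0.8, race, 1 - race)  # 80/20 skew

# Build the 2x2 contingency table of race vs. ZIP code.
table = np.zeros((2, 2))
for r, z in zip(race, zip_code):
    table[r, z] += 1

chi2, p, _, _ = chi2_contingency(table)
print(f"chi2={chi2:.1f}, p={p:.2e}")  # tiny p-value: ZIP code predicts race
```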

This is why making sure that model-training data is representative of the population being modeled and includes only necessary data fields is so important, said Lashon Booker, a senior principal scientist at The MITRE Corp.'s IT center in McLean, Va.

That may sound obvious, but it's actually a huge challenge, Booker said in a presentation at the conference. Since big data came into vogue a few years ago, enterprises have amassed huge troves of data. When it comes to training deep learning models, this is generally a good thing. But removing potential sources of bias from large data sets is difficult when you don't know in advance which features the deep learning algorithm will build its model around, he noted.

Ensuring that data is collected in a way that represents the population being modeled and removing any known sources of bias right off the bat can help. "The data you have available for training might make this more challenging than you'd expect," Booker said. "Big data might not be your friend."
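Booker didn't prescribe a specific test, but one simple first step is to compare a training sample's demographic mix against known population proportions before any modeling starts. A minimal sketch, with made-up numbers:

```python
# A sketch of a representativeness check: a chi-squared goodness-of-fit
# test comparing the training sample's group counts against the
# population shares the model is supposed to cover.
import numpy as np
from scipy.stats import chisquare

groups = ["group_a", "group_b", "group_c"]
population_share = np.array([0.60, 0.25, 0.15])  # known population mix
sample_counts = np.array([720, 180, 100])        # what the training set holds

expected = population_share * sample_counts.sum()
stat, p = chisquare(sample_counts, f_exp=expected)

for g, obs, exp in zip(groups, sample_counts, expected):
    print(f"{g}: observed {obs}, expected {exp:.0f}")
print(f"chi2={stat:.1f}, p={p:.3g}")
if p < 0.05:
    print("Sample mix differs from the population; consider resampling or reweighting.")
```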

Next Steps

What is the difference between deep learning and machine learning?

Lines of business may need to be sold on value of AI

It takes a village to deploy artificial intelligence in the enterprise


Join the conversation

How does your company prevent biases from infecting AI models?
It is nearly impossible to remove bias from AI. Bias is inherent in every aspect of data acquisition. For example, information that changes faster than the sampling rate is susceptible to Nyquist aliasing. Even event-based sampling encodes complex supporting data about the actions and decisions driving those events, which is again susceptible to Nyquist aliasing and results in skewed data.
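A quick numerical sketch of that aliasing effect, with frequencies chosen purely for illustration: a 9 Hz signal sampled at 10 Hz (below its 18 Hz Nyquist rate) is indistinguishable from a 1 Hz signal.

```python
# Nyquist aliasing in miniature: undersampling a 9 Hz sine wave at
# 10 Hz yields exactly the samples of an inverted 1 Hz sine wave.
import numpy as np

true_freq = 9.0      # Hz, the real phenomenon
sample_rate = 10.0   # Hz, too slow: Nyquist requires more than 18 Hz
t = np.arange(0, 2, 1 / sample_rate)

samples = np.sin(2 * np.pi * true_freq * t)
alias = -np.sin(2 * np.pi * 1.0 * t)  # the aliased 1 Hz signal

print(np.allclose(samples, alias))  # True: the two are indistinguishable
```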

Information loss through one-way statistical methods also contributes to algorithmic bias, with some features promoted while others are inhibited.

Then there are sampling noise and sampling errors, which add their own unique types of bias.

Deep learning can compound bias, error, loss and noise at each stage of the processing pipeline.

To truly understand bias, it's best to return to basics with the definition of intelligence: "the ability to discern."

