
IBM cracks the code for speeding up its deep learning platform

Using more GPUs when training deep learning models doesn't always deliver faster results, but new software from IBM shows it can be done.

GPUs are a natural fit for deep learning because they can crunch through large amounts of data quickly, which is important when training data-hungry models. But there's a catch.

Adding more graphics processing units (GPUs) to a deep learning platform doesn't necessarily lead to faster results. While individual GPUs process data quickly, they can be slow to communicate their computations to one another, which has limited how far users can parallelize jobs across multiple servers and has capped the scalability of deep learning models.
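The bottleneck can be illustrated with a toy cost model: per-step compute time shrinks as GPUs are added, but a fixed synchronization cost does not, so scaling efficiency collapses. The numbers below are purely illustrative assumptions, not IBM's measurements.

```python
# Toy scaling model: per-step time = compute / num_gpus + fixed sync cost.
# Illustrative numbers only -- not measurements from IBM's system.

def step_time(num_gpus, compute_s=1.0, sync_s=0.2):
    """Time for one training step with naive synchronization."""
    return compute_s / num_gpus + sync_s

def scaling_efficiency(num_gpus, compute_s=1.0, sync_s=0.2):
    """Speedup over one GPU, divided by the ideal (linear) speedup."""
    speedup = step_time(1, compute_s, sync_s) / step_time(num_gpus, compute_s, sync_s)
    return speedup / num_gpus

print(scaling_efficiency(4))   # sync cost already eats into the gain
print(scaling_efficiency(64))  # efficiency collapses as GPUs are added
```

In this model, four GPUs already run at well under ideal efficiency, and at 64 GPUs the fixed communication cost dominates, which is why shrinking that cost, rather than adding hardware, is what unlocks scaling.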

IBM recently took on this problem to improve scalability in deep learning and wrote code for its deep learning platform to improve communication between GPUs.

"The rate at which [GPUs] update each other significantly affects your ability to scale deep learning," said Hillery Hunter, director of systems acceleration and memory at IBM. "We feel like deep learning has been held back because of these long wait times."

Hunter's team wrote new software and algorithms to optimize communication between GPUs spread across multiple servers. The team used the software to train an image-recognition neural network on 7.5 million images from the ImageNet-22k data set in seven hours. IBM said the run set a new speed record for training on that data set, beating the previous mark of 10 days held by Microsoft.
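The general technique being optimized here is synchronous data-parallel training: each GPU computes gradients on its own shard of the data, then an allreduce-style exchange averages those gradients so every replica applies the same update. The sketch below simulates that pattern with a hypothetical one-parameter model; it is a minimal illustration of the concept, not IBM's DDL code.

```python
# Sketch of synchronous data-parallel training. Each "worker" computes
# gradients on its own data shard; allreduce_mean stands in for the
# GPU-to-GPU exchange that IBM's software accelerates.

def local_gradients(worker_data, w):
    # Hypothetical model y = w * x with squared-error loss;
    # returns the mean gradient over this worker's shard.
    return sum(2 * (w * x - y) * x for x, y in worker_data) / len(worker_data)

def allreduce_mean(values):
    """Average a value across workers (the communication step)."""
    return sum(values) / len(values)

def train_step(shards, w, lr=0.1):
    grads = [local_gradients(shard, w) for shard in shards]  # parallel on real hardware
    return w - lr * allreduce_mean(grads)

# Two workers, each holding a shard of data generated by y = 2x.
shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]
w = 0.0
for _ in range(50):
    w = train_step(shards, w)
print(round(w, 3))  # converges toward 2.0
```

Because every step waits on the averaging exchange, the slower that communication is, the longer each of the millions of training steps takes, which is exactly the wait time Hunter's team targeted.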

Hunter said it's essential to speed up training times in deep learning projects. Unlike most other areas of computing today, where results come back in seconds, training deep learning models can take days, which might discourage more casual users.

"We feel it's necessary to bring the wait times down," Hunter said.

IBM is rolling out the new functionality in its PowerAI software, a deep learning platform that pulls together and configures popular open source machine learning software, including Caffe, Torch and TensorFlow. PowerAI is available on IBM's Power Systems line of servers.

But the main reason to take note of the news, according to Forrester analyst Mike Gualtieri, is that the GPU optimization software could bring new functionality to existing tools -- namely Watson.

"I think the main significance of this is that IBM can bring deep learning to Watson," he said.

Watson currently has API connectors that let users do deep learning in specific areas, including translation, speech to text and text to speech. But those deep learning offerings are prescribed. Opening Watson up to open source deep learning frameworks could let its strength in answering natural-language queries be applied to deeper questions.
