Baidu Research Silicon Valley Artificial Intelligence Laboratory
During the past few years, deep learning has made incredible progress towards solving many previously difficult Artificial Intelligence (AI) tasks. Although the techniques behind deep learning have been studied for decades, they rely on large datasets and large computational resources, and so have only recently become practical for many problems. Training deep neural networks is very computationally intensive: training one model takes tens of exaflops of work, and so HPC techniques are key to creating these models. As in other fields, progress in AI is iterative, building on previous ideas. This means that the turnaround time in training models is a key bottleneck to progress in AI—the quicker an idea can be realized as a trainable model, train it on a large dataset, and test it, the quicker that ways can be found of improving the models. In this talk, Catanzaro will discuss the key insights that make deep learning work for many problems, describe the training problem, and detail the use of standard HPC techniques that allow him to rapidly iterate on his models. He will explain how HPC ideas are becoming increasingly central to progress in AI and will also show several examples of how deep learning is helping solve difficult AI problems.