Location: Hilton Salon D
Date: Sunday November 15, 2015
Time: 2:00pm - 5:30pm
Baidu Research Silicon Valley Artificial Intelligence Laboratory
During the past few years, deep learning has made incredible progress towards solving many previously difficult Artificial Intelligence (AI) tasks. Although the techniques behind deep learning have been studied for decades, they rely on large datasets and large computational resources, and so have only recently become practical for many problems. Training deep neural networks is very computationally intensive: training one model takes tens of exaflops of work, and so HPC techniques are key to creating these models. As in other fields, progress in AI is iterative, building on previous ideas. This means that the turnaround time in training models is a key bottleneck to progress in AI: the quicker an idea can be realized as a trainable model, trained on a large dataset, and tested, the quicker ways of improving the models can be found. In this talk, Catanzaro will discuss the key insights that make deep learning work for many problems, describe the training problem, and detail the use of standard HPC techniques that allow him to rapidly iterate on his models. He will explain how HPC ideas are becoming increasingly central to progress in AI and will also show several examples of how deep learning is helping solve difficult AI problems.
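To put the "tens of exaflops" figure in perspective, a rough back-of-the-envelope calculation (the throughput numbers below are illustrative assumptions, not figures from the talk) shows why turnaround time becomes the bottleneck without HPC-scale hardware:

    # Illustrative arithmetic only: how long does "tens of exaflops" of work take?
    total_work = 20e18                 # assume ~20 exaflops of floating-point work per model
    single_gpu = 5e12                  # assume ~5 Tflop/s sustained on one GPU
    cluster    = 32 * single_gpu       # 32 GPUs with idealized perfect scaling

    print(total_work / single_gpu / 86400)   # ~46 days on a single GPU
    print(total_work / cluster / 86400)      # ~1.4 days on the small cluster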
Janis Keuper and Franz-Josef Pfreundt
The implementation of a vast majority of machine learning (ML) algorithms boils down to solving a numerical optimization problem. In this context, Stochastic Gradient Descent (SGD) methods have long proven to provide good results, both in terms of convergence and accuracy. Recently, several parallelization approaches have been proposed in order to scale SGD to solve very large ML problems. At their core, most of these approaches follow a MapReduce scheme. This paper presents a novel parallel updating algorithm for SGD, which utilizes the asynchronous single-sided communication paradigm. Compared to existing methods, Asynchronous Parallel Stochastic Gradient Descent (ASGD) provides faster convergence, with linear scalability and stable accuracy.
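The flavor of the update scheme can be conveyed with a small shared-memory stand-in (a minimal sketch only: threads and a shared NumPy array take the place of the single-sided inter-node communication described in the paper, and grad_fn, data_shards, and the hyper-parameters are illustrative assumptions):

    import numpy as np
    from threading import Thread

    def asgd(grad_fn, data_shards, w0, lr=0.01, epochs=5):
        """Asynchronous SGD sketch: workers update a shared parameter vector
        without locks, so each gradient may be computed on slightly stale weights."""
        w = w0.copy()                      # shared parameter vector

        def worker(shard):
            nonlocal w                     # all workers update the same array in place
            for _ in range(epochs):
                for x, y in shard:
                    g = grad_fn(w, x, y)   # gradient on a possibly stale view of w
                    w -= lr * g            # lock-free, asynchronous in-place update

        threads = [Thread(target=worker, args=(s,)) for s in data_shards]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return w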
Markus Götz, Christian Bodenstein and Morris Riedel
Clustering algorithms in the field of data mining are used to aggregate similar objects into common groups. One of the best known of these algorithms is DBSCAN. Its distinct design enables the search for an a priori unknown number of arbitrarily shaped clusters while at the same time filtering out noise. Due to its sequential formulation, the parallelization of DBSCAN is challenging. In this paper we present a new parallel approach, which we call HPDBSCAN. It employs three major techniques in order to break the sequentiality, balance the workload, and speed up neighborhood searches in distributed parallel processing environments: i) a computation split heuristic for domain decomposition, ii) a data index preprocessing step, and iii) a rule-based cluster merging scheme. As a proof of concept we implemented HPDBSCAN as an OpenMP/MPI hybrid application. Using real-world data sets, such as a point cloud from the old town of Bremen, Germany, we demonstrate that our implementation is able to achieve a significant speed-up and scale-up in common HPC setups. Moreover, we compare our approach with previous attempts to parallelize DBSCAN, showing an order of magnitude improvement in terms of computation time and memory consumption.
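The rule-based merging scheme iii) can be pictured as a union-find pass over pairs of local cluster labels that neighboring processors both assign to points in their shared halo regions (a hypothetical sketch: the helper names and the keep-the-smaller-label rule are assumptions for illustration, not code from HPDBSCAN):

    def merge_local_clusters(halo_pairs):
        """halo_pairs: iterable of (label_a, label_b) pairs, where two neighboring
        blocks assigned the same halo point to clusters label_a and label_b."""
        parent = {}

        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]      # path halving
                x = parent[x]
            return x

        def union(a, b):
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)  # rule: keep the smaller label

        for a, b in halo_pairs:
            union(a, b)
        return {label: find(label) for label in parent}   # local label -> merged label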
Brian Van Essen, Hyojin Kim, Roger Pearce, Kofi Boakye and Barry Chen
Recent successes of deep learning have been largely driven by the ability to train large models on vast amounts of data. We believe that High Performance Computing (HPC) will play an increasingly important role in helping deep learning achieve the next level of innovation, fueled by neural network models that are orders of magnitude larger and trained on commensurately more training data. We are targeting the unique capabilities of both current and upcoming HPC systems to train massive neural networks and are developing the Livermore Big Artificial Neural Network (LBANN) toolkit to exploit both model and data parallelism optimized for large-scale HPC resources. This paper presents our preliminary results in scaling the size of the models that can be trained with the LBANN toolkit.
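As a rough illustration of the data-parallel half of that design, one training step can be sketched with mpi4py: every rank computes a gradient on its own shard of the batch and the gradients are averaged with an allreduce (a sketch under stated assumptions; LBANN itself is not a Python toolkit, and grad_fn and get_local_batch are hypothetical helpers):

    import numpy as np
    from mpi4py import MPI

    def data_parallel_step(w, grad_fn, get_local_batch, lr=0.01):
        comm = MPI.COMM_WORLD
        x, y = get_local_batch(comm.Get_rank())              # each rank sees a different shard
        local_grad = grad_fn(w, x, y)
        global_grad = np.empty_like(local_grad)
        comm.Allreduce(local_grad, global_grad, op=MPI.SUM)  # sum gradients across ranks
        global_grad /= comm.Get_size()                       # average
        return w - lr * global_grad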
Steven Young, Derek Rose, Thomas Karnowski, Seung-Hwan Lim and Robert Patton
There has been a recent surge of success in utilizing Deep Learning (DL) in imaging and speech applications, owing to its relatively automatic feature generation and, particularly for convolutional neural networks (CNNs), its high-accuracy classification abilities. While these models learn their parameters through data-driven methods, model selection (as architecture construction) through hyper-parameter choices remains a tedious and highly intuition-driven task. To address this, Multi-node Evolutionary Neural Networks for Deep Learning (MENNDL) is proposed as a method for automating network selection on computational clusters through hyper-parameter optimization performed via genetic algorithms.
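The hyper-parameter search itself can be sketched as a small genetic algorithm (a toy sketch: the search space, the operators, and the train_and_evaluate helper, which would train a network with the given hyper-parameters and return validation accuracy, are illustrative assumptions; in a cluster setting each fitness evaluation would run on its own node):

    import random

    SEARCH_SPACE = {
        "learning_rate": [1e-1, 1e-2, 1e-3, 1e-4],
        "num_filters":   [16, 32, 64, 128],
        "kernel_size":   [3, 5, 7],
    }

    def random_individual():
        return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

    def mutate(ind, rate=0.2):
        return {k: (random.choice(SEARCH_SPACE[k]) if random.random() < rate else v)
                for k, v in ind.items()}

    def crossover(a, b):
        return {k: random.choice([a[k], b[k]]) for k in a}

    def evolve(train_and_evaluate, pop_size=8, generations=5):
        population = [random_individual() for _ in range(pop_size)]
        for _ in range(generations):
            scored = sorted(population, key=train_and_evaluate, reverse=True)
            parents = scored[: pop_size // 2]              # truncation selection
            children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                        for _ in range(pop_size - len(parents))]
            population = parents + children
        return max(population, key=train_and_evaluate)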
Catherine Schuman, Adam Disney and John Reynolds
Dynamic Adaptive Neural Network Array (DANNA) is a neuromorphic hardware implementation. It differs from most other neuromorphic projects in that it allows for programmability of structure, and it is trained or designed using evolutionary optimization. This paper describes the DANNA structure, how DANNA is trained using evolutionary optimization, and an application of DANNA to a very simple classification task.