Past Event

An Integrated Approach for Efficient Neural Network Design, Training, and Inference

April 6, 2020
10:20 AM - 11:20 AM (Eastern Time)
750 Schapiro CEPSR, 530 W. 120th St., New York, NY 10027
Computer Science Faculty Recruiting Colloquium

Amir Gholami (UC Berkeley)

ABSTRACT: One of the main challenges in designing, training, and implementing neural networks is their high demand for computational and memory resources. Designing a model for a new task requires searching an exponentially large space of candidate architectures, each of which must be trained on a large dataset; since training a single candidate often takes millions of iterations, the overall cost is prohibitive. Even after an architecture with good accuracy is found, implementing it on a target hardware platform under latency and power constraints is not straightforward. I will present a framework that efficiently uses reduced-precision computing to address these challenges by considering the full stack of designing, training, and implementing the model on a target hardware platform. This is achieved through careful analysis of the numerical instabilities associated with reduced-precision matrix operations, a novel second-order, mixed-precision quantization approach, and a framework for hardware-aware neural network design.

BIO: Amir Gholami is a postdoctoral research fellow in the Berkeley AI Research (BAIR) Lab at UC Berkeley. His thesis on large-scale 3D image segmentation won UT Austin's best doctoral dissertation award in 2018. He is a Melosh Medal finalist, received the Best Student Paper Award at Supercomputing 2017 and the Gold Medal in the 2015 ACM Student Research Competition, and was a Best Student Paper finalist at Supercomputing 2014. His current research includes systems for machine learning, low-precision training and inference, and large-scale distributed-memory training of neural networks.
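For readers curious about the second-order, mixed-precision quantization idea mentioned in the abstract, below is a minimal, illustrative sketch, not the speaker's actual method: it estimates a per-layer Hessian-trace sensitivity with Hutchinson's randomized estimator in PyTorch and assigns lower bit widths to less sensitive (flatter) layers. The toy model, the candidate bit widths, and the greedy assignment rule are all assumptions made for illustration.

# Sketch: Hessian-aware mixed-precision bit assignment (illustrative only).
# Idea: layers whose loss surface is flatter (smaller |tr(H)|) tolerate
# lower bit widths. tr(H) is estimated with Hutchinson's method, which
# needs only Hessian-vector products, not the full Hessian.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy model and data standing in for a real network and dataset (assumption).
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
x, y = torch.randn(64, 16), torch.randint(0, 4, (64,))
loss = nn.CrossEntropyLoss()(model(x), y)

params = [p for p in model.parameters() if p.requires_grad]
# create_graph=True keeps the graph so we can differentiate the gradients.
grads = torch.autograd.grad(loss, params, create_graph=True)

def hutchinson_trace(grad, param, n_samples=16):
    """Estimate tr(H) for one parameter block via E[v^T H v], v ~ Rademacher."""
    est = 0.0
    for _ in range(n_samples):
        v = torch.randint_like(param, 2) * 2.0 - 1.0  # random +/-1 vector
        # Hessian-vector product H v via a second backward pass.
        hv = torch.autograd.grad(grad, param, grad_outputs=v,
                                 retain_graph=True)[0]
        est += (v * hv).sum().item()
    return est / n_samples

# Average |trace| per parameter as a sensitivity score for each block.
sensitivity = {}
for (name, param), grad in zip(model.named_parameters(), grads):
    sensitivity[name] = abs(hutchinson_trace(grad, param)) / param.numel()

# Greedy assignment (assumption): most sensitive blocks keep the most bits.
bit_choices = [8, 4, 2]  # candidate precisions, highest first
ranked = sorted(sensitivity, key=sensitivity.get, reverse=True)
bits = {name: bit_choices[min(i, len(bit_choices) - 1)]
        for i, name in enumerate(ranked)}
print(bits)  # e.g. {'0.weight': 8, '2.weight': 4, '0.bias': 2, '2.bias': 2}

The appeal of Hutchinson's estimator here is that it never forms the full Hessian: each sample costs only a Hessian-vector product, which autograd supplies at roughly the price of an extra backward pass.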

Contact Information

Vishal Misra