Past Event

Statistical Machine Learning Bootcamp

January 14, 2020 - January 16, 2020
9:00 AM - 5:00 PM
451 Computer Science Building, Department of Computer Science, 500 W. 120th St., New York, New York 10027
The goal of the Columbia Year of Statistical Machine Learning Bootcamp Lectures is to introduce students to the computational, mathematical, and statistical foundations of data science. The focus will be on theoretical subjects of interest in modern statistical machine learning, suitable for new Ph.D. students in computer science, statistics, applied math, and related fields. The lectures are open (free) to all, but we kindly request that you complete the registration form below so that we have an accurate headcount.

Registration: https://forms.gle/dHB5Hbq4GB43eJuJ8

Schedule:

Lectures are in the CS Auditorium (451 Computer Science Building). The 10:15am-11:00am coffee breaks will be in the CS Lounge (also in the Computer Science Building).

Tuesday, January 14
9:15-10:15: Concentration of measure
10:15-11:00: Coffee break (CS Lounge)
11:00-12:00: Concentration of measure
12:00-2:00: Lunch break (on your own)
2:00-3:00: Concentration of measure
3:00-3:30: Break
3:30-4:30: Algorithmic applications of high-dimensional geometry

Wednesday, January 15
9:15-10:15: Algorithmic applications of high-dimensional geometry
10:15-11:00: Coffee break (CS Lounge)
11:00-12:00: Algorithmic applications of high-dimensional geometry
12:00-2:00: Lunch break (on your own)
2:00-3:00: Optimal transport
3:00-3:30: Break
3:30-4:30: Optimal transport

Thursday, January 16
9:15-10:15: Stochastic gradient methods
10:15-11:00: Coffee break (CS Lounge)
11:00-12:00: Stochastic gradient methods
12:00-2:00: Lunch break (on your own)
2:00-3:00: Stochastic gradient methods
3:00-3:30: Break
3:30-4:30: Optimal transport

Lecturers and Topics:

Jarek Błasiok: Concentration of measure. (1) Equivalence between moment bounds, MGF bounds, and tail bounds; Khintchine inequality; Bernstein inequality; Johnson-Lindenstrauss for Gaussian matrices. (2) Subspace embeddings: the net argument and the volumetric argument for net constructions. (3) Concentration inequalities for low-influence functions. (A short illustrative sketch of the Johnson-Lindenstrauss projection appears after this list.)

Alex Andoni: Algorithmic applications of high-dimensional geometry. Many modern algorithms, especially for massive datasets, benefit from geometric techniques and tools even though the original problem might have nothing to do with geometry. This lecture series will cover a number of examples where (high-dimensional) geometric techniques lead to algorithms with significantly improved parameters, such as running time, space, and communication. For example, starting with the classic dimension reduction method, researchers have developed powerful tools for storing, transmitting, and accessing pieces of data far more efficiently than working with the full data. These tools can be seen as a form of functional compression, where we store just enough information about the data to be useful for particular tasks. We will see applications of these tools to problems such as similarity search/nearest neighbor search and numerical linear algebra.

Espen Bernton and Bodhi Sen: Optimal transport. Details to come.

Arian Maleki: Stochastic gradient methods. Details to come.
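
As a small illustration of the Johnson-Lindenstrauss idea mentioned in the topics above, the following Python sketch projects a set of high-dimensional points with a scaled Gaussian matrix and checks that pairwise distances are approximately preserved. The dimensions, number of points, target dimension, and random seed below are arbitrary choices for illustration, not parameters taken from the lectures.

    # Illustrative sketch of Johnson-Lindenstrauss dimension reduction
    # with a Gaussian projection matrix. All sizes here are arbitrary.
    import numpy as np

    rng = np.random.default_rng(0)

    n, d, k = 30, 5000, 300            # points, original dimension, target dimension
    X = rng.normal(size=(n, d))        # data points in R^d

    # Gaussian projection, scaled so squared norms are preserved in expectation.
    G = rng.normal(size=(k, d)) / np.sqrt(k)
    Y = X @ G.T                        # projected points in R^k

    def pairwise_dists(A):
        # All pairwise Euclidean distances between rows of A.
        diffs = A[:, None, :] - A[None, :, :]
        return np.linalg.norm(diffs, axis=-1)

    D_orig = pairwise_dists(X)
    D_proj = pairwise_dists(Y)
    mask = ~np.eye(n, dtype=bool)      # ignore zero self-distances
    ratios = D_proj[mask] / D_orig[mask]
    print("distance ratios after projection: min = %.3f, max = %.3f"
          % (ratios.min(), ratios.max()))

Running the sketch prints distance ratios close to 1, reflecting the kind of low-distortion embedding that underlies the similarity search and numerical linear algebra applications described above.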

Contact Information

Daniel Hsu