Paul Bernays Lectures 2022
The 2022 Paul Bernays Lectures on the topic "The quest for mathematical understanding of artificial intelligence" were delivered by Professor Sanjeev Arora.
Professor Sanjeev Arora
Sanjeev Arora is the Charles C. Fitzmorris Professor of Computer Science at Princeton University. He has received a Packard Fellowship (1997), a Simons Investigator Award (2012), the Gödel Prize (2001 and 2010), the ACM Prize in Computing (2012), and the Fulkerson Prize (2012). He is a member of the National Academy of Sciences and a Fellow of the ACM and the AAAS.
Video recordings
The recordings of all lectures are available on the ETH video portal.
The quest for mathematical understanding of artificial intelligence
Lecture 1: Tour d’Horizon of Artificial Intelligence and Machine Learning today
- Date: Wednesday, 31 August 2022
- Time: 5.00 p.m.
- Auditorium: HG E 7
Abstract
Can machines acquire capabilities that remind us of (or even exceed) general-purpose intelligent reasoning in humans? This question has animated research in computers since their invention in the first half of the 20th century. The past decade has seen dramatic progress in this direction, thanks to very large deep neural network models, as well as new generations of net architectures, algorithms, and training data sets. Machines have achieved super-human performance in a range of tasks. The lecture gives a broad and nontechnical overview of these new developments as well as questions (scientific, societal, ethical) raised by them.
Lecture 2: Deep Learning: Attempts toward mathematical understanding
- Date: Thursday, 1 September 2022
- Time: 5.00 p.m.
- Auditorium: HG E 7
Abstract
As described in Lecture 1, Deep Learning underlies many dramatic advances of the past decade. It involves training a massive neural net (aka deep net)—with billions or even trillions of trainable parameters—on very large datasets. Much of this field is empirical, and guided by good intuition. But there is also an emerging set of mathematical ideas for understanding the training process as well as properties of the trained nets. The lecture will give brief introductions to the frameworks of optimization, generalization, student-teacher setting, infinitely wide nets (NTK regime), and unsupervised learning. This lecture assumes some comfort with basic linear algebra, calculus and probability. It will allow nonexperts and students some mathematical insight into the field, the phenomena it is trying to understand, and some insights so far.
Lecture 3: What do we not understand mathematically about deep learning?
- Date: Friday, 30 September 2022
- Time: 5.15 p.m.
- Auditorium: HG E 7
Abstract
Deep learning has up-ended many traditional ways of thinking about machine learning and artificial intelligence. This lecture seeks to convey the challenges it poses for our prior mathematical understanding of machine learning, while building upon the vocabulary and concepts of Lecture 2. The focus is on attempts to cast light on the various mysteries of deep learning. Of special relevance is the startling ability of today’s very large deep nets to be quickly adapted to new tasks, which calls for quantifying the “skills” and “concepts” captured in the net’s parameters, something that remains nebulous from a mathematical viewpoint. We survey a growing number of attempts at mathematically peering into how the net evolves during training. Though focused on results from the past 3-4 years, the lecture should still be accessible to a broad scientific audience with math training at an undergraduate level.