DACO Seminar

Please subscribe here if you would like to be notified about these presentations via e-mail. Moreover, you can subscribe to the iCal/ics calendar.

Spring Semester 2020

Title: Matrix Concentration for Products
Speaker, Affiliation: Prof. Dr. Jonathan Niles-Weed, NYU
Date, Time: 25 March 2020, 16:00-17:00
Location: Zoom Meeting (recording available)
Abstract: We develop nonasymptotic concentration bounds for products of independent random matrices. Such products arise in the study of stochastic algorithms, linear dynamical systems, and random walks on groups. Our bounds exactly match those available for scalar random variables and continue the program, initiated by Ahlswede-Winter and Tropp, of extending familiar concentration bounds to the noncommutative setting. Our proof technique relies on geometric properties of the Schatten trace class. Joint work with D. Huang, J. A. Tropp, and R. Ward.
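
A quick numerical illustration (my own, not part of the talk), assuming only Python with NumPy: form products of independent random perturbations of the identity and look at how the spectral norm fluctuates across trials.

```python
# Toy illustration (not from the talk): empirical concentration of the spectral
# norm of a product of independent random matrices.
import numpy as np

rng = np.random.default_rng(0)
d, n, trials = 20, 50, 200          # dimension, number of factors, repetitions

norms = []
for _ in range(trials):
    # product of n independent perturbations of the identity, a common model
    # for stochastic algorithms and linear dynamical systems (scale is arbitrary)
    P = np.eye(d)
    for _ in range(n):
        A = np.eye(d) + rng.normal(scale=1.0 / np.sqrt(n * d), size=(d, d))
        P = A @ P
    norms.append(np.linalg.norm(P, 2))   # spectral norm of the product

print(f"mean spectral norm: {np.mean(norms):.3f}, std: {np.std(norms):.3f}")
```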

Title: A Mathematical Perspective of Machine Learning
Speaker, Affiliation: Prof. Dr. Weinan E, Princeton
Date, Time: 1 April 2020, 16:00-17:00
Location: Zoom Meeting
Abstract: The heart of modern machine learning is the approximation of high dimensional functions. Traditional approaches, such as approximation by piecewise polynomials, wavelets, or other linear combinations of fixed basis functions, suffer from the curse of dimensionality. We will discuss representations and approximations that overcome this difficulty, as well as gradient flows that can be used to find the optimal approximation. We will see that at the continuous level, machine learning can be formulated as a series of reasonably nice variational and PDE-like problems. Modern machine learning models/algorithms, such as the random feature and shallow/deep neural network models, can be viewed as special discretizations of such continuous problems. At the theoretical level, we will present a framework that is suited for analyzing machine learning models and algorithms in high dimension, and present results that are free of the curse of dimensionality. Finally, we will discuss the fundamental reasons that are responsible for the success of modern machine learning, as well as the subtleties and mysteries that still remain to be understood.
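
As a toy sketch (my own, with a synthetic target; assumes NumPy), one of the model classes mentioned above, a shallow two-layer network trained by gradient descent, can be written in a few lines and viewed as a discretization of the continuous gradient-flow formulation.

```python
# Toy sketch: a shallow (two-layer) network fitted to a synthetic
# high-dimensional target by full-batch gradient descent on the squared loss.
import numpy as np

rng = np.random.default_rng(1)
d, n, m = 10, 500, 64                     # input dim, samples, hidden units
X = rng.normal(size=(n, d))
y = np.sin(X @ np.ones(d) / np.sqrt(d))   # synthetic high-dimensional target

W = rng.normal(size=(d, m)) / np.sqrt(d)  # first-layer weights
a = np.zeros(m)                           # output weights
lr = 0.05

for _ in range(2000):
    H = np.tanh(X @ W)                    # hidden activations
    r = H @ a - y                         # residuals
    a -= lr * H.T @ r / n                 # gradient step on output weights
    W -= lr * X.T @ ((r[:, None] * a) * (1 - H ** 2)) / n   # and on first layer

print("training MSE:", np.mean((np.tanh(X @ W) @ a - y) ** 2))
```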

Title: Statistical and Computational aspects of Wasserstein Barycenters
Speaker, Affiliation: Prof. Dr. Philippe Rigollet, Massachusetts Institute of Technology (MIT), Cambridge, USA
Date, Time: 8 April 2020, 16:00-17:00
Location: Zoom Meeting
Abstract: The notion of average is central to most statistical methods. In this talk we study a generalization of this notion over the non-Euclidean space of probability measures equipped with a certain Wasserstein distance. This generalization is often called Wasserstein barycenters, and empirical evidence suggests that these barycenters capture interesting notions of averages in graphics, data assimilation and morphometrics. However, the statistical (rates of convergence) and computational (efficient algorithms) aspects of these Wasserstein barycenters are largely unexplored. The goal of this talk is to review two recent results: 1. fast rates of convergence for empirical barycenters in general geodesic spaces, and 2. provable guarantees for gradient descent and stochastic gradient descent to compute Wasserstein barycenters. Both results leverage geometric aspects of optimal transport. Based on joint works (arXiv:1908.00828, arXiv:2001.01700) with Chewi, Le Gouic, Maunu, Paris, and Stromme.
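
A minimal sketch (my own, not from the talk) of the simplest case: for one-dimensional measures, the 2-Wasserstein barycenter has a closed form obtained by averaging the quantile functions of the inputs.

```python
# Toy sketch: 2-Wasserstein barycenter of one-dimensional empirical measures,
# computed in closed form by averaging quantile functions.
import numpy as np

rng = np.random.default_rng(2)
samples = [rng.normal(loc=mu, scale=s, size=1000)
           for mu, s in [(-2.0, 0.5), (0.0, 1.0), (3.0, 2.0)]]

grid = np.linspace(0.0, 1.0, 501)[1:-1]           # quantile levels in (0, 1)
quantiles = np.array([np.quantile(x, grid) for x in samples])
barycenter_quantiles = quantiles.mean(axis=0)     # quantile function of the barycenter

print("barycenter mean ≈", barycenter_quantiles.mean())
print("input means      ", [float(np.mean(x)) for x in samples])
```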

Title: Overlap Gap Property: a Provable Barrier to Fast Optimization in Probabilistic Combinatorial Structures
Speaker, Affiliation: Prof. Dr. David Gamarnik, MIT
Date, Time: 22 April 2020, 16:00-17:00
Location: Zoom Meeting
Abstract: Many combinatorial optimization problems defined on random instances exhibit an apparent gap between the optimal values, which can be computed by non-constructive means, and the best values achievable by fast (polynomial time) algorithms. Through a combined effort of mathematicians, computer scientists and statistical physicists, it became apparent that a potential barrier for designing fast algorithms bridging this gap is an intricate topology of nearly optimal solutions, in particular the presence of the Overlap Gap Property (OGP), which we will introduce in this talk. We will discuss how, for many such problems, the onset of the OGP phase transition indeed introduces a provable barrier to a broad class of polynomial time algorithms. Examples of such problems include finding a largest independent set of a random graph, finding a largest cut in a random hypergraph, finding a ground state of a p-spin model, and many problems in high-dimensional statistics. In this talk we will demonstrate in particular why the OGP is a barrier for three classes of algorithms designed to find a near ground state in p-spin models arising in the field of spin glass theory: Approximate Message Passing algorithms, algorithms based on low-degree polynomials, and Langevin dynamics. Joint work with Aukosh Jagannath and Alex Wein.
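
For illustration only (my own toy example, not one of the algorithms analysed in the talk): the kind of optimization problem in question, here finding a low-energy state of a 2-spin, Sherrington-Kirkpatrick-type Hamiltonian with greedy single-spin flips.

```python
# Toy illustration of the problem class: greedy local search for a near ground
# state of a 2-spin Hamiltonian with Gaussian couplings.
import numpy as np

rng = np.random.default_rng(3)
N = 100
J = rng.normal(size=(N, N)) / np.sqrt(N)
J = (J + J.T) / 2                          # symmetric couplings

def energy(s):                             # H(s) = -s^T J s / N (per spin)
    return -s @ J @ s / N

s = rng.choice([-1.0, 1.0], size=N)        # random initial configuration
print(f"energy of the random start        : {energy(s):.3f}")

improved = True
while improved:                            # flip any spin that lowers the energy
    improved = False
    for i in range(N):
        e_old = energy(s)
        s[i] = -s[i]
        if energy(s) < e_old:
            improved = True                # keep the flip
        else:
            s[i] = -s[i]                   # undo the flip

print(f"energy after greedy local search  : {energy(s):.3f}")
```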

Title: Total variation regularization
Speaker, Affiliation: Prof. em. Dr. Sara van de Geer, ETH Zurich, Switzerland
Date, Time: 29 April 2020, 16:00-17:00
Location: Zoom Meeting
Abstract: Let Y be an n-dimensional vector of independent observations with unknown mean f^0 := E Y. We consider the estimator f_D that solves the "analysis problem" min_f { ||Y - f||_2^2 / n + 2 lambda ||Df||_1 }, where D is a given m x n matrix and lambda > 0 is a tuning parameter. An example of the matrix D is the (first-order) difference operator, (Df)_i = f_i - f_{i-1} for i in [2:n], in which case ||Df||_1 = TV(f) is the (first-order) total variation of the vector f. Other examples include higher-order discrete derivatives, total variation on graphs and total variation in higher dimensions. Our aim is to show that the estimator f_D is adaptive. For example, when f^0 is a piecewise linear function, we show that the analysis estimator f_D, with D the second-order difference operator, adapts to the number of kinks of f^0. As is the case with the Lasso, the theory for the analysis estimator f_D requires a form of "restricted eigenvalue" condition. We will show that this can be established using interpolating vectors. We will illustrate this (with drawings) for the various examples.
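
A minimal sketch of the analysis estimator from the abstract, assuming the cvxpy package is available; here D is the first-order difference operator, so ||Df||_1 is the total variation of f.

```python
# Minimal sketch of the "analysis problem" with D the first-order difference
# operator (total variation regularization of a noisy piecewise-constant signal).
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(4)
n = 200
f0 = np.concatenate([np.zeros(n // 2), np.ones(n - n // 2)])   # piecewise-constant mean
y = f0 + 0.3 * rng.normal(size=n)

D = np.diff(np.eye(n), axis=0)            # (n-1) x n first-order difference matrix
lam = 0.05                                # tuning parameter lambda

f = cp.Variable(n)
objective = cp.Minimize(cp.sum_squares(y - f) / n + 2 * lam * cp.norm1(D @ f))
cp.Problem(objective).solve()

print("number of estimated jumps:", int(np.sum(np.abs(D @ f.value) > 1e-4)))
```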

Title: Reliable predictions? Equitable treatment? Some recent progress in predictive inference
Speaker, Affiliation: Prof. Dr. Emmanuel Candès, Stanford University, Stanford, USA
Date, Time: 13 May 2020, 20:00-21:00
Location: Zoom meeting
Abstract: Recent progress in machine learning (ML) provides us with many potentially effective tools to learn from datasets of ever increasing sizes and make useful predictions. How do we know that these tools can be trusted in critical and high-sensitivity systems? If a learning algorithm predicts the GPA of a prospective college applicant, what guarantees do I have concerning the accuracy of this prediction? How do we know that it is not biased against certain groups of applicants? This talk introduces statistical ideas to ensure that the learned models satisfy some crucial properties, especially reliability and fairness (in the sense that the models need to apply to individuals in an equitable manner). To achieve these important objectives, we shall not ‘open up the black box’ and try to understand its underpinnings. Rather, we discuss broad methodologies — conformal inference, quantile regression, the Jackknife+ — that can be wrapped around any black box so as to produce results that can be trusted and are equitable.
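
A minimal sketch (my own, not the speaker's code) of split conformal prediction, one of the wrap-around methodologies mentioned: fit any black box on half of the data and use the other half's residuals to calibrate a prediction interval with marginal coverage 1 - alpha.

```python
# Split conformal prediction around a simple black-box regressor (least squares).
import numpy as np

rng = np.random.default_rng(5)
n, d, alpha = 1000, 5, 0.1
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + rng.normal(size=n)

train, calib = np.arange(n // 2), np.arange(n // 2, n)
beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)   # any black-box fit

scores = np.abs(y[calib] - X[calib] @ beta)                  # calibration residuals
k = int(np.ceil((len(calib) + 1) * (1 - alpha)))             # conformal quantile index
q = np.sort(scores)[k - 1]

x_new = rng.normal(size=d)                                   # interval for a new point
print("prediction interval:", (x_new @ beta - q, x_new @ beta + q))
```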

Title: On the effectiveness of Richardson Extrapolation in Machine Learning
Speaker, Affiliation: Prof. Dr. Francis Bach, INRIA/ENS, Paris, France
Date, Time: 20 May 2020, 16:00-17:00
Location: Zoom Meeting
Abstract: Richardson extrapolation is a classical technique from numerical analysis that can improve the approximation error of an estimation method by linearly combining several estimates obtained from different values of one of its hyperparameters, without the need to know in detail the inner structure of the original estimation method. The main goal of this presentation is to study when Richardson extrapolation can be used within machine learning. We identify two situations where Richardson extrapolation can be useful: (1) when the hyperparameter is the number of iterations of an existing iterative optimization algorithm, with applications to averaged gradient descent and Frank-Wolfe algorithms, and (2) when it is a regularization parameter, with applications to Nesterov smoothing techniques for minimizing non-smooth functions and to ridge regression. In all these cases, I will show that extrapolation techniques come with no significant loss in performance, and sometimes with strong gains.
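
A minimal, classical illustration of Richardson extrapolation itself (not the machine-learning applications of the talk): combine two finite-difference derivative estimates at step sizes h and h/2 so that the leading error term cancels.

```python
# Richardson extrapolation of a central finite-difference derivative estimate.
import numpy as np

f, df = np.sin, np.cos          # test function and its exact derivative
x, h = 1.0, 0.1

D = lambda h: (f(x + h) - f(x - h)) / (2 * h)      # central difference, O(h^2) error
richardson = (4 * D(h / 2) - D(h)) / 3             # extrapolated estimate, O(h^4) error

print("error at step h              :", abs(D(h) - df(x)))
print("error at step h/2            :", abs(D(h / 2) - df(x)))
print("Richardson-extrapolated error:", abs(richardson - df(x)))
```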

Title: Understanding machine learning via exactly solvable statistical physics models
Speaker, Affiliation: Prof. Dr. Lenka Zdeborova, CNRS, CEA in Saclay, France
Date, Time: 27 May 2020, 16:00-17:00
Location: Zoom meeting
Abstract: The affinity between statistical physics and machine learning has a long history; this is reflected even in machine learning terminology, part of which is adopted from physics. I will describe the main lines of this long-lasting friendship in the context of current theoretical challenges and open questions about deep learning. Theoretical physics often proceeds via solvable synthetic models; I will describe the related line of work on solvable models of simple feed-forward neural networks. I will highlight a path forward to capture the subtle interplay between the structure of the data, the architecture of the network, and the learning algorithm.
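
A toy sketch (my own, assuming NumPy) of the teacher-student setting that underlies many of these solvable models: data labelled by a random teacher perceptron, a student trained by logistic regression, and the overlap between the two measured at the end.

```python
# Teacher-student toy model: labels generated by a random teacher perceptron,
# student fitted by plain gradient descent on the logistic loss.
import numpy as np

rng = np.random.default_rng(6)
d, n = 100, 2000
teacher = rng.normal(size=d)
teacher /= np.linalg.norm(teacher)

X = rng.normal(size=(n, d))
y = np.sign(X @ teacher)                     # labels produced by the teacher

w = np.zeros(d)                              # student weights
for _ in range(500):                         # gradient descent on the logistic loss
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.5 * X.T @ (p - (y + 1) / 2) / n

overlap = w @ teacher / np.linalg.norm(w)    # alignment between student and teacher
print(f"teacher-student overlap: {overlap:.3f}")
```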

Title: Diffusion Methods in Manifold and Fibre Bundle Learning
Speaker, Affiliation: Prof. Dr. Ingrid Daubechies, Duke University, Durham, USA
Date, Time: * 3 June 2020, 20:00-21:00
Location: Zoom meeting
Abstract: Diffusion methods help understand and denoise data sets; when there is additional structure (as is often the case), one can use (and get additional benefit from) a fiber bundle model.
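
A minimal diffusion-map sketch (a standard construction, not the speaker's code; assumes NumPy): Gaussian affinities, normalization to a row-stochastic diffusion operator, and an embedding from its leading nontrivial eigenvectors.

```python
# Diffusion maps on a noisy circle: affinity kernel, Markov normalization,
# eigen-decomposition, low-dimensional diffusion coordinates.
import numpy as np

rng = np.random.default_rng(7)
t = rng.uniform(0, 2 * np.pi, size=300)                     # noisy circle data
X = np.c_[np.cos(t), np.sin(t)] + 0.05 * rng.normal(size=(300, 2))

d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)         # pairwise squared distances
K = np.exp(-d2 / 0.1)                                       # Gaussian affinity kernel
P = K / K.sum(axis=1, keepdims=True)                        # row-stochastic diffusion operator

evals, evecs = np.linalg.eig(P)
order = np.argsort(-evals.real)
embedding = evecs.real[:, order[1:3]] * evals.real[order[1:3]]   # skip the trivial eigenvector

print("leading diffusion eigenvalues:", np.round(evals.real[order[:4]], 3))
print("diffusion coordinates of first 3 points:\n", np.round(embedding[:3], 3))
```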

Title: T.B.A. (CANCELLED)
Speaker, Affiliation: Prof. Dr. Andrea Montanari, Stanford University, Stanford, USA
Date, Time: 10 June 2020, 16:00-17:00
Location: Zoom meeting (link TBA)

Title: Design for inference and the power of random experiments in biology
Speaker, Affiliation: Prof. Dr. Aviv Regev, MIT, Broad Institute, Cambridge, USA
Date, Time: * 17 June 2020, 20:00-21:00
Location: Zoom meeting
Abstract: (There was a scheduling issue due to the two time slots; the talk is at 20:00 Zurich time, not 16:00. Sorry for the inconvenience!) https://www.broadinstitute.org/regev-lab

Title: The generalization error of overparametrized models: Insights from exact asymptotics
Speaker, Affiliation: Prof. Dr. Andrea Montanari, Stanford University, Stanford, USA
Date, Time: 24 June 2020, 16:00-17:00
Location: Zoom meeting
Abstract: In a canonical supervised learning setting, we are given n data samples, each comprising a feature vector and a label, or response variable. We are asked to learn a function f that can predict the label associated with a new, unseen feature vector. How is it possible that the model learnt from observed data generalizes to new points? Classical learning theory assumes that data points are drawn i.i.d. from a common distribution and argues that this phenomenon is a consequence of uniform convergence: the training error is close to its expectation uniformly over all models in a certain class. Modern deep learning systems appear to defy this viewpoint: they achieve training error that is significantly smaller than the test error, and yet generalize well to new data. I will present a sequence of high-dimensional examples in which this phenomenon can be understood in detail. [Based on joint work with Song Mei, Feng Ruan, Youngtak Sohn, Jun Yan]
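
A toy sketch (my own illustration, assuming NumPy) in the spirit of such high-dimensional examples: minimum-norm least squares with an increasing number of random ReLU features. The test error typically peaks near the interpolation threshold and can decrease again beyond it.

```python
# Minimum-norm least squares with random ReLU features: sweep the number of
# features past the interpolation threshold and watch the test error.
import numpy as np

rng = np.random.default_rng(8)
d, n, n_test = 20, 100, 2000
beta = rng.normal(size=d)

def data(m):
    X = rng.normal(size=(m, d))
    return X, X @ beta / np.sqrt(d) + 0.1 * rng.normal(size=m)

X_train, y_train = data(n)
X_test, y_test = data(n_test)

for N in [20, 50, 100, 200, 800]:                 # number of random features
    W = rng.normal(size=(d, N))
    phi = lambda X: np.maximum(X @ W / np.sqrt(d), 0.0)
    a = np.linalg.pinv(phi(X_train)) @ y_train    # minimum-norm least-squares solution
    err = np.mean((phi(X_test) @ a - y_test) ** 2)
    print(f"N = {N:4d} features, test MSE = {err:.3f}")
```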

Title: Learning via early stopping and untrained neural nets
Speaker, Affiliation: Prof. Dr. Mahdi Soltanolkotabi, University of Southern California, USA
Date, Time: 8 July 2020, 16:00-17:00
Location: Zoom Meeting
Abstract: Modern neural networks are typically trained in an over-parameterized regime where the parameters of the model far exceed the size of the training data. Such neural networks in principle have the capacity to (over)fit any set of labels, including significantly corrupted ones. Despite this (over)fitting capacity, over-parameterized networks have an intriguing robustness capability: they are surprisingly robust to label noise when first-order methods with early stopping are used to train them. Even more surprisingly, one can remove noise and corruption from a natural image without using any training data whatsoever, by simply fitting (via gradient descent) a randomly initialized, over-parameterized convolutional generator to a single corrupted image. In this talk I will first present theoretical results aimed at explaining the robustness capability of neural networks when trained via early-stopped gradient descent. I will then present results towards demystifying untrained networks for image reconstruction/restoration tasks such as denoising and those arising in inverse problems such as compressive sensing.
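
A toy sketch (my own, assuming NumPy) of the early-stopping phenomenon in a much simpler setting: an over-parameterized linear model fitted to noisy labels by gradient descent, where the test error typically stops improving, and can worsen, once the noise starts being fit.

```python
# Over-parameterized linear regression (d > n) with noisy labels, trained by
# gradient descent; track train and test error along the trajectory.
import numpy as np

rng = np.random.default_rng(9)
n, d = 50, 200                            # more parameters than samples
beta = rng.normal(size=d) / np.sqrt(d)
X, X_test = rng.normal(size=(n, d)), rng.normal(size=(1000, d))
y = X @ beta + 0.5 * rng.normal(size=n)   # noisy labels
y_test = X_test @ beta

w, lr = np.zeros(d), 0.01
for it in range(1, 2001):
    w -= lr * X.T @ (X @ w - y) / n       # gradient descent on the training loss
    if it in (10, 50, 200, 1000, 2000):
        train = np.mean((X @ w - y) ** 2)
        test = np.mean((X_test @ w - y_test) ** 2)
        print(f"iter {it:5d}: train MSE {train:.3f}, test MSE {test:.3f}")
```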

Title: Some Recent Insights on Transfer-Learning
Speaker, Affiliation: Prof. Dr. Samory Kpotufe, Columbia University, New York, USA
Date, Time: 15 July 2020, 16:00-17:00
Location: Zoom Meeting
Abstract: A common situation in Machine Learning is one where training data is not fully representative of a target population due to bias in the sampling mechanism or high costs in sampling the target population; in such situations, we aim to ’transfer’ relevant information from the training data (a.k.a. source data) to the target application. How much information is in the source data? How much target data should we collect if any? These are all practical questions that depend crucially on ‘how far’ the source domain is from the target. However, how to properly measure ‘distance’ between source and target domains remains largely unclear. In this talk we will argue that much of the traditional notions of ‘distance’ (e.g. KL-divergence, extensions of TV such as D_A discrepancy, density-ratios, Wasserstein distance) can yield an over-pessimistic picture of transferability. Instead, we show that some new notions of ‘relative dimension’ between source and target (which we simply term ‘transfer-exponents’) capture a continuum from easy to hard transfer. Transfer-exponents uncover a rich set of situations where transfer is possible even at fast rates, encode relative benefits of source and target samples, and have interesting implications for related problems such as multi-task or multi-source learning. In particular, in the case of multi-source learning, we will discuss (if time permits) a strong dichotomy between minimax and adaptive rates: no adaptive procedure can achieve a rate better than single source rates, although minimax (oracle) procedures can. The talk is based on earlier work with Guillaume Martinet, and ongoing work with Steve Hanneke.

Title: Optimization of mean-field spin glass Hamiltonians
Speaker, Affiliation: Dr. Ahmed El Alaoui, Stanford University, Stanford, USA
Date, Time: * 22 July 2020, 20:00-21:00
Location: Zoom Meeting
Abstract: We consider the question of computing an approximate ground state configuration of an Ising (mixed) p-spin Hamiltonian H_N from a bounded number of gradient evaluations. I will present an efficient algorithm which exploits the ultrametric structure of the superlevel sets of H_N in order to achieve an energy E_* characterized via an extended Parisi variational principle. This energy E_* is optimal when the model satisfies a 'no overlap gap' condition. At the heart of this algorithmic approach is a stochastic control problem, whose dual turns out to be the Parisi formula, thereby shedding new light on the nature of the latter. This is joint work with Andrea Montanari and Mark Sellke.
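
A toy sketch (my own, 2-spin case on the sphere rather than the Ising setting of the talk, and not the ultrametricity-based algorithm): ground-state search from gradient evaluations via projected gradient ascent, where the optimum is attained in the top eigenvalue direction.

```python
# Spherical 2-spin toy example: maximize H(x) = x^T J x / (2N) over the sphere
# of radius sqrt(N) using only gradient evaluations (projected gradient ascent).
import numpy as np

rng = np.random.default_rng(10)
N = 500
J = rng.normal(size=(N, N)) / np.sqrt(N)
J = (J + J.T) / 2

def H(x):                                   # energy per spin
    return x @ J @ x / (2 * N)

x = rng.normal(size=N)
x *= np.sqrt(N) / np.linalg.norm(x)         # start on the sphere of radius sqrt(N)
x0 = x.copy()

for _ in range(200):                        # gradient step, then project back to the sphere
    x = x + 0.5 * J @ x
    x *= np.sqrt(N) / np.linalg.norm(x)

print(f"energy at the random start   : {H(x0):.3f}")
print(f"energy after gradient ascent : {H(x):.3f}")
print(f"optimum (top eigenvalue / 2) : {np.linalg.eigvalsh(J)[-1] / 2:.3f}")
```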

Title: On the benefit of over-parametrization and the origin of double descent curves in artificial neural networks
Speaker, Affiliation: Prof. Dr. Giulio Biroli, ENS, Paris, France
Date, Time: 29 July 2020, 16:00-17:00
Location: Zoom Meeting
Abstract: Deep neural networks have triggered a revolution in machine learning, and more generally in computer science. Understanding their remarkable performance is a key scientific challenge with many open questions. For instance, practitioners find that using massively over-parameterised networks is beneficial to learning and generalization ability. This fact goes against standard theories, and defies intuition. In this talk I will address this issue. I will first contrast standard expectations based on the bias-variance trade-off with the results of numerical experiments on deep neural networks, which display a "double-descent" behavior of the test error, instead of the traditional U-curve, when the number of parameters is increased. I will then discuss a theory of this phenomenon based on the solution of simplified models of deep neural networks by statistical physics methods.

Note: events marked with an asterisk (*) take place at a time and/or location different from the usual one.

Organizers: Afonso Bandeira, MaD group at NYU
