ETH-FDS seminar series

More information about ETH Foundations of Data Science can be found here


Please subscribe here if you would like to be notified about these events via e-mail. You can also subscribe to the iCal/ics calendar.

Autumn Semester 2022

Date / Time Speaker Title Location
8 September 2022
16:15-17:15
Aaditya Ramdas
Carnegie Mellon University

ETH-FDS seminar

Title Conformal prediction beyond exchangeability (= quantifying uncertainty for black-box ML without distributional assumptions)
Speaker, Affiliation Aaditya Ramdas, Carnegie Mellon University
Date, Time 8 September 2022, 16:15-17:15
Location OAS J 10
ETH AI Center, OAS, Binzmühlestrasse 13, 8050 Zürich
Abstract Conformal prediction is a popular, modern technique for providing valid predictive inference for arbitrary machine learning models. Its validity relies on the assumptions of exchangeability of the data, and symmetry of the given model fitting algorithm as a function of the data. However, exchangeability is often violated when predictive models are deployed in practice. For example, if the data distribution drifts over time, then the data points are no longer exchangeable; moreover, in such settings, we might want to use an algorithm that treats recent observations as more relevant, which would violate the assumption that data points are treated symmetrically. This paper proposes a new methodology to deal with both aspects: we use weighted quantiles to introduce robustness against distribution drift, and design a new technique to allow for algorithms that do not treat data points symmetrically. Our algorithms are provably robust, with substantially less loss of coverage when exchangeability is violated due to distribution drift or other challenging features of real data, while also achieving the same algorithm and coverage guarantees as existing conformal prediction methods if the data points are in fact exchangeable. Finally, we demonstrate the practical utility of these new tools with simulations and real-data experiments. This is joint work with Rina Barber, Emmanuel Candes and Ryan Tibshirani. A preprint is at https://arxiv.org/abs/2202.13415.

Bio: Aaditya Ramdas (PhD, 2015) is an assistant professor at Carnegie Mellon University, in the Departments of Statistics and Machine Learning. He was a postdoc at UC Berkeley (2015–2018) and obtained his PhD at CMU (2010–2015), receiving the Umesh K. Gavaskar Memorial Thesis Award. His undergraduate degree was in Computer Science from IIT Bombay (2005–2009). Aaditya was an inaugural inductee of the COPSS Leadership Academy, and a recipient of the 2021 Bernoulli New Researcher Award. His work is supported by an NSF CAREER Award, an Adobe Faculty Research Award (2020), an ARL Grant on Safe Reinforcement Learning, the Block Center Grant for election auditing, and a Google Research Scholar award (2022) for structured uncertainty quantification, amongst others. Aaditya's main theoretical and methodological research interests include selective and simultaneous inference, game-theoretic statistics and safe anytime-valid inference, and distribution-free uncertainty quantification for black-box ML. His areas of applied interest include privacy, neuroscience, genetics and auditing (elections, real-estate, financial), and his group's work has received multiple best paper awards.
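The weighted-quantile idea from the abstract can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a pre-trained regression model with a scikit-learn-style predict method, and the geometric recency weights (decay 0.99) and absolute-residual scores are illustrative choices.

```python
# Minimal sketch of weighted split conformal prediction for drifting data.
# Assumptions (not from the paper): `model` has a sklearn-style .predict(),
# scores are absolute residuals, and weights decay geometrically with age.
import numpy as np

def weighted_quantile(scores, weights, alpha):
    """Smallest score s such that the normalized weight of {scores <= s}
    is at least 1 - alpha; the test point carries weight 1 on a virtual
    +inf score, so the interval can be infinite in the worst case."""
    order = np.argsort(scores)
    scores, weights = scores[order], weights[order]
    scores = np.append(scores, np.inf)   # virtual score for the test point
    weights = np.append(weights, 1.0)
    cum = np.cumsum(weights) / np.sum(weights)
    return scores[np.searchsorted(cum, 1 - alpha)]

def weighted_conformal_interval(model, X_cal, y_cal, x_test, alpha=0.1):
    # Nonconformity scores: absolute residuals on the calibration set.
    scores = np.abs(y_cal - model.predict(X_cal))
    # Recent calibration points count more, which is what buys robustness
    # when the data distribution drifts over time.
    n = len(scores)
    weights = 0.99 ** np.arange(n, 0, -1)  # oldest point weighted least
    q = weighted_quantile(scores, weights, alpha)
    pred = float(model.predict(x_test.reshape(1, -1))[0])
    return pred - q, pred + q
```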
27 October 2022
16:15-17:15
Samory K. Kpotufe
Columbia University

ETH-FDS seminar

Title Tracking Most Significant Changes in Bandits
Speaker, Affiliation Samory K. Kpotufe, Columbia University
Date, Time 27 October 2022, 16:15-17:15
Location HG F 3
Abstract In bandits with distribution shifts, one aims to automatically adapt to unknown changes in reward distribution, and restart exploration when necessary. While this problem has received attention for many years, no adaptive procedure was known until a recent breakthrough of Auer et al. (2018, 2019), which guarantees an optimal regret of order (LT)^{1/2} for T rounds and L stationary phases. However, while this rate is tight in the worst case, we show that significantly faster rates are possible, adaptively, if few changes in distribution are actually severe, e.g., involve no change in best arm. This is arrived at via a new notion of 'significant change', which recovers previous notions of change, and applies in both stochastic and adversarial settings (generally studied separately). If time permits, I'll discuss the more general case of contextual bandits, i.e., where rewards depend on contexts, and highlight key challenges that arise. This is based on ongoing work with Joe Suk.
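The "restart exploration when necessary" principle can be illustrated with a toy sketch. The code below is neither the procedure of Auer et al. nor the algorithm from the talk: it runs plain UCB1 and resets its statistics whenever an arm's windowed mean drifts far from its long-run mean; the environment function pull, the window size, and the threshold are all illustrative assumptions.

```python
# Toy illustration of restarting exploration in a non-stationary bandit.
# Not the adaptive procedure from the talk: plain UCB1 plus a naive
# change test that resets all statistics on a large windowed-mean drift.
import numpy as np
from collections import deque

def ucb_with_restarts(pull, n_arms, horizon, window=50, thresh=0.3):
    counts = np.zeros(n_arms)
    means = np.zeros(n_arms)
    recent = [deque(maxlen=window) for _ in range(n_arms)]
    total = 0
    rewards = []
    for t in range(horizon):
        if total < n_arms:
            arm = total  # play each arm once after a (re)start
        else:
            ucb = means + np.sqrt(2.0 * np.log(total) / counts)
            arm = int(np.argmax(ucb))
        r = pull(arm, t)
        rewards.append(r)
        counts[arm] += 1
        total += 1
        means[arm] += (r - means[arm]) / counts[arm]
        recent[arm].append(r)
        # Naive change test: restart if the recent window deviates
        # significantly from the long-run average of this arm.
        if len(recent[arm]) == window and \
                abs(np.mean(recent[arm]) - means[arm]) > thresh:
            counts[:] = 0
            means[:] = 0.0
            total = 0
            recent = [deque(maxlen=window) for _ in range(n_arms)]
    return np.array(rewards)

# Example environment: two Bernoulli arms whose means swap at t = 5000.
# rewards = ucb_with_restarts(
#     lambda a, t: float(np.random.rand() < (0.8 if (a == 0) == (t < 5000) else 0.2)),
#     n_arms=2, horizon=10000)
```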
3 November 2022
16:15-17:15
Holger Rauhut
RWTH Aachen

ETH-FDS seminar

Title The implicit bias of gradient descent for learning linear neural networks
Speaker, Affiliation Holger Rauhut, RWTH Aachen
Date, Time 3 November 2022, 16:15-17:15
Location HG F 3
Abstract Deep neural networks are usually trained by minimizing a non-convex loss functional via (stochastic) gradient descent methods. Unfortunately, the convergence properties are not well understood. Moreover, a puzzling empirical observation is that learning neural networks with a number of parameters exceeding the number of training examples often leads to zero loss, i.e., the network exactly interpolates the data. Nevertheless, it generalizes very well to unseen data, in stark contrast to the intuition from classical statistics, which would predict a scenario of overfitting. A current working hypothesis is that the chosen optimization algorithm has a significant influence on the selection of the learned network. In fact, in this overparameterized context there are many global minimizers, so the optimization method induces an implicit bias on the computed solution. It seems that gradient descent methods and their stochastic variants favor networks of low complexity (in a suitable sense to be understood) and, hence, appear to be very well suited for large classes of real data.

Initial attempts at understanding the implicit bias phenomenon consider the simplified setting of linear networks, i.e., (deep) factorizations of matrices. This has revealed a surprising relation to the field of low-rank matrix recovery (a variant of compressive sensing), in the sense that gradient descent favors low-rank matrices in certain situations. Moreover, restricting further to diagonal matrices, or equivalently factorizing the entries of a vector to be recovered, leads to connections to compressive sensing and l1-minimization. After giving a general introduction to these topics, the talk will concentrate on results by the speaker on the convergence of gradient flows and gradient descent for learning linear neural networks and on the implicit bias towards low-rank and sparse solutions.
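The diagonal (vector-factorization) case mentioned at the end of the abstract admits a compact numerical illustration. The sketch below, a minimal example rather than the speaker's construction, runs plain gradient descent on the overparameterization x = u*u - v*v for an underdetermined least-squares problem; with a small initialization it tends to select a sparse solution even though no explicit regularizer is used. Problem sizes, step size, iteration count, and initialization scale are illustrative choices.

```python
# Minimal sketch of the implicit bias toward sparsity: gradient descent on
# the factorized parameterization x = u*u - v*v for A x = y with n < d.
# With small initialization alpha, the trajectory approximates the sparse
# (near minimum-l1) solution. All constants below are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, d, s = 40, 100, 5                      # measurements, dimension, sparsity
A = rng.standard_normal((n, d)) / np.sqrt(n)
x_true = np.zeros(d)
x_true[rng.choice(d, size=s, replace=False)] = rng.standard_normal(s)
y = A @ x_true

alpha = 1e-3                              # small init scale drives the sparse bias
u = np.full(d, alpha)
v = np.full(d, alpha)
lr = 0.02
for _ in range(20000):
    residual = A @ (u * u - v * v) - y
    grad = A.T @ residual                 # gradient of 0.5*||Ax - y||^2 w.r.t. x
    # Chain rule through x = u*u - v*v (simultaneous update of both factors).
    u, v = u - lr * 2 * u * grad, v + lr * 2 * v * grad

x_hat = u * u - v * v
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```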