Seminar overview
Spring Semester 2025
Date & Time | Speaker | Title | Location |
---|---|---|---|
Thu 20.02.2025 15:15-16:00 |
Rafael M. Frongillo CU Boulder |
Abstract
Machine learning and data science competitions, wherein contestants submit predictions about held-out data points, are an increasingly common way to gather information and identify experts. One of the most prominent platforms is Kaggle, which has run competitions with prizes up to 3 million USD. The traditional mechanism for selecting the winner is simple: score each prediction on each held-out data point, and the contestant with the highest total score wins. Perhaps surprisingly, this reasonable and popular mechanism can incentivize contestants to submit wildly inaccurate predictions. The talk will begin with intuition for the incentive issues and what sort of strategic behavior one would expect---and when. One takeaway is that, despite conventional wisdom, large held-out data sets do not always alleviate these incentive issues, and small ones do not necessarily suffer from them, as we confirm with formal results. We will then discuss a new mechanism which is approximately truthful, in the sense that rational contestants will submit predictions which are close to their best guess. If time permits, we will see how the same mechanism solves an open question for online learning from strategic experts.
Bio: Rafael (Raf) Frongillo is an Associate Professor of Computer Science at the University of Colorado Boulder. His research lies at the interface between theoretical machine learning and economics, primarily focusing on information elicitation mechanisms, which incentivize humans or algorithms to predict accurately. Before Boulder, Raf was a postdoc at the Center for Research on Computation and Society at Harvard University and at Microsoft Research New York. He received his PhD in Computer Science at UC Berkeley, advised by Christos Papadimitriou and supported by the NDSEG Fellowship.
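The traditional winner-selection mechanism the abstract describes can be sketched in a few lines. This is an illustrative toy, not the speaker's code; the quadratic (Brier-style) score is used purely as an example scoring rule, and all names and data are made up.

```python
# Sketch of the traditional competition mechanism: score every submitted
# prediction on each held-out point, and declare the contestant with the
# highest total score the winner. The quadratic (Brier-style) score below
# is just an example of a proper scoring rule.

def quadratic_score(prediction, outcome):
    """Higher is better; maximized in expectation by truthful reporting."""
    return 1.0 - (prediction - outcome) ** 2

def pick_winner(submissions, held_out_outcomes):
    """submissions: {name: [p_1, ..., p_n]}; outcomes: [y_1, ..., y_n]."""
    totals = {
        name: sum(quadratic_score(p, y) for p, y in zip(preds, held_out_outcomes))
        for name, preds in submissions.items()
    }
    return max(totals, key=totals.get), totals

winner, totals = pick_winner(
    {"alice": [0.9, 0.2, 0.8], "bob": [0.6, 0.5, 0.5]},
    [1, 0, 1],
)
```

The talk's point is that exactly this "sum of proper scores, winner takes all" rule, despite each per-point score being truthful in isolation, can reward contestants for gambling on extreme predictions.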
ZueKoSt: Seminar on Applied Statistics: "Incentive problems in data science competitions, and how to fix them" |
HG G 19.2 |
Thu 27.02.2025 16:15-17:15 |
David M. Blei Columbia University |
Abstract
A core problem in statistics and machine learning is to approximate difficult-to-compute probability distributions. This problem is especially important in Bayesian statistics, which frames all inference about unknown quantities as a calculation about a conditional distribution. In this talk I review and discuss innovations in variational inference (VI), a method that approximates probability distributions through optimization. VI has been used in myriad applications in machine learning and Bayesian statistics.
After quickly reviewing the basics, I will discuss two lines of research in VI. I first describe stochastic variational inference, an approximate inference algorithm for handling massive datasets, and demonstrate its application to probabilistic topic models of millions of articles. Then I discuss black box variational inference, a more generic algorithm for approximating the posterior. Black box inference applies to many models but requires minimal mathematical work to implement. I will demonstrate black box inference on deep exponential families---a method for Bayesian deep learning---and describe how it enables powerful tools for probabilistic programming.
Finally, I will highlight some more recent results in variational inference, including statistical theory, score-based objective functions, and interpolating between mean-field and fully dependent variational families.
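The "approximation through optimization" idea, in its black-box form, can be sketched in a few lines. This is a minimal one-dimensional illustration (not the speaker's code): fit a Gaussian variational family to a target density using only log-density evaluations, via the score-function gradient estimator.

```python
import random

# Minimal black-box VI sketch: fit q(x) = N(mu, 1) to the target
# p(x) = N(3, 1) by stochastic gradient ascent on the ELBO, using the
# score-function gradient estimator
#   grad_mu ELBO ~= mean[(log p(x) - log q(x)) * d/dmu log q(x)],  x ~ q.
# Only log-density evaluations of p are needed, hence "black box".

random.seed(0)

def log_p(x):          # target log-density, N(3, 1), up to a constant
    return -0.5 * (x - 3.0) ** 2

def log_q(x, mu):      # variational log-density, N(mu, 1), up to a constant
    return -0.5 * (x - mu) ** 2

mu, lr, n_samples = 0.0, 0.05, 100
for _ in range(2000):
    xs = [random.gauss(mu, 1.0) for _ in range(n_samples)]
    # d/dmu log q(x; mu) = (x - mu) for this family
    grad = sum((log_p(x) - log_q(x, mu)) * (x - mu) for x in xs) / n_samples
    mu += lr * grad
```

After training, `mu` sits near the target mean 3. Nothing about the update used the structure of `p` beyond the ability to evaluate its log-density, which is what lets the same recipe scale to models like deep exponential families.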
Research Seminar in Statistics: Joint talk ETH-FDS Seminar - Research Seminar on Statistics: "Scaling and Generalizing Approximate Bayesian Inference" |
HG D 1.2 |
Thu 27.02.2025 16:15-17:15 |
David M. Blei Columbia University |
Abstract
A core problem in statistics and machine learning is to approximate difficult-to-compute probability distributions. This problem is especially important in Bayesian statistics, which frames all inference about unknown quantities as a calculation about a conditional distribution. In this talk I review and discuss innovations in variational inference (VI), a method that approximates probability distributions through optimization. VI has been used in myriad applications in machine learning and Bayesian statistics.
After quickly reviewing the basics, I will discuss two lines of research in VI. I first describe stochastic variational inference, an approximate inference algorithm for handling massive datasets, and demonstrate its application to probabilistic topic models of millions of articles. Then I discuss black box variational inference, a more generic algorithm for approximating the posterior. Black box inference applies to many models but requires minimal mathematical work to implement. I will demonstrate black box inference on deep exponential families---a method for Bayesian deep learning---and describe how it enables powerful tools for probabilistic programming.
Finally, I will highlight some more recent results in variational inference, including statistical theory, score-based objective functions, and interpolating between mean-field and fully dependent variational families.
ETH-FDS seminar: Joint talk ETH-FDS Seminar - Research Seminar on Statistics: "Scaling and Generalizing Approximate Bayesian Inference" |
HG D 1.2 |
Fri 07.03.2025 16:15-17:15 |
David M. Blei Columbia University |
Abstract
Analyzing nested data with hierarchical models is a staple of Bayesian statistics, but causal modeling remains largely focused on "flat" models. In this talk, we will explore how to think about nested data in causal models, and we will consider the advantages of nested data over aggregate data (such as data means) for causal inference. We show that disaggregating your data---replacing a flat causal model with a hierarchical causal model---can provide new opportunities for identification and estimation. As examples, we will study how to identify and estimate causal effects under unmeasured confounders, interference, and instruments.
Preprint: https://arxiv.org/abs/2401.05330
This is joint work with Eli Weinstein.
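The "disaggregation helps" point can be illustrated with a classic toy example. Note this is a standard within-group adjustment, not the paper's hierarchical causal model; the simulated effect sizes are made up.

```python
import random

random.seed(1)

# Toy illustration: units are nested in groups, and an unmeasured
# group-level confounder u affects both treatment x and outcome y.
# The true causal effect of x on y is 2. A "flat" regression on pooled
# data is biased by u, while exploiting the nested structure (demeaning
# within groups) removes u and recovers the effect.

xs, ys, x_dm, y_dm = [], [], [], []
for _ in range(300):                    # 300 groups
    u = random.gauss(0, 1)              # unmeasured group-level confounder
    gx, gy = [], []
    for _ in range(20):                 # 20 units per group
        x = u + random.gauss(0, 1)
        y = 2.0 * x + 3.0 * u + random.gauss(0, 1)
        gx.append(x); gy.append(y)
    mx, my = sum(gx) / len(gx), sum(gy) / len(gy)
    xs += gx; ys += gy
    x_dm += [x - mx for x in gx]        # within-group deviations
    y_dm += [y - my for y in gy]

def slope(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

flat_slope = slope(xs, ys)       # biased upward by the confounder
nested_slope = slope(x_dm, y_dm) # close to the true effect 2
```

The flat regression lands near 3.5 here, while the within-group estimate recovers 2: the subunit-level (nested) data carries identifying information that the pooled view destroys.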
ETH-FDS seminar: Joint talk ETH-FDS Seminar - Research Seminar on Statistics: "Hierarchical Causal Models" |
HG D 7.2 |
Fri 07.03.2025 16:15-17:15 |
David M. Blei Columbia University |
Abstract
Analyzing nested data with hierarchical models is a staple of Bayesian statistics, but causal modeling remains largely focused on "flat" models. In this talk, we will explore how to think about nested data in causal models, and we will consider the advantages of nested data over aggregate data (such as data means) for causal inference. We show that disaggregating your data---replacing a flat causal model with a hierarchical causal model---can provide new opportunities for identification and estimation. As examples, we will study how to identify and estimate causal effects under unmeasured confounders, interference, and instruments.
Preprint: https://arxiv.org/abs/2401.05330
This is joint work with Eli Weinstein.
Research Seminar in Statistics: Joint talk ETH-FDS Seminar - Research Seminar on Statistics: "Hierarchical Causal Models" |
HG D 7.2 |
Thu 13.03.2025 16:15-17:15 |
Yinyu Ye Stanford, CUHKSZ, HKUST, and SJTU |
Abstract
This talk presents several mathematical optimization problems and algorithms for AI, such as LLM training, tuning, and inference. In particular, we describe how classic optimization models and theories can be applied to accelerate and improve the training, tuning, and inference algorithms popularly used in LLMs. Conversely, we show breakthroughs in classical optimization (LP and SDP) solvers aided by AI-related techniques, such as first-order and ADMM methods, low-rank SDP theories, and GPU implementations.
Bio: Yinyu Ye is currently the K.T. Li Professor of Engineering in the Department of Management Science and Engineering and the Institute for Computational and Mathematical Engineering, Stanford University, and a visiting chair professor at Shanghai Jiao Tong University. His current research topics include continuous and discrete optimization, data science and applications, algorithm design and analysis, algorithmic game/market equilibrium, and operations research and management science. He was one of the pioneers of interior-point methods, conic linear programming, distributionally robust optimization, online linear programming and learning, and algorithm analyses for reinforcement learning, Markov decision processes, and nonconvex optimization. He and his students have received numerous scientific awards, including the 2006 INFORMS Farkas Prize (inaugural recipient) for fundamental contributions to optimization, the 2009 John von Neumann Theory Prize for fundamental sustained contributions to theory in operations research and the management sciences, the inaugural 2012 ISMP Tseng Lectureship Prize for outstanding contributions to continuous optimization (awarded every three years), and the 2014 SIAM Optimization Prize (awarded every three years).
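The first-order LP methods the abstract alludes to can be sketched on a toy instance. This is a generic primal-dual hybrid gradient (PDHG) iteration of the kind used in recent matrix-free, GPU-friendly LP solvers, not the speaker's solver; the instance and step sizes are illustrative.

```python
# Generic PDHG sketch for a linear program in standard form:
#   minimize c.x  subject to  A x = b,  x >= 0.
# Toy instance: min x1 + 2*x2 s.t. x1 + x2 = 1, whose optimum is x = (1, 0).
# Each iteration needs only matrix-vector products, which is what makes
# this family of methods attractive on GPUs.

c = [1.0, 2.0]
A = [1.0, 1.0]          # single constraint row
b = 1.0
tau = sigma = 0.5       # step sizes with tau * sigma * ||A||^2 = 0.5 < 1

x = [0.0, 0.0]          # primal iterate
y = 0.0                 # dual iterate for the equality constraint
avg = [0.0, 0.0]        # ergodic (averaged) primal iterate
iters = 20000
for _ in range(iters):
    # primal step: gradient step on the Lagrangian, then project onto x >= 0
    x_new = [max(0.0, x[i] - tau * (c[i] - A[i] * y)) for i in range(2)]
    # dual step uses the extrapolated point 2*x_new - x
    y += sigma * (b - sum(A[i] * (2 * x_new[i] - x[i]) for i in range(2)))
    x = x_new
    avg = [avg[i] + x[i] / iters for i in range(2)]
```

The averaged iterate converges to the optimal vertex (1, 0); production solvers add restarts, preconditioning, and adaptive step sizes on top of this basic loop.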
ETH-FDS seminar: "Mathematical Optimization in the Era of AI" |
HG G 19.1 |
Fri 14.03.2025 15:15-16:00 |
Matteo Fontana Royal Holloway, University of London |
Abstract
Quantifying uncertainty in multivariate regression is crucial across many real-world applications. However, existing approaches for constructing prediction regions often struggle to capture complex dependencies, lack formal coverage guarantees, or incur high computational costs. Conformal prediction addresses these challenges by providing a robust, distribution-free framework with finite-sample coverage guarantees. In this study, we offer a unified comparison of multi-output conformal techniques, highlighting their properties and interrelationships. Leveraging these insights, we propose two families of conformity scores that achieve asymptotic conditional coverage: one can be paired with any generative model, while the other reduces computational overhead by utilizing invertible generative models. We then present a large-scale empirical analysis on 32 tabular datasets, comparing all methods under a consistent code base to ensure fairness and reproducibility.
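The distribution-free, finite-sample coverage guarantee can be seen in the simplest scalar-output case. This is the standard split conformal construction, not one of the talk's multi-output methods; the data-generating process is made up for illustration.

```python
import random, math

random.seed(2)

# Split conformal prediction, scalar-output case: fit a model on one half
# of the data, compute absolute residuals (conformity scores) on a held-out
# calibration half, and take the appropriate empirical quantile q. The
# interval [f(x) - q, f(x) + q] then covers a fresh response with
# probability >= 1 - alpha, assuming only exchangeability.

def make_data(n):
    xs = [random.uniform(0, 1) for _ in range(n)]
    ys = [2.0 * x + random.gauss(0, 0.3) for x in xs]
    return xs, ys

# "fit" a slope-only model on training data via least squares
xtr, ytr = make_data(500)
beta = sum(x * y for x, y in zip(xtr, ytr)) / sum(x * x for x in xtr)

# conformity scores on the calibration set
xcal, ycal = make_data(500)
scores = sorted(abs(y - beta * x) for x, y in zip(xcal, ycal))
alpha = 0.1
k = math.ceil((len(scores) + 1) * (1 - alpha))   # finite-sample rank
q = scores[min(k, len(scores)) - 1]

# empirical coverage of [beta*x - q, beta*x + q] on new data
xte, yte = make_data(2000)
covered = sum(abs(y - beta * x) <= q for x, y in zip(xte, yte))
coverage = covered / len(yte)
```

Empirical coverage lands near the nominal 90%. The multi-output setting of the talk replaces the absolute residual with richer conformity scores that shape a joint prediction region rather than a per-coordinate interval.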
Research Seminar in Statistics: "Multi-Output Conformal Regression: A Unified View with Comparisons" |
HG G 19.1 |
Thu 20.03.2025 16:15-17:15 |
Stefan Wager Stanford University |
Abstract
The time at which renewable (e.g., solar or wind) energy resources produce electricity cannot generally be controlled. In many settings, consumers have some flexibility in their energy consumption needs, and there is growing interest in demand-response programs that leverage this flexibility to shift energy consumption to better match renewable production -- thus enabling more efficient utilization of these resources. We study optimal demand response in a model where consumers operate home energy management systems (HEMS) that can compute the "indifference set" of energy-consumption profiles that meet pre-specified consumer objectives, receive demand-response signals from the grid, and control consumer devices within the indifference set. For example, if a consumer asks for the indoor temperature to remain between certain upper and lower bounds, a HEMS could time use of air conditioning or heating to align with high renewable production when possible. Here, we show that while price-based mechanisms do not in general achieve optimal demand response, i.e., dynamic pricing cannot induce HEMS to choose optimal demand consumption profiles within the available indifference sets, pricing is asymptotically optimal in a mean-field limit with a growing number of consumers. Furthermore, we show that large-sample optimal dynamic prices can be efficiently derived via an algorithm that only requires querying HEMS about their planned consumption schedules given different prices. We demonstrate our approach in a grid simulation powered by OpenDSS, and show that it achieves meaningful demand response without creating grid instability.
Mohammad Mehrabi, Omer Karaduman, Stefan Wager
https://arxiv.org/abs/2409.07655
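The price-response step described in the abstract has a simple shape: given a dynamic price signal, a HEMS schedules the cheapest consumption profile among those the consumer is indifferent between. The sketch below is illustrative only (toy profiles and prices), not the paper's mechanism.

```python
# Toy HEMS price response: the "indifference set" is a list of
# consumption profiles (kWh per hour) that all meet the consumer's
# stated objectives; given hourly prices, the HEMS picks the cheapest.

def cheapest_profile(indifference_set, prices):
    """Each profile is a list of kWh per hour; prices is $/kWh per hour."""
    cost = lambda profile: sum(p * e for p, e in zip(prices, profile))
    return min(indifference_set, key=cost)

# two schedules that both keep the home comfortable: cool early vs. late
indifference_set = [
    [3.0, 1.0, 1.0],   # shift consumption into hour 0
    [1.0, 1.0, 3.0],   # shift consumption into hour 2
]
prices = [0.10, 0.20, 0.30]   # cheap hour 0, e.g. high renewable supply
chosen = cheapest_profile(indifference_set, prices)
```

The talk's question is whether prices alone can steer many such cost-minimizing HEMS to the grid-optimal combination of profiles; the answer is no in general, but yes asymptotically in the mean-field limit.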
ETH-FDS seminar: Joint talk ETH-FDS Seminar - Research Seminar on Statistics: "Optimal Mechanisms for Demand Response: An Indifference Set Approach" |
HG E 3 |
Thu 20.03.2025 16:15-17:15 |
Stefan Wager Stanford University |
Abstract
The time at which renewable (e.g., solar or wind) energy resources produce electricity cannot generally be controlled. In many settings, consumers have some flexibility in their energy consumption needs, and there is growing interest in demand-response programs that leverage this flexibility to shift energy consumption to better match renewable production -- thus enabling more efficient utilization of these resources. We study optimal demand response in a model where consumers operate home energy management systems (HEMS) that can compute the "indifference set" of energy-consumption profiles that meet pre-specified consumer objectives, receive demand-response signals from the grid, and control consumer devices within the indifference set. For example, if a consumer asks for the indoor temperature to remain between certain upper and lower bounds, a HEMS could time use of air conditioning or heating to align with high renewable production when possible. Here, we show that while price-based mechanisms do not in general achieve optimal demand response, i.e., dynamic pricing cannot induce HEMS to choose optimal demand consumption profiles within the available indifference sets, pricing is asymptotically optimal in a mean-field limit with a growing number of consumers. Furthermore, we show that large-sample optimal dynamic prices can be efficiently derived via an algorithm that only requires querying HEMS about their planned consumption schedules given different prices. We demonstrate our approach in a grid simulation powered by OpenDSS, and show that it achieves meaningful demand response without creating grid instability.
Mohammad Mehrabi, Omer Karaduman, Stefan Wager
https://arxiv.org/abs/2409.07655
Research Seminar in Statistics: Joint talk ETH-FDS Seminar - Research Seminar on Statistics: "Optimal Mechanisms for Demand Response: An Indifference Set Approach" |
HG E 3 |
Tue 01.04.2025 17:15-18:30 |
Hyunju Kwon Yuansi Chen ETH Zurich |
HG F 30 |
|
Wed 02.04.2025 15:15-16:00 |
Linbo Wang University of Toronto |
Abstract
In many observational studies, researchers are often interested in studying the effects of multiple exposures on a single outcome. Standard approaches for high-dimensional data such as the lasso assume the associations between the exposures and the outcome are sparse. These methods, however, do not estimate the causal effects in the presence of unmeasured confounding. In this paper, we consider an alternative approach that assumes the causal effects in view are sparse. We show that with sparse causation, the causal effects are identifiable even with unmeasured confounding. At the core of our proposal is a novel device, called the synthetic instrument, that in contrast to standard instrumental variables, can be constructed using the observed exposures directly. We show that under linear structural equation models, the problem of causal effect estimation can be formulated as an ℓ0-penalization problem, and hence can be solved efficiently using off-the-shelf software. Simulations show that our approach outperforms state-of-the-art methods in both low-dimensional and high-dimensional settings. We further illustrate our method using a mouse obesity dataset.
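For context, the standard instrumental-variable estimator that the abstract contrasts with looks like this in the simplest linear case. This is the classic textbook IV construction with a simulated external instrument, not the paper's synthetic instrument; all effect sizes are made up.

```python
import random

random.seed(3)

# Classic IV illustration: the true causal effect of x on y is 2, but an
# unmeasured confounder u biases the naive regression. A valid external
# instrument z (affects x, but not y except through x) restores
# consistency via the ratio cov(z, y) / cov(z, x).

n = 20000
z = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
x = [z[i] + u[i] + random.gauss(0, 1) for i in range(n)]
y = [2.0 * x[i] + 3.0 * u[i] + random.gauss(0, 1) for i in range(n)]

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((a[i] - ma) * (b[i] - mb) for i in range(len(a))) / len(a)

ols = cov(x, y) / cov(x, x)   # biased upward by the confounder u
iv = cov(z, y) / cov(z, x)    # consistent for the causal effect 2
```

The paper's point of departure is that such a z usually has to come from outside the data; under sparse causation, an instrument-like object can instead be synthesized from the observed exposures themselves.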
Research Seminar in Statistics: "The synthetic instrument: From sparse association to sparse causation" |
HG G 19.1 |
Fri 11.04.2025 15:15-16:15 |
Victoria Stodden University of Southern California |
HG G 19.1 |
|
Thu 08.05.2025 15:15-16:15 |
Toby Hocking |
Abstract
data.table is an R package with C code that is one of the most efficient open-source in-memory database packages available today. First released to CRAN by Matt Dowle in 2006, it continues to grow in popularity, and now over 1500 other CRAN packages depend on data.table. This talk will discuss basic and advanced data manipulation topics, and end with a discussion about how you can contribute to data.table.
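The core data manipulation idiom the talk covers is data.table's `DT[i, j, by]`: filter rows (`i`), compute expressions (`j`), grouped by columns (`by`). Since this page carries no R code, here is the same idiom sketched in plain Python; this is an illustration of the semantics only, not of data.table's C-backed implementation, which gains its speed from keyed indexing, radix sorting, and update-by-reference. The toy table is made up.

```python
from collections import defaultdict

# Rough Python equivalent of the data.table query
#   DT[year == 2025, .(total = sum(sales)), by = city]

rows = [
    {"city": "Zurich", "year": 2024, "sales": 10},
    {"city": "Zurich", "year": 2025, "sales": 12},
    {"city": "Basel",  "year": 2025, "sales": 7},
]

totals = defaultdict(int)
for r in rows:
    if r["year"] == 2025:                 # i: row filter
        totals[r["city"]] += r["sales"]   # j: sum(sales), by = city
```

In data.table the whole pipeline is a single indexed expression, evaluated without materializing intermediate copies.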
ZueKoSt: Seminar on Applied Statistics: "Using and contributing to the data.table package for efficient big data analysis" |
HG |
Mon 12.05.2025 17:15-18:15 |
Andrew Stuart Caltech |
HG F 30 |
|
Wed 25.06.2025 |
Abstract
More information: "High-dimensional statistics, applications, and distributional shifts -- A workshop in celebration of Peter Bühlmann's 60th birthday", June 2025 (https://math.ethz.ch/fim/activities/conferences/High-dimensional-statistics-applications-and-distributional-shifts.html) |
HG F 3 |
|
Thu 26.06.2025 |
Abstract
More information: "High-dimensional statistics, applications, and distributional shifts -- A workshop in celebration of Peter Bühlmann's 60th birthday", June 2025 (https://math.ethz.ch/fim/activities/conferences/High-dimensional-statistics-applications-and-distributional-shifts.html) |
HG F 3 |
|
Fri 27.06.2025 |
Abstract
More information: "High-dimensional statistics, applications, and distributional shifts -- A workshop in celebration of Peter Bühlmann's 60th birthday", June 2025 (https://math.ethz.ch/fim/activities/conferences/High-dimensional-statistics-applications-and-distributional-shifts.html) |
HG F 3 |