Seminar overview

Autumn Semester 2024

Date & Time Speaker Title Location
Fri 04.10.2024
15:15-16:15
Samuel Pawel
Center for Reproducible Science, UZH
Abstract
Simulation studies are widely used in methodological research fields such as statistics, psychometrics, bioinformatics, ecology, econometrics, or machine learning. They generate artificial data sets under specified mechanisms and compare the performance of data analysis methods under different conditions. Careful design, analysis, and reporting of simulation studies is important because they often provide the basis for data analysis decisions in scientific and medical practice. Problems with the reporting of simulation studies were first described nearly half a century ago. Recent attention to reproducibility issues in the biomedical and social sciences has led to more critical reflection on simulation studies and new proposals for improving them. In this talk, I will provide an overview of these recent meta-scientific developments. I will also discuss how questionable research practices (QRPs), such as cherry-picking of favorable results, can affect the validity of simulation studies. To illustrate this point, I present a simulation study of a novel prediction method with no expected performance gain, and show how easy it is to make the method appear superior to well-established competing methods when QRPs are employed. I also discuss approaches for addressing QRPs, and present a newly developed template for preregistration of simulation studies.
References:
- Pawel, S., Kook, L., Reeve, K. (2024). Pitfalls and potentials in simulation studies: Questionable research practices in comparative simulation studies allow for spurious claims of superiority of any method. Biometrical Journal. https://doi.org/10.1002/bimj.202200091
- Siepe, B.S., Bartos, F., Morris, T.P., Boulesteix, A.-L., Heck, D.W., Pawel, S. (2024). Simulation Studies for Methodological Research in Psychology: A Standardized Template for Planning, Preregistration, and Reporting. Psychological Methods (to appear). https://doi.org/10.31234/osf.io/ufgy6
ZueKoSt: Seminar on Applied Statistics
Meta-scientific perspectives on simulation studies
HG G 19.1
Thu 10.10.2024
15:15-16:15
Lars Lorch
Institute for Machine Learning, ETH Zürich
Abstract
In this talk, we develop a novel approach to causal modeling and inference. Rather than structural equations over a causal graph, we show how to learn stochastic differential equations (SDEs) whose stationary densities model a system's behavior under interventions. These stationary diffusion models do not require the formalism of causal graphs, let alone the common assumption of acyclicity, and often generalize to unseen interventions on their variables. Our inference method is based on a new theoretical result that expresses a stationarity condition on the diffusion's generator in a reproducing kernel Hilbert space. The resulting kernel deviation from stationarity (KDS) is an objective function of independent interest.
Research Seminar in Statistics
Causal Modeling with Stationary Diffusions
HG G 19.1
Thu 17.10.2024
17:15-18:15
Johanna Ziegel
ETH Zürich
HG F 30
Wed 06.11.2024
15:00-16:00
Peter Whalley
ETH Zurich, Seminar for Statistics
Abstract
We present an unbiased method for Bayesian posterior means based on kinetic Langevin dynamics that combines advanced splitting methods with enhanced gradient approximations. Our approach avoids Metropolis correction by coupling Markov chains at different discretization levels in a multilevel Monte Carlo approach. Theoretical analysis demonstrates that our proposed estimator is unbiased, attains finite variance, and satisfies a central limit theorem. We prove similar results using both approximate and stochastic gradients and show that our method's computational cost scales independently of the size of the dataset. Our numerical experiments demonstrate that our unbiased algorithm outperforms the "gold-standard" randomized Hamiltonian Monte Carlo.
Research Seminar in Statistics
Invited talk: Unbiased Kinetic Langevin Monte Carlo with Inexact Gradients
HG G 19.1
Thu 07.11.2024
15:15-16:15
Zhijing Jin
Incoming Assistant Professor at the University of Toronto; PhD at Max Planck Institute & ETH
Abstract
Causal reasoning is a cornerstone of human intelligence and a critical capability for artificial systems aiming to achieve advanced understanding and decision-making. While large language models (LLMs) excel on many tasks, a key question remains: How can these models reason better about causality? Causal questions that humans can pose span a wide range of fields, from Newton’s fundamental question, “Why do apples fall?”, whose answer LLMs can now retrieve from standard textbook knowledge, to complex inquiries such as “What are the causal effects of minimum wage introduction?”, a topic recognized with the 2021 Nobel Prize in Economics. My research focuses on automating causal reasoning across all types of questions. To achieve this, I explore the causal reasoning capabilities that have emerged in state-of-the-art LLMs, and enhance their ability to perform causal inference by guiding them through structured, formal steps. Finally, I will outline a future research agenda for building the next generation of LLMs capable of scientific-level causal reasoning. https://zhijing-jin.com/fantasy/about/
ZueKoSt: Seminar on Applied Statistics
The Potential of Automating Causal Inference with Large Language Models
HG G 19.1
Thu 14.11.2024
15:15-16:15
Chen Zhou
Erasmus University, Rotterdam
Abstract
When applying multivariate extreme value statistics to analyze tail risk in compound events defined by a multivariate random vector, one often assumes that all dimensions share the same extreme value index. While such an assumption can be tested using a Wald-type test, the performance of such a test deteriorates as the dimensionality increases. This paper introduces a novel test for extreme value indices in a high-dimensional setting. We show the asymptotic behavior of the test statistic and conduct simulation studies to evaluate its finite-sample performance. The proposed test significantly outperforms existing methods in high-dimensional settings. We apply this test to examine two datasets previously assumed to have identical extreme value indices across all dimensions. This is joint work with Liujun Chen (USTC).
Research Seminar in Statistics
High dimensional inference for extreme value indices
HG E 41
Fri 15.11.2024
15:15-16:15
Jan Dirk Wegner
Department of Mathematical Modeling and Machine Learning (DM3L), University of Zurich
Abstract
Modern deep learning in combination with satellite data offers great opportunities to protect nature at global scale. I will present ongoing research on mapping crops at country scale, modeling species distributions, estimating vegetation parameters such as biomass and vegetation height, and monitoring conflicts remotely. Traditional approaches usually must be adapted for specific ecosystems and regions. It is therefore very difficult to carry out homogeneous, large-scale modeling with high spatial and temporal resolution and, at the same time, good accuracy. Data-driven approaches, especially modern deep learning methods, promise great potential here to achieve globally consistent, transparent assessments of our environment.
Bio: Jan Dirk Wegner leads the EcoVision Lab at the DM3L at University of Zurich as an Associate Professor. Jan was a postdoc (2012-2016) and senior scientist (2017-2020) in the Photogrammetry and Remote Sensing group at ETH Zurich after completing his PhD (with distinction) at Leibniz Universität Hannover in 2011. His main research interests are at the frontier of machine learning, computer vision, and remote sensing to solve scientific questions in the environmental sciences and geosciences. Jan has received multiple awards, including an ETH Postdoctoral fellowship and the science award of the German Geodetic Commission. He was selected for the WEF Young Scientist Class 2020 as one of the 25 best researchers world-wide under the age of 40 committed to integrating scientific knowledge into society for the public good. Jan is vice-president of ISPRS Technical Commission II, associated faculty of the ETH AI Center, director of the PhD graduate school "Data Science" at University of Zurich, and his professorship is part of the Digital Society Initiative at University of Zurich. Together with colleagues, Jan is chairing the CVPR EarthVision workshops.
ZueKoSt: Seminar on Applied Statistics
Monitoring Earth with Remote Sensing and Deep Learning
HG E 41
Wed 27.11.2024
15:15-16:15
Frederic Koehler
University of Chicago
Abstract
In his 1975 paper "Statistical Analysis of Non-Lattice Data", Julian Besag proposed the pseudolikelihood method as an alternative to the standard method of maximum likelihood estimation. This method has been very influential and successful in applications like learning graphical models from data, and also inspired another related and important method called score matching. I will discuss some recent work which connects the statistical efficiency of these estimators to the computational efficiency of related sampling algorithms.
Research Seminar in Statistics
Pseudolikelihood, Score Matching, and Dynamics
HG G 19.1
Fri 29.11.2024
15:15-16:00
David Wissel
Boeva Lab, ETHZ
Abstract
Survival analysis has been a task of significant interest within the statistics community throughout the years. More recently, the machine learning and bioinformatics communities have also increasingly become interested in survival analysis. In this seminar, we survey recent developments focusing especially on high-dimensional multi-omics survival analysis. We concentrate particularly on empirical evaluations highlighting the difficulty of this task, the need for more data, and standardized evaluation. In the second part of the seminar, we discuss recent work on knowledge distillation for sparse survival models and methods for structure selection in sparse partially linear survival models.
ZueKoSt: Seminar on Applied Statistics
Empirical evaluations and methods for (multi-)omics survival analysis
HG G 19.1
Thu 05.12.2024
16:15-17:15
Mats Stensrud
EPFL
Abstract
The exposure of an individual i often affects the outcome of another individual j. A prominent example occurs in infectious disease settings, where vaccinating one individual can reduce disease transmission and thereby affect the health outcomes of others. This spillover effect is a type of interference, which implies that individuals cannot plausibly be perceived as independent and identically distributed (iid). Extensive methodological research has recently been motivated by interference problems and the violation of conventional iid assumptions. However, despite the growing interest in the topic, there remains controversy over whether and when existing methods capture effects of practical interest. In this talk, I will present causal methodology—motivated by infectious disease settings—for addressing interference. The central idea is to consider estimands that are insensitive to the interference structure. I will argue that these estimands have a clear interpretation and can be used to guide decisions made by, for example, doctors and patients facing infectious diseases. The methodology will be illustrated with examples of the effects of vaccines against HIV and influenza.
ETH-FDS seminar
On Policy-Relevant Effects in the Presence of Interference
HG D 7.1
Fri 06.12.2024
15:15-16:15
Siddhartha Mishra
ETHZ
Abstract
PDEs are considered to be the language of physics, as they provide mathematical descriptions of a whole range of physical phenomena. The complexity and prohibitive computational cost of traditional physics-based numerical schemes necessitate the search for fast and efficient surrogates based on machine learning. In this lecture, we survey recent developments in the field of learning solution operators for PDEs, focusing on structure-preserving neural operators and on foundation models for sample-efficient and generalizable multi-operator learning. We also briefly discuss graph-neural-network-based learning of PDEs on arbitrary domain geometries and conditional diffusion models for learning multi-scale physical systems such as turbulent fluid flows.
ZueKoSt: Seminar on Applied Statistics
Learning PDEs
HG G 19.1