Young Data Science Researcher Seminar Zurich

Please subscribe here if you would like to be notified about these events via e-mail. You can also subscribe to the iCal/ICS calendar.

Autumn Semester 2022

Title E-statistics, group invariance and anytime-valid testing
Speaker, Affiliation Muriel Pérez, Centrum Wiskunde & Informatica (CWI) Amsterdam
Date, Time 26 October 2022, 16:00-17:00
Location
Abstract We study worst-case-growth-rate-optimal (GROW) E-statistics for hypothesis testing between two dominated group models. If the underlying group G acts freely on the observation space, there exists a maximally invariant statistic of the data. We show that among all E-statistics, invariant or not, the likelihood ratio of the maximally invariant statistic is GROW and that an anytime-valid test can be based on it. By virtue of a representation theorem of Wijsman, the GROW E-statistic is equivalent to a Bayes factor with a right Haar prior on G. Such Bayes factors are known to have good frequentist and Bayesian properties. We show that reductions through sufficiency and invariance can be made in tandem without affecting optimality. A crucial assumption on the group G is its amenability, a well-known group-theoretical condition, which holds, for instance, in general scale-location families. Our results also apply to finite-dimensional linear regression.
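
The talk's GROW construction is group-theoretic, but the basic mechanism of anytime-valid testing with E-statistics can be illustrated in a few lines. The following Python sketch runs a likelihood-ratio E-process for the simple-vs-simple test N(0,1) versus N(1,1); the distributions, level and sample size are illustrative assumptions, not taken from the talk.

```python
import numpy as np

# Anytime-valid test of H0: X ~ N(0,1) vs H1: X ~ N(1,1) via a
# likelihood-ratio E-process: E_n = prod_i p1(X_i)/p0(X_i).
# Under H0, (E_n) is a nonnegative martingale with E[E_n] = 1, so by
# Ville's inequality P(sup_n E_n >= 1/alpha) <= alpha: the process may be
# monitored continuously and stopped at any time without inflating error.

rng = np.random.default_rng(0)
alpha = 0.05
log_e = 0.0
for n in range(1, 1001):
    x = rng.normal(1.0, 1.0)          # data actually drawn from H1
    # log p1(x) - log p0(x) for N(1,1) vs N(0,1) simplifies to x - 1/2
    log_e += x - 0.5
    if log_e >= np.log(1.0 / alpha):  # reject H0, valid at any stopping time
        print(f"H0 rejected at n = {n}, E-value = {np.exp(log_e):.1f}")
        break
```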

Title On permutation invariant problems in simultaneous statistical inference
Speaker, Affiliation Asaf Weinstein, Hebrew University of Jerusalem
Date, Time 10 November 2022, 16:30-17:30
Location
Abstract Suppose you observe Y_i = mu_i + e_i, where the e_i are i.i.d. from some fixed and known zero-mean distribution, and the mu_i are fixed and unknown parameters. In this "canonical" setting, a simultaneous statistical inference problem, as we define it here, is one in which no preference is given to any of the mu_i's before seeing the data. For example: estimating all mu_i's under sum-of-squares loss; testing H_{0i}: mu_i = 0 simultaneously for i = 1, ..., n while controlling the FDR; estimating mu_{i^*} where i^* = argmax_i Y_i under squared loss; or even testing the global null H_0 = \cap_{i=1}^n H_{0i}. What is the optimal solution to a simultaneous inference problem? In a Bayesian setup, i.e., when the mu_i are assumed random, the answer is conceptually straightforward. In the frequentist setup considered here, the answer is far less obvious, and various approaches exist for defining notions of frequentist optimality and for designing procedures that pursue them. In this work we define the optimal solution to a simultaneous inference problem to be the procedure that, for the true mu_i's, has the best performance among all procedures that are oblivious to the labels i = 1, ..., n. This is a natural condition, and arguably the weakest one could possibly impose. For such procedures we observe that the problem can be cast as a Bayesian problem with respect to a particular prior, which immediately reveals an explicit form for the optimal solution. The argument in fact holds more generally for any permutation-invariant model, e.g. when the e_i above are exchangeable, not necessarily independent, noise terms, which is sometimes a much more realistic assumption. Finally, we discuss the relation to Robbins's empirical Bayes approach, and explain why nonparametric empirical Bayes procedures should, at least when the e_i's are independent, asymptotically attain the optimal performance uniformly in the parameter value.
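
As a minimal illustration of the nonparametric empirical Bayes procedures mentioned at the end of the abstract, the following Python sketch applies Tweedie's formula with a kernel density estimate in the Gaussian-noise case. The two-point prior and all tuning choices are illustrative assumptions; note the estimator depends on the data only through the estimated marginal density, so it is oblivious to the labels i = 1, ..., n.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Nonparametric empirical Bayes estimation of the normal means mu_i from
# Y_i = mu_i + e_i with e_i ~ N(0,1), via Tweedie's formula:
#   E[mu_i | Y_i = y] = y + f'(y) / f(y),
# where f is the marginal density of the Y_i, estimated here by a KDE.

rng = np.random.default_rng(1)
n = 2000
mu = rng.choice([0.0, 3.0], size=n, p=[0.8, 0.2])  # unknown means (for demo)
y = mu + rng.normal(size=n)

f = gaussian_kde(y)                                # estimated marginal density
eps = 1e-3
f_prime = (f(y + eps) - f(y - eps)) / (2 * eps)    # numerical derivative
mu_hat = y + f_prime / f(y)                        # Tweedie estimate of each mu_i

print("risk of MLE (y itself):", np.mean((y - mu) ** 2))
print("risk of NPEB estimate: ", np.mean((mu_hat - mu) ** 2))
```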

Title E-backtesting risk measures
Speaker, Affiliation Qiuqi Wang, University of Waterloo
Date, Time 17 November 2022, 16:00-17:00
Location
Abstract In the recent Basel Accords, the Expected Shortfall (ES) replaces the Value-at-Risk (VaR) as the standard risk measure for market risk in the banking sector, making it the most important risk measure in financial regulation. One of the most challenging tasks in risk modeling practice is to backtest ES forecasts provided by financial institutions. Ideally, backtesting should be done based only on daily realized portfolio losses, without imposing specific models. Recently, e-values have gained attention as potential alternatives to p-values as measures of uncertainty, significance and evidence. We use e-values and e-processes to construct a model-free backtesting procedure for ES using a concept of universal e-statistics, which can be naturally generalized to many other risk measures and statistical quantities.
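
The universal e-statistics for ES are the subject of the talk; as a simpler stand-in, the following Python sketch e-backtests a VaR forecast instead. Under a correct forecast, the exceedance indicator divided by the VaR level is an e-variable, and a betting (test supermartingale) construction accumulates evidence against the forecaster. The loss distribution, the constant forecast and the betting fraction are all illustrative assumptions.

```python
import numpy as np

# Toy e-process for backtesting a VaR_{0.99} forecast from realized losses.
# Under a correct forecast, I_t = 1{L_t > VaR_t} satisfies
# E[I_t | past] <= 0.01, so e_t = I_t / 0.01 is an e-variable, and the
# betting wealth M_t = prod_s (1 - lam + lam * e_s) is a test
# supermartingale: reject the forecasts once M_t >= 1/alpha.

rng = np.random.default_rng(2)
level, alpha, lam = 0.01, 0.05, 0.1   # VaR level, test level, bet size
var_forecast = 2.0                    # reported constant VaR (too low here)
wealth = 1.0
for t in range(1, 2001):
    loss = rng.standard_t(df=3)       # true losses are heavy-tailed
    e_t = (loss > var_forecast) / level
    wealth *= 1.0 - lam + lam * e_t
    if wealth >= 1.0 / alpha:
        print(f"forecast rejected on day {t}, e-process = {wealth:.1f}")
        break
```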

Title Joint talk of SfS Research Seminar and Young Data Science Researcher Seminar Zurich - tba
Speaker, Affiliation Niklas Pfister, University of Copenhagen
Date, Time 1 December 2022, 16:00-17:00
Location HG G 19.1
Abstract tba

Title Inference and Uncertainty Quantification for Low-Rank Models
Speaker, Affiliation Yuling Yan, Princeton University
Date, Time 8 December 2022, 16:00-17:00
Location
Abstract Many high-dimensional problems involve reconstruction of a low-rank matrix from highly incomplete and noisy observations. Despite substantial progress in designing efficient estimation algorithms, it remains largely unclear how to assess the uncertainty of the obtained low-rank estimates, and how to construct valid yet short confidence intervals for the unknown low-rank matrix. In this talk, I will discuss how to perform inference and uncertainty quantification for two widely encountered low-rank models: (1) noisy matrix completion, and (2) heteroskedastic PCA with missing data. For both problems, we identify statistically efficient estimators that admit non-asymptotic distributional characterizations, which in turn enable optimal construction of confidence intervals for, say, the unseen entries of the low-rank matrix of interest. Our inferential procedures do not rely on sample splitting, thus avoiding unnecessary loss of data efficiency. All this is accomplished by a powerful leave-one-out analysis framework that originated from probability and random matrix theory. This is based on joint work with Yuxin Chen, Jianqing Fan and Cong Ma.
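
The inferential theory of the talk builds on estimators of roughly the following form. This Python sketch shows only the standard spectral estimate (a rank-r SVD of the inverse-propensity-weighted data) for noisy matrix completion, without the debiasing and distributional analysis of the talk; all dimensions and noise levels are illustrative assumptions.

```python
import numpy as np

# Spectral estimation for noisy matrix completion: each entry of a rank-r
# matrix M is observed independently with probability p, corrupted by noise;
# M is estimated by the best rank-r approximation of Y / p.

rng = np.random.default_rng(3)
n, r, p, sigma = 200, 3, 0.2, 0.1
M = rng.normal(size=(n, r)) @ rng.normal(size=(r, n))    # ground-truth low-rank matrix
mask = rng.random((n, n)) < p                            # observed entries
Y = np.where(mask, M + sigma * rng.normal(size=(n, n)), 0.0)

U, s, Vt = np.linalg.svd(Y / p, full_matrices=False)
M_hat = U[:, :r] @ np.diag(s[:r]) @ Vt[:r]               # rank-r spectral estimate

rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
print(f"relative Frobenius error: {rel_err:.3f}")
```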

Title Representation Learning: A Causal Perspective
Speaker, Affiliation Yixin Wang, University of Michigan
Date, Time 15 December 2022, 16:00-17:00
Location
Abstract Representation learning constructs low-dimensional representations to summarize essential features of high-dimensional data like images and texts. Ideally, such a representation should efficiently capture non-spurious features of the data. It should also be disentangled, so that we can interpret what feature each of its dimensions captures. However, these desiderata are often intuitively defined and challenging to quantify or enforce. In this talk, we take a causal perspective on representation learning. We show how the desiderata of representation learning can be formalized using counterfactual notions, enabling metrics and algorithms that target efficient, non-spurious, and disentangled representations of data. We discuss the theoretical underpinnings of the algorithm and illustrate its empirical performance in both supervised and unsupervised representation learning. This is joint work with Michael Jordan: https://arxiv.org/abs/2109.03795