Young Data Science Researcher Seminar Zurich

Please subscribe here if you would like to be notified about these presentations via e-mail. Moreover, you can subscribe to the iCal/ics Calendar.

Autumn Semester 2022

Title: E-statistics, group invariance and anytime-valid testing
Speaker, Affiliation: Muriel Pérez, Centrum Wiskunde & Informatica (CWI) Amsterdam
Date, Time: 26 October 2022, 16:00-17:00
Location: Zoom Call
Abstract: We study worst-case-growth-rate-optimal (GROW) E-statistics for hypothesis testing between two dominated group models. If the underlying group G acts freely on the observation space, there exists a maximally invariant statistic of the data. We show that among all E-statistics, invariant or not, the likelihood ratio of the maximally invariant statistic is GROW, and that an anytime-valid test can be based on it. By virtue of a representation theorem of Wijsman, the GROW E-statistic is equivalent to a Bayes factor with a right Haar prior on G. Such Bayes factors are known to have good frequentist and Bayesian properties. We show that reductions through sufficiency and invariance can be made in tandem without affecting optimality. A crucial assumption on the group G is its amenability, a well-known group-theoretical condition, which holds, for instance, in general scale-location families. Our results also apply to finite-dimensional linear regression.
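As a concrete illustration of the anytime-valid testing idea the abstract builds on, here is a minimal sketch of a plain likelihood-ratio e-process for a simple null against a simple alternative (not the paper's group-invariant GROW construction; the Gaussian model and all parameter values are illustrative assumptions):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
alpha = 0.05          # type-I error level
mu0, mu1 = 0.0, 0.5   # simple null and simple alternative (illustrative)

# Data drawn from the alternative here, so the test should eventually stop.
x = rng.normal(mu1, 1.0, size=1000)

e_process = 1.0
for t, xt in enumerate(x, start=1):
    # Multiply in the per-observation likelihood ratio; under H0 this
    # running product is a nonnegative martingale with mean 1 (an e-process).
    e_process *= norm.pdf(xt, loc=mu1) / norm.pdf(xt, loc=mu0)
    if e_process >= 1 / alpha:
        # Ville's inequality gives P_H0(sup_t E_t >= 1/alpha) <= alpha,
        # so rejecting here is valid even though we peeked at every step.
        print(f"reject H0 at n = {t}, e-value = {e_process:.1f}")
        break
```

Because the level-alpha guarantee holds at every data-dependent stopping time, the test can be monitored continuously; this is the sense in which such tests are anytime-valid.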
Title: On permutation invariant problems in simultaneous statistical inference
Speaker, Affiliation: Asaf Weinstein, Hebrew University of Jerusalem
Date, Time: 10 November 2022, 16:30-17:30
Location: Zoom Call
Abstract: Suppose you observe Y_i = mu_i + e_i, where the e_i are i.i.d. from some fixed and known zero-mean distribution, and the mu_i are fixed and unknown parameters. In this "canonical" setting, a simultaneous inference statistical problem, as we define it here, is one in which no preference is given to any of the mu_i's before seeing the data. For example: estimating all mu_i's under sum-of-squares loss; testing H_{0i}: mu_i = 0 simultaneously for i = 1, ..., n while controlling the FDR; estimating mu_{i^*}, where i^* = argmax_i Y_i, under squared loss; or even testing the global null H_0 = \cap_{i=1}^n H_{0i}. What is the optimal solution to a simultaneous inference problem? In a Bayesian setup, i.e., when the mu_i are assumed random, the answer is conceptually straightforward. In the frequentist setup considered here, the answer is far less obvious, and various approaches exist for defining notions of frequentist optimality and for designing procedures that pursue them. In this work we define the optimal solution to a simultaneous inference problem to be the procedure that, for the true mu_i's, has the best performance among all procedures that are oblivious to the labels i = 1, ..., n. This is a natural and arguably the weakest condition one could possibly impose. For such procedures we observe that the problem can be cast as a Bayesian problem with respect to a particular prior, which immediately reveals an explicit form for the optimal solution. The argument actually holds more generally for any permutation-invariant model, e.g. when the e_i above are exchangeable, not independent, noise terms, which is sometimes a much more realistic assumption. Finally, we discuss the relation to Robbins's empirical Bayes approach, and explain why nonparametric empirical Bayes procedures should, at least when the e_i's are independent, asymptotically attain the optimal performance uniformly in the parameter value.
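To make the closing point about nonparametric empirical Bayes concrete, here is a minimal sketch for Gaussian noise (the setting and all constants are my own choices, not the speaker's): Tweedie's formula expresses the Bayes rule under sum-of-squares loss as mu_hat_i = y_i + sigma^2 * (log f)'(y_i), where f is the marginal density of the Y_i; estimating f from the data itself yields a permutation-invariant, label-oblivious procedure.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
n, sigma = 5000, 1.0

# Sparse means: most mu_i are 0, a few are large; the labels i are uninformative.
mu = np.where(rng.random(n) < 0.1, 3.0, 0.0)
y = mu + rng.normal(0.0, sigma, size=n)

# Tweedie's formula: E[mu_i | y_i] = y_i + sigma^2 * d/dy log f(y_i).
# Estimate the marginal density f by a KDE and take a central difference
# for the derivative of its logarithm.
kde = gaussian_kde(y)
h = 1e-3
log_f_prime = (np.log(kde(y + h)) - np.log(kde(y - h))) / (2 * h)
mu_hat = y + sigma**2 * log_f_prime

print("MLE sum-of-squares loss:", np.sum((y - mu) ** 2))
print("EB  sum-of-squares loss:", np.sum((mu_hat - mu) ** 2))
```

The empirical Bayes estimate shrinks adaptively toward the bulk of the data, which is why it can beat the unbiased estimate y_i simultaneously for every fixed configuration of the mu_i's.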
Title: E-backtesting risk measures
Speaker, Affiliation: Qiuqi Wang, University of Waterloo
Date, Time: 17 November 2022, 16:00-17:00
Location: Zoom Call
Abstract: In the recent Basel Accords, the Expected Shortfall (ES) replaces Value-at-Risk (VaR) as the standard risk measure for market risk in the banking sector, making it the most important risk measure in financial regulation. One of the most challenging tasks in risk modeling practice is to backtest the ES forecasts provided by financial institutions. Ideally, backtesting should be based only on daily realized portfolio losses, without imposing specific models. Recently, e-values have gained attention as potential alternatives to p-values as measures of uncertainty, significance and evidence. We use e-values and e-processes to construct a model-free backtesting procedure for ES using a concept of universal e-statistics, which can be naturally generalized to many other risk measures and statistical quantities.
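A minimal sketch of the e-process mechanic in the simpler VaR setting (the talk's universal e-statistics for ES go further; the exceedance rate and betting fraction below are assumptions for illustration): if a level-p VaR forecast is valid, the daily exceedance indicator has conditional mean at most p, so the convex mixture e_t = lam * 1{exceedance}/p + (1 - lam) has conditional mean at most 1, and its running product is an e-process that can be monitored day by day.

```python
import numpy as np

rng = np.random.default_rng(2)
p, gamma = 0.01, 0.05  # reported VaR level; backtest significance level
lam = 0.02             # betting fraction (a Kelly-style tuning choice)
T = 2000

# Hypothetical bank whose VaR is too optimistic: true exceedance rate is 3%.
exceed = rng.random(T) < 0.03

e_process = 1.0
for t in range(T):
    # Under a valid VaR forecast, E[1{exceedance}/p] <= 1, so this mixture
    # has conditional mean <= 1 and the product stays an e-process.
    e_process *= lam * exceed[t] / p + (1 - lam)
    if e_process >= 1 / gamma:
        print(f"VaR forecast rejected on day {t + 1}, e-value {e_process:.1f}")
        break
```

The appeal for regulators is that the monitoring requires only the realized losses and the reported forecasts, with no model assumptions, and rejection remains valid at any stopping day.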
Title: Joint talk of SfS Research Seminar and Young Data Science Researcher Seminar Zurich - tba
Speaker, Affiliation: Niklas Pfister, University of Copenhagen
Date, Time: 1 December 2022, 16:00-17:00
Location: HG G 19.1
Abstract: tba

Title: Inference and Uncertainty Quantification for Low-Rank Models
Speaker, Affiliation: Yuling Yan, Princeton University
Date, Time: 8 December 2022, 16:00-17:00
Location: Zoom Call
Abstract: Many high-dimensional problems involve reconstruction of a low-rank matrix from highly incomplete and noisy observations. Despite substantial progress in designing efficient estimation algorithms, it remains largely unclear how to assess the uncertainty of the obtained low-rank estimates, and how to construct valid yet short confidence intervals for the unknown low-rank matrix. In this talk, I will discuss how to perform inference and uncertainty quantification for two widely encountered low-rank models: (1) noisy matrix completion, and (2) heteroskedastic PCA with missing data. For both problems, we identify statistically efficient estimators that admit non-asymptotic distributional characterizations, which in turn enable optimal construction of confidence intervals for, say, the unseen entries of the low-rank matrix of interest. Our inferential procedures do not rely on sample splitting, thus avoiding unnecessary loss of data efficiency. All this is accomplished by a powerful leave-one-out analysis framework that originated from probability and random matrix theory. This is based on joint work with Yuxin Chen, Jianqing Fan and Cong Ma.
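For intuition, here is a minimal sketch of the vanilla spectral estimator for noisy matrix completion that analyses of this kind typically start from (the talk's estimators add debiasing and distributional theory on top; the dimensions, rank, sampling rate and noise level are made-up illustration values):

```python
import numpy as np

rng = np.random.default_rng(3)
n, r, p, sigma = 200, 3, 0.3, 0.5  # size, rank, sampling rate, noise level

# Ground truth M = U V^T plus noise, observed on a random subset of entries.
U, V = rng.normal(size=(n, r)), rng.normal(size=(n, r))
M = U @ V.T
mask = rng.random((n, n)) < p
Y = np.where(mask, M + sigma * rng.normal(size=(n, n)), 0.0)

# Spectral estimate: inverse-propensity-weighted zero fill, then the best
# rank-r approximation via a truncated SVD.
Ul, s, Vr = np.linalg.svd(Y / p)
M_hat = Ul[:, :r] @ np.diag(s[:r]) @ Vr[:r, :]

rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
print(f"relative Frobenius error of the spectral estimate: {rel_err:.3f}")
```

Characterizing the entrywise distribution of refinements of M_hat is what enables the confidence intervals for unseen entries mentioned in the abstract.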
Title: Representation Learning: A Causal Perspective
Speaker, Affiliation: Yixin Wang, University of Michigan
Date, Time: 15 December 2022, 16:00-17:00
Location: Zoom Call
Abstract: Representation learning constructs low-dimensional representations to summarize essential features of high-dimensional data like images and texts. Ideally, such a representation should efficiently capture non-spurious features of the data. It should also be disentangled, so that we can interpret what feature each of its dimensions captures. However, these desiderata are often intuitively defined and challenging to quantify or enforce. In this talk, we take a causal perspective on representation learning. We show how the desiderata of representation learning can be formalized using counterfactual notions, enabling metrics and algorithms that target efficient, non-spurious, and disentangled representations of data. We discuss the theoretical underpinnings of the algorithm and illustrate its empirical performance in both supervised and unsupervised representation learning. This is joint work with Michael Jordan: https://arxiv.org/abs/2109.03795
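As a toy illustration of the counterfactual reading of disentanglement (a fully synthetic linear sketch of my own, not the paper's method): intervene on one generative factor while holding the others fixed, and check which coordinates of the representation respond.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic generative model: two independent factors z; observation x = A z.
A = rng.normal(size=(10, 2))

# Two candidate linear encoders: a random (entangled) one, and the
# pseudoinverse of A, which recovers z with one factor per coordinate.
encoders = {"entangled": rng.normal(size=(2, 10)),
            "disentangled": np.linalg.pinv(A)}

def counterfactual_response(W, factor, delta=1.0, n=500):
    """Mean absolute change in each representation coordinate when a single
    generative factor is intervened on: z -> z + delta * e_factor."""
    Z = rng.normal(size=(n, 2))
    Z_cf = Z.copy()
    Z_cf[:, factor] += delta
    R, R_cf = Z @ A.T @ W.T, Z_cf @ A.T @ W.T  # encode x = A z with W
    return np.abs(R_cf - R).mean(axis=0)

for name, W in encoders.items():
    responses = [counterfactual_response(W, k).round(2) for k in (0, 1)]
    print(name, responses)  # disentangled: each factor moves one coordinate
```

In this linear toy model the response is simply |W A| * delta; the counterfactual formulation is what carries over to nonlinear encoders, where no such closed form exists.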