ZüKoSt: Seminar on Applied Statistics


Spring Semester 2025

Title Incentive problems in data science competitions, and how to fix them
Speaker, Affiliation Rafael M. Frongillo, CU Boulder
Date, Time 20 February 2025, 15:15-16:00
Location HG G 19.2
Abstract Machine learning and data science competitions, wherein contestants submit predictions about held-out data points, are an increasingly common way to gather information and identify experts. One of the most prominent platforms is Kaggle, which has run competitions with prizes up to 3 million USD. The traditional mechanism for selecting the winner is simple: score each prediction on each held-out data point, and the contestant with the highest total score wins. Perhaps surprisingly, this reasonable and popular mechanism can incentivize contestants to submit wildly inaccurate predictions. The talk will begin with intuition for the incentive issues and what sort of strategic behavior one would expect, and when. One takeaway is that, despite conventional wisdom, large held-out data sets do not always alleviate these incentive issues, and small ones do not necessarily suffer from them, as we confirm with formal results. We will then discuss a new mechanism which is approximately truthful, in the sense that rational contestants will submit predictions which are close to their best guess. If time permits, we will see how the same mechanism solves an open question for online learning from strategic experts.

Title T.B.A.
Speaker, Affiliation Victoria Stodden, University of Southern California
Date, Time 11 April 2025, 15:15-16:15
Location HG G 19.1
Abstract T.B.A.

Title Using and contributing to the data.table package for efficient big data analysis
Speaker, Affiliation Toby Hocking
Date, Time 8 May 2025, 15:15-16:15
Location HG
Abstract data.table is an R package with C code that is one of the most efficient open-source in-memory database packages available today. First released to CRAN by Matt Dowle in 2006, it continues to grow in popularity, and now over 1500 other CRAN packages depend on data.table. This talk will discuss basic and advanced data manipulation topics, and end with a discussion about how you can contribute to data.table.

