
Colloquium

Data Thinning and Beyond

Daniela Witten, University of Washington

   06.05.2026

   4:15 pm - 5:45 pm

   LMU Munich, Department of Statistics and via Zoom

The lecture describes the problem of reusing the same dataset in data analysis, for example for both hypothesis generation and subsequent testing. This double use creates dependencies that can invalidate classical statistical inference methods.

As a solution, “data thinning” is introduced, a method for splitting data into independent training and test sets to enable valid inference. However, this approach requires strong assumptions about the data distribution. Therefore, alternative strategies are presented that avoid such assumptions, for example by adjusting summary statistics or orthogonalizing dependent datasets.
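
To make the idea concrete, here is a minimal sketch of thinning for count data. It assumes the observations are Poisson distributed, which is one example of the kind of strong distributional assumption mentioned above; the abstract itself does not name a distribution. The sketch relies on a standard fact: if each unit of a Poisson(lambda) count is assigned to the training part independently with probability eps, the two parts are independent Poisson counts with means eps * lambda and (1 - eps) * lambda. The function names and the parameter `eps` are hypothetical and chosen only for this illustration; this is not the speaker's implementation.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (suitable for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def thin_poisson(x, eps, rng):
    """Split a count x into (x_train, x_test).

    Each of the x units goes to the training part with probability eps,
    so x_train given x is Binomial(x, eps) and x_train + x_test == x.
    Independence of the two parts holds only under the Poisson assumption.
    """
    x_train = sum(1 for _ in range(x) if rng.random() < eps)
    return x_train, x - x_train


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    rng = random.Random(0)
    counts = [sample_poisson(5.0, rng) for _ in range(10000)]
    parts = [thin_poisson(x, 0.5, rng) for x in counts]
    train = [p[0] for p in parts]
    test = [p[1] for p in parts]
    # Under the Poisson assumption the empirical correlation should be near zero.
    print("mean of training part:", round(sum(train) / len(train), 3))
    print("correlation of parts:", round(correlation(train, test), 3))
```

In such a setup, a hypothesis could be generated from the training part and tested on the test part. If the counts are not actually Poisson, the two parts are in general no longer independent, which is the limitation that motivates the assumption-free alternatives.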

Daniela Witten is a professor of Statistics and Biostatistics at the University of Washington, where she holds the Dorothy Gilford Endowed Chair in Mathematical Statistics. She develops statistical machine learning methods for high-dimensional data, with a focus on unsupervised learning.


Related

Lecture  •  12.06.2026  •  LMU Munich, CAS, Seestr. 13, Munich

Analyzing Feature Interactions Through Local Effects in Machine Learning Models

As part of the CAS Research Focus, Giuseppe Casalicchio talks about developing methods for interpretable machine learning.
