Research Group Thomas Nagler

Thomas Nagler, Prof. Dr. · Computational Statistics & Data Science · A1 | Statistical Foundations & Explainability

Thomas Nagler is Professor of Computational Statistics & Data Science at LMU Munich.

His research is at the intersection of mathematical and computational statistics. He develops statistical methods, derives theoretical guarantees and scalable algorithms, packages them in user-friendly software, and collaborates with domain experts to solve problems in diverse areas.

Team members @MCML

Nicolai Palm · Computational Statistics & Data Science · A1 | Statistical Foundations & Explainability

Publications @MCML

[8]
T. Nagler, L. Schneider, B. Bischl and M. Feurer.
Reshuffling Resampling Splits Can Improve Generalization of Hyperparameter Optimization.
38th Conference on Neural Information Processing Systems (NeurIPS 2024). Vancouver, Canada, Dec 10-15, 2024. To be published. Preprint at arXiv. GitHub.
Abstract

Hyperparameter optimization is crucial for obtaining peak performance of machine learning models. The standard protocol evaluates various hyperparameter configurations using a resampling estimate of the generalization error to guide optimization and select a final hyperparameter configuration. Without much evidence, paired resampling splits, i.e., either a fixed train-validation split or a fixed cross-validation scheme, are often recommended. We show that, surprisingly, reshuffling the splits for every configuration often improves the final model's generalization performance on unseen data. Our theoretical analysis explains how reshuffling affects the asymptotic behavior of the validation loss surface and provides a bound on the expected regret in the limiting regime. This bound connects the potential benefits of reshuffling to the signal and noise characteristics of the underlying optimization problem. We confirm our theoretical results in a controlled simulation study and demonstrate the practical usefulness of reshuffling in a large-scale, realistic hyperparameter optimization experiment. While reshuffling leads to test performances that are competitive with using fixed splits, it drastically improves results for a single train-validation holdout protocol and can often make holdout become competitive with standard CV while being computationally cheaper.
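The protocol change is easy to state in code. Below is a minimal sketch, not the authors' implementation: the synthetic data, random forest, and random search are illustrative assumptions. It contrasts the standard fixed holdout split with drawing a fresh split for every candidate configuration.

```python
# Minimal sketch: fixed vs. reshuffled holdout splits in random-search HPO.
# Synthetic data, random forest, and random search are illustrative choices,
# not the paper's experimental setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
rng = np.random.default_rng(0)
configs = [{"max_depth": int(d)} for d in rng.integers(1, 16, size=20)]

def holdout_error(config, split_seed):
    # split_seed controls the resampling split used to score this config.
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.2, random_state=split_seed
    )
    model = RandomForestClassifier(**config, random_state=0).fit(X_tr, y_tr)
    return 1.0 - model.score(X_val, y_val)

# Standard protocol: every configuration is scored on the SAME split.
errs_fixed = [holdout_error(c, split_seed=42) for c in configs]
best_fixed = configs[int(np.argmin(errs_fixed))]

# Reshuffling: each configuration gets its own freshly drawn split.
errs_shuffled = [holdout_error(c, split_seed=i) for i, c in enumerate(configs)]
best_reshuffled = configs[int(np.argmin(errs_shuffled))]
```

Per the abstract, the second variant, despite noisier individual scores, often selects configurations that generalize better, and can make the cheap holdout protocol competitive with cross-validation.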

MCML Authors
Thomas Nagler, Prof. Dr. · Computational Statistics & Data Science · A1 | Statistical Foundations & Explainability
Lennart Schneider · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Matthias Feurer, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability


[7]
D. Rügamer, C. Kolb, T. Weber, L. Kook and T. Nagler.
Generalizing orthogonalization for models with non-linearities.
41st International Conference on Machine Learning (ICML 2024). Vienna, Austria, Jul 21-27, 2024. URL.
Abstract

The complexity of black-box algorithms can lead to various challenges, including the introduction of biases. These biases present immediate risks in the algorithms' application. It was, for instance, shown that neural networks can deduce racial information solely from a patient's X-ray scan, a task beyond the capability of medical experts. If this fact is not known to the medical expert, automatic decision-making based on this algorithm could lead to prescribing a treatment (purely) based on racial information. While current methodologies allow for the ''orthogonalization'' or ''normalization'' of neural networks with respect to such information, existing approaches are grounded in linear models. Our paper advances the discourse by introducing corrections for non-linearities such as ReLU activations. Our approach also encompasses scalar and tensor-valued predictions, facilitating its integration into neural network architectures. Through extensive experiments, we validate our method's effectiveness in safeguarding sensitive data in generalized linear models, normalizing convolutional neural networks for metadata, and rectifying pre-existing embeddings for undesired attributes.
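For context, the linear baseline that the paper generalizes can be sketched in a few lines: regress the protected information out of the features with the usual residual-maker matrix. This is a plain-NumPy illustration of the linear case only, on synthetic data, and does not reproduce the paper's non-linear corrections.

```python
# Linear orthogonalization via the residual-maker (annihilator) matrix
# M = I - Z (Z'Z)^{-1} Z'. Synthetic data; illustrates the linear baseline
# that the paper extends, not the proposed non-linear method.
import numpy as np

rng = np.random.default_rng(0)
n = 200
protected = rng.normal(size=(n, 1))          # sensitive attribute to remove
X = protected @ rng.normal(size=(1, 5)) + rng.normal(size=(n, 5))

Z = np.hstack([np.ones((n, 1)), protected])  # include an intercept column
M = np.eye(n) - Z @ np.linalg.solve(Z.T @ Z, Z.T)
X_orth = M @ X                               # residuals after projecting out Z

# Sanity check: orthogonalized features are uncorrelated with the attribute.
print(np.abs(X_orth.T @ Z).max())            # ~1e-13, i.e. zero up to rounding
```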

MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability
Chris Kolb · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Tobias Weber · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Thomas Nagler, Prof. Dr. · Computational Statistics & Data Science · A1 | Statistical Foundations & Explainability


[6]
D. Rundel, J. Kobialka, C. von Crailsheim, M. Feurer, T. Nagler and D. Rügamer.
Interpretable Machine Learning for TabPFN.
2nd World Conference on Explainable Artificial Intelligence (xAI 2024). Valletta, Malta, Jul 17-19, 2024. DOI. GitHub.
Abstract

The recently developed Prior-Data Fitted Networks (PFNs) have shown very promising results for applications in low-data regimes. The TabPFN model, a special case of PFNs for tabular data, is able to achieve state-of-the-art performance on a variety of classification tasks while producing posterior predictive distributions in mere seconds by in-context learning without the need for learning parameters or hyperparameter tuning. This makes TabPFN a very attractive option for a wide range of domain applications. However, a major drawback of the method is its lack of interpretability. Therefore, we propose several adaptations of popular interpretability methods that we specifically design for TabPFN. By taking advantage of the unique properties of the model, our adaptations allow for more efficient computations than existing implementations. In particular, we show how in-context learning facilitates the estimation of Shapley values by avoiding approximate retraining and enables the use of Leave-One-Covariate-Out (LOCO) even when working with large-scale Transformers. In addition, we demonstrate how data valuation methods can be used to address scalability challenges of TabPFN.
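As one concrete example of why in-context learning helps, Leave-One-Covariate-Out (LOCO) ordinarily requires one full retraining per feature; with TabPFN, "refitting" is just conditioning the forward pass on a reduced context set. A minimal sketch follows, assuming the scikit-learn-style TabPFNClassifier interface of the tabpfn package; it is not the authors' adapted implementation, and the dataset is an illustrative choice.

```python
# LOCO sketch for an in-context learner: dropping a column needs no
# gradient-based retraining, only a new forward pass on the reduced context.
# Assumes the sklearn-style interface of the 'tabpfn' package.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def test_error(cols):
    clf = TabPFNClassifier()           # no learned parameters, no tuning
    clf.fit(X_tr[:, cols], y_tr)       # "fit" only stores the context set
    return 1.0 - clf.score(X_te[:, cols], y_te)

all_cols = list(range(X.shape[1]))
baseline = test_error(all_cols)
loco = {j: test_error([k for k in all_cols if k != j]) - baseline
        for j in all_cols}             # importance = increase in test error
print(max(loco, key=loco.get))         # index of the most important covariate
```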

MCML Authors
David Rundel · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Julius Kobialka · Data Science Group · A1 | Statistical Foundations & Explainability
Matthias Feurer, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Thomas Nagler, Prof. Dr. · Computational Statistics & Data Science · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[5]
Y. Sale, P. Hofman, T. Löhr, L. Wimmer, T. Nagler and E. Hüllermeier.
Label-wise Aleatoric and Epistemic Uncertainty Quantification.
40th Conference on Uncertainty in Artificial Intelligence (UAI 2024). Barcelona, Spain, Jul 16-18, 2024. URL.
MCML Authors
Paul Hofman · Artificial Intelligence & Machine Learning · A3 | Computational Models
Lisa Wimmer · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Thomas Nagler, Prof. Dr. · Computational Statistics & Data Science · A1 | Statistical Foundations & Explainability
Eyke Hüllermeier, Prof. Dr. · Artificial Intelligence & Machine Learning · A3 | Computational Models


[4]
N. Palm and T. Nagler.
An Online Bootstrap for Time Series.
27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024). Valencia, Spain, May 02-04, 2024. URL.
MCML Authors
Nicolai Palm · Computational Statistics & Data Science · A1 | Statistical Foundations & Explainability
Thomas Nagler, Prof. Dr. · Computational Statistics & Data Science · A1 | Statistical Foundations & Explainability


[3]
Y. Sale, P. Hofman, L. Wimmer, E. Hüllermeier and T. Nagler.
Second-Order Uncertainty Quantification: Variance-Based Measures.
Preprint at arXiv (Dec. 2023). arXiv.
MCML Authors
Paul Hofman · Artificial Intelligence & Machine Learning · A3 | Computational Models
Lisa Wimmer · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Eyke Hüllermeier, Prof. Dr. · Artificial Intelligence & Machine Learning · A3 | Computational Models
Thomas Nagler, Prof. Dr. · Computational Statistics & Data Science · A1 | Statistical Foundations & Explainability


[2]
J. Rodemann, J. Goschenhofer, E. Dorigatti, T. Nagler and T. Augustin.
Approximately Bayes-optimal pseudo-label selection.
39th Conference on Uncertainty in Artificial Intelligence (UAI 2023). Pittsburgh, PA, USA, Aug 01-03, 2023. URL.
MCML Authors
Jann Goschenhofer (former member) · A1 | Statistical Foundations & Explainability
Emilio Dorigatti · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Thomas Nagler, Prof. Dr. · Computational Statistics & Data Science · A1 | Statistical Foundations & Explainability


[1]
T. Nagler.
Statistical Foundations of Prior-Data Fitted Networks.
40th International Conference on Machine Learning (ICML 2023). Honolulu, Hawaii, USA, Jul 23-29, 2023. URL.
MCML Authors
Thomas Nagler, Prof. Dr. · Computational Statistics & Data Science · A1 | Statistical Foundations & Explainability