
Research Group David Rügamer

David Rügamer
Prof. Dr.
Data Science Group
A1 | Statistical Foundations & Explainability

David Rügamer is Associate Professor and Group Leader of the Data Science Group at LMU Munich. His research involves the development of uncertainty quantification approaches for statistical and machine learning models, the unification of concepts from statistics and deep learning, and the sparsification of neural networks.

Team members @MCML

Maarten Jung · Data Science Group · A1 | Statistical Foundations & Explainability
Julius Kobialka · Data Science Group · A1 | Statistical Foundations & Explainability
Rickmer Schulte · Data Science Group · A1 | Statistical Foundations & Explainability
Emanuel Sommer · Data Science Group · A1 | Statistical Foundations & Explainability

Publications @MCML

[66]
D. Rügamer, B. X. W. Liew, Z. Altai and A. Stöcker.
A Functional Extension of Semi-Structured Networks.
38th Conference on Neural Information Processing Systems (NeurIPS 2024). Vancouver, Canada, Dec 10-15, 2024. To be published. Preprint at arXiv.
Abstract

Semi-structured networks (SSNs) merge the structures familiar from additive models with deep neural networks, allowing the modeling of interpretable partial feature effects while capturing higher-order non-linearities at the same time. A significant challenge in this integration is maintaining the interpretability of the additive model component. Inspired by large-scale biomechanics datasets, this paper explores extending SSNs to functional data. Existing methods in functional data analysis are promising but often not expressive enough to account for all interactions and non-linearities and do not scale well to large datasets. Although the SSN approach presents a compelling potential solution, its adaptation to functional data remains complex. In this work, we propose a functional SSN method that retains the advantageous properties of classical functional regression approaches while also improving scalability. Our numerical experiments demonstrate that this approach accurately recovers underlying signals, enhances predictive performance, and performs favorably compared to competing methods.

MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[65]
J. G. Wiese, L. Wimmer, T. Papamarkou, B. Bischl, S. Günnemann and D. Rügamer.
Towards Efficient Posterior Sampling in Deep Neural Networks via Symmetry Removal (Extended Abstract).
33rd International Joint Conference on Artificial Intelligence (IJCAI 2024). Jeju, Korea, Aug 03-09, 2024. DOI.
Abstract

Bayesian inference in deep neural networks is challenging due to the high-dimensional, strongly multi-modal parameter posterior density landscape. Markov chain Monte Carlo approaches asymptotically recover the true posterior but are considered prohibitively expensive for large modern architectures. Local methods, which have emerged as a popular alternative, focus on specific parameter regions that can be approximated by functions with tractable integrals. While these often yield satisfactory empirical results, they fail, by definition, to account for the multi-modality of the parameter posterior. In this work, we argue that the dilemma between exact-but-unaffordable and cheap-but-inexact approaches can be mitigated by exploiting symmetries in the posterior landscape. Such symmetries, induced by neuron interchangeability and certain activation functions, manifest in different parameter values leading to the same functional output value. We show theoretically that the posterior predictive density in Bayesian neural networks can be restricted to a symmetry-free parameter reference set. By further deriving an upper bound on the number of Monte Carlo chains required to capture the functional diversity, we propose a straightforward approach for feasible Bayesian inference. Our experiments suggest that efficient sampling is indeed possible, opening up a promising path to accurate uncertainty quantification in deep learning.
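The symmetry the abstract exploits is easy to see concretely. A minimal numpy sketch (a toy one-hidden-layer network, not the paper's construction): permuting hidden units yields a different parameter vector but exactly the same function, so whole regions of the posterior are functionally redundant.

```python
import numpy as np

rng = np.random.default_rng(0)

# One-hidden-layer tanh network: f(x) = W2 @ tanh(W1 @ x + b1).
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2 = rng.normal(size=(1, 4))

def forward(W1, b1, W2, x):
    return W2 @ np.tanh(W1 @ x + b1)

# Permute the hidden units: rows of W1/b1 and columns of W2.
# This is a different point in parameter space with an identical output.
perm = np.array([2, 0, 3, 1])
x = rng.normal(size=3)
out_orig = forward(W1, b1, W2, x)
out_perm = forward(W1[perm], b1[perm], W2[:, perm], x)
print(np.allclose(out_orig, out_perm))  # True
```

Restricting sampling to one representative per such equivalence class is what makes the symmetry-free reference set in the paper attractive.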

MCML Authors
Lisa Wimmer · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Stephan Günnemann, Prof. Dr. · Data Analytics & Machine Learning · A3 | Computational Models
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[64]
F. Ott, L. Heublein, D. Rügamer, B. Bischl and C. Mutschler.
Fusing structure from motion and simulation-augmented pose regression from optical flow for challenging indoor environments.
Journal of Visual Communication and Image Representation 103 (Aug. 2024). DOI.
Abstract

The localization of objects is essential in many applications, such as robotics, virtual and augmented reality, and warehouse logistics. Recent advancements in deep learning have enabled localization using monocular cameras. Traditionally, structure from motion (SfM) techniques predict an object’s absolute position from a point cloud, while absolute pose regression (APR) methods use neural networks to understand the environment semantically. However, both approaches face challenges from environmental factors like motion blur, lighting changes, repetitive patterns, and featureless areas. This study addresses these challenges by incorporating additional information and refining absolute pose estimates with relative pose regression (RPR) methods. RPR also struggles with issues like motion blur. To overcome this, we compute the optical flow between consecutive images using the Lucas–Kanade algorithm and use a small recurrent convolutional network to predict relative poses. Combining absolute and relative poses is difficult due to differences between global and local coordinate systems. Current methods use pose graph optimization (PGO) to align these poses. In this work, we propose recurrent fusion networks to better integrate absolute and relative pose predictions, enhancing the accuracy of absolute pose estimates. We evaluate eight different recurrent units and create a simulation environment to pre-train the APR and RPR networks for improved generalization. Additionally, we record a large dataset of various scenarios in a challenging indoor environment resembling a warehouse with transportation robots. Through hyperparameter searches and experiments, we demonstrate that our recurrent fusion method outperforms PGO in effectiveness.
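The relative-pose branch described above starts from classical optical flow. As a hedged illustration only (textbook single-window Lucas–Kanade on synthetic frames, not the paper's pipeline or its recurrent network), brightness constancy linearizes to `Ix*u + Iy*v = -It`, solved in least squares:

```python
import numpy as np

# Two smooth synthetic frames; frame 2 is frame 1 shifted by (u, v) = (0.5, 0.3) px.
yy, xx = np.mgrid[0:64, 0:64].astype(float)

def frame(dx, dy):
    return np.sin(0.2 * (xx - dx)) + np.cos(0.15 * (yy - dy))

I1, I2 = frame(0.0, 0.0), frame(0.5, 0.3)

# Single-window Lucas-Kanade: one least-squares system over all pixels.
Iy, Ix = np.gradient(I1)            # np.gradient returns (d/drow, d/dcol)
It = I2 - I1
A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
(u, v), *_ = np.linalg.lstsq(A, -It.ravel(), rcond=None)
print(round(u, 2), round(v, 2))     # close to (0.5, 0.3)
```

Real pipelines window this estimate per patch and pyramid over scales; the sketch only shows the linearized core that the flow features feed into.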

MCML Authors
Felix Ott · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability


[63]
M. Herrmann, F. J. D. Lange, K. Eggensperger, G. Casalicchio, M. Wever, M. Feurer, D. Rügamer, E. Hüllermeier, A.-L. Boulesteix and B. Bischl.
Position: Why We Must Rethink Empirical Research in Machine Learning.
41st International Conference on Machine Learning (ICML 2024). Vienna, Austria, Jul 21-27, 2024. URL.
Abstract

We warn against a common but incomplete understanding of empirical research in machine learning (ML) that leads to non-replicable results, makes findings unreliable, and threatens to undermine progress in the field. To overcome this alarming situation, we call for more awareness of the plurality of ways of gaining knowledge experimentally but also of some epistemic limitations. In particular, we argue most current empirical ML research is fashioned as confirmatory research while it should rather be considered exploratory.

MCML Authors
Moritz Herrmann, Dr. · Biometry in Molecular Medicine · Coordinator for Reproducibility & Open Science · A1 | Statistical Foundations & Explainability
Giuseppe Casalicchio, Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Marcel Wever, Dr. (former member) · A3 | Computational Models
Matthias Feurer, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability
Eyke Hüllermeier, Prof. Dr. · Artificial Intelligence & Machine Learning · A3 | Computational Models
Anne-Laure Boulesteix, Prof. Dr. · Biometry in Molecular Medicine · A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability


[62]
T. Papamarkou, M. Skoularidou, K. Palla, L. Aitchison, J. Arbel, D. Dunson, M. Filippone, V. Fortuin, P. Hennig, J. M. Hernández-Lobato, A. Hubin, A. Immer, T. Karaletsos, M. E. Khan, A. Kristiadi, Y. Li, S. Mandt, C. Nemeth, M. A. Osborne, T. G. J. Rudner, D. Rügamer, Y. W. Teh, M. Welling, A. G. Wilson and R. Zhang.
Position: Bayesian Deep Learning in the Age of Large-Scale AI.
41st International Conference on Machine Learning (ICML 2024). Vienna, Austria, Jul 21-27, 2024. URL.
Abstract

In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learning (BDL) constitutes a promising avenue, offering advantages across these diverse settings. This paper posits that BDL can elevate the capabilities of deep learning. It revisits the strengths of BDL, acknowledges existing challenges, and highlights some exciting research avenues aimed at addressing these obstacles. Looking ahead, the discussion focuses on possible ways to combine large-scale foundation models with BDL to unlock their full potential.

MCML Authors
Vincent Fortuin, Dr. · Bayesian Deep Learning · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[61]
D. Rügamer, C. Kolb, T. Weber, L. Kook and T. Nagler.
Generalizing orthogonalization for models with non-linearities.
41st International Conference on Machine Learning (ICML 2024). Vienna, Austria, Jul 21-27, 2024. URL.
Abstract

The complexity of black-box algorithms can lead to various challenges, including the introduction of biases. These biases present immediate risks in the algorithms' application. It was, for instance, shown that neural networks can deduce racial information solely from a patient's X-ray scan, a task beyond the capability of medical experts. If this fact is not known to the medical expert, automatic decision-making based on this algorithm could lead to prescribing a treatment (purely) based on racial information. While current methodologies allow for the ''orthogonalization'' or ''normalization'' of neural networks with respect to such information, existing approaches are grounded in linear models. Our paper advances the discourse by introducing corrections for non-linearities such as ReLU activations. Our approach also encompasses scalar and tensor-valued predictions, facilitating its integration into neural network architectures. Through extensive experiments, we validate our method's effectiveness in safeguarding sensitive data in generalized linear models, normalizing convolutional neural networks for metadata, and rectifying pre-existing embeddings for undesired attributes.
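The linear baseline that the paper generalizes can be sketched directly: remove the part of the predictions explained by protected variables by projecting onto the orthogonal complement of their column space. A hedged numpy sketch; the variable names (`race`, `f`) are illustrative, and this is the classical linear correction, not the paper's non-linear extension.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
race = rng.integers(0, 2, n).astype(float)   # hypothetical binary protected feature
Z = np.column_stack([np.ones(n), race])      # protected design (with intercept)
f = 3.0 * race + rng.normal(size=n)          # model predictions leaking Z

# Linear orthogonalization: subtract the projection of f onto col(Z).
P = Z @ np.linalg.solve(Z.T @ Z, Z.T)
f_orth = f - P @ f

print(np.allclose(Z.T @ f_orth, 0.0))        # True: no linear signal about Z left
```

The paper's contribution is precisely that this projection argument breaks down after ReLU-type non-linearities and must be corrected there.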

MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability
Chris Kolb · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Tobias Weber · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Thomas Nagler, Prof. Dr. · Computational Statistics & Data Science · A1 | Statistical Foundations & Explainability


[60]
E. Sommer, L. Wimmer, T. Papamarkou, L. Bothmann, B. Bischl and D. Rügamer.
Connecting the Dots: Is Mode Connectedness the Key to Feasible Sample-Based Inference in Bayesian Neural Networks?
41st International Conference on Machine Learning (ICML 2024). Vienna, Austria, Jul 21-27, 2024. URL.
Abstract

A major challenge in sample-based inference (SBI) for Bayesian neural networks is the size and structure of the networks' parameter space. Our work shows that successful SBI is possible by embracing the characteristic relationship between weight and function space, uncovering a systematic link between overparameterization and the difficulty of the sampling problem. Through extensive experiments, we establish practical guidelines for sampling and convergence diagnosis. As a result, we present a Bayesian deep ensemble approach as an effective solution with competitive performance and uncertainty quantification.
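The ensemble idea can be mimicked in miniature. In this hedged sketch, bootstrap polynomial least-squares fits stand in for independently trained network modes; the per-point spread of the members serves as a simple uncertainty proxy. None of this is the paper's actual sampler.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(-2, 2, 80)
y = np.sin(2 * x) + 0.2 * rng.normal(size=x.size)

# K independently fitted members (bootstrap + degree-5 polynomial fits
# standing in for independently trained/sampled networks).
K, degree = 10, 5
preds = []
for _ in range(K):
    idx = rng.integers(0, x.size, x.size)        # bootstrap resample
    coefs = np.polyfit(x[idx], y[idx], degree)
    preds.append(np.polyval(coefs, x))
preds = np.array(preds)

mean = preds.mean(axis=0)   # predictive mean
std = preds.std(axis=0)     # member disagreement as an uncertainty proxy
```

Averaging over well-separated members is the cheap approximation to integrating over posterior modes that the paper's deep-ensemble result formalizes.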

MCML Authors
Emanuel Sommer · Data Science Group · A1 | Statistical Foundations & Explainability
Lisa Wimmer · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Ludwig Bothmann, Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[59]
D. Rundel, J. Kobialka, C. von Crailsheim, M. Feurer, T. Nagler and D. Rügamer.
Interpretable Machine Learning for TabPFN.
2nd World Conference on Explainable Artificial Intelligence (xAI 2024). Valletta, Malta, Jul 17-19, 2024. DOI. GitHub.
Abstract

The recently developed Prior-Data Fitted Networks (PFNs) have shown very promising results for applications in low-data regimes. The TabPFN model, a special case of PFNs for tabular data, is able to achieve state-of-the-art performance on a variety of classification tasks while producing posterior predictive distributions in mere seconds by in-context learning without the need for learning parameters or hyperparameter tuning. This makes TabPFN a very attractive option for a wide range of domain applications. However, a major drawback of the method is its lack of interpretability. Therefore, we propose several adaptations of popular interpretability methods that we specifically design for TabPFN. By taking advantage of the unique properties of the model, our adaptations allow for more efficient computations than existing implementations. In particular, we show how in-context learning facilitates the estimation of Shapley values by avoiding approximate retraining and enables the use of Leave-One-Covariate-Out (LOCO) even when working with large-scale Transformers. In addition, we demonstrate how data valuation methods can be used to address scalability challenges of TabPFN.
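LOCO itself is simple to state: refit without one covariate and measure the loss increase. A hedged numpy sketch with an ordinary linear model standing in for TabPFN (for TabPFN, "refitting" is just another in-context forward pass, which is what makes LOCO affordable there):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=n)   # only feature 0 is informative

def refit_mse(cols):
    # Fit a least-squares model on a subset of columns and return its MSE.
    beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
    return np.mean((y - X[:, cols] @ beta) ** 2)

mse_full = refit_mse([0, 1, 2])
# LOCO importance of feature k: loss increase when k is left out.
loco = [refit_mse([j for j in range(3) if j != k]) - mse_full for k in range(3)]
print(np.argmax(loco))  # 0: the informative feature has the largest LOCO score
```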

MCML Authors
David Rundel · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Julius Kobialka · Data Science Group · A1 | Statistical Foundations & Explainability
Matthias Feurer, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Thomas Nagler, Prof. Dr. · Computational Statistics & Data Science · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[58]
L. Kook, P. Schiele, C. Kolb, D. Dold, M. Arpogaus, C. Fritz, P. Baumann, P. Kopper, T. Pielok, E. Dorigatti and D. Rügamer.
How Inverse Conditional Flows Can Serve as a Substitute for Distributional Regression.
40th Conference on Uncertainty in Artificial Intelligence (UAI 2024). Barcelona, Spain, Jul 16-18, 2024. URL.
Abstract

Neural network representations of simple models, such as linear regression, are being studied increasingly to better understand the underlying principles of deep learning algorithms. However, neural representations of distributional regression models, such as the Cox model, have received little attention so far. We close this gap by proposing a framework for distributional regression using inverse flow transformations (DRIFT), which includes neural representations of the aforementioned models. We empirically demonstrate that the neural representations of models in DRIFT can serve as a substitute for their classical statistical counterparts in several applications involving continuous, ordered, time-series, and survival outcomes. We confirm that models in DRIFT empirically match the performance of several statistical methods in terms of estimation of partial effects, prediction, and aleatoric uncertainty quantification. DRIFT covers both interpretable statistical models and flexible neural networks, opening up new avenues in both statistical modeling and deep learning.
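The simplest member of such a framework is the Gaussian linear model, rewritten in flow form: a transformation maps the response to standard-normal base noise, and the conditional CDF is the base CDF of that transformation. A hedged sketch of this idea only, not the DRIFT implementation:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
x = rng.normal(size=n)
y = 1.5 * x + 0.5 * rng.normal(size=n)

# Inverse flow for the Gaussian linear model: h(y | x) = (y - beta * x) / sigma
# maps y to standard-normal base noise, so F(y | x) = Phi(h(y | x)).
beta = x @ y / (x @ x)      # least-squares slope (no intercept, for brevity)
resid = y - beta * x
sigma = resid.std()
z = resid / sigma           # approximately N(0, 1) base noise
```

Richer transformations (monotone in y, with neural-network shift and scale terms) give the distributional models the abstract describes, while this degenerate case reduces to ordinary regression.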

MCML Authors
Chris Kolb · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Cornelius Fritz, Dr. (former member) · A1 | Statistical Foundations & Explainability
Tobias Pielok · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Emilio Dorigatti · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[57]
H. Chen, J. Büssing, D. Rügamer and E. Nie.
Leveraging (Sentence) Transformer Models with Contrastive Learning for Identifying Machine-Generated Text.
18th International Workshop on Semantic Evaluation (SemEval 2024) co-located with the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024). Mexico City, Mexico, Jun 20-21, 2024. URL.
MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability
Ercong Nie · Statistical NLP and Deep Learning · B2 | Natural Language Processing


[56]
D. Dold, D. Rügamer, B. Sick and O. Dürr.
Bayesian Semi-structured Subspace Inference.
27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024). Valencia, Spain, May 02-04, 2024. URL.
MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[55]
D. Rügamer.
Scalable Higher-Order Tensor Product Spline Models.
27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024). Valencia, Spain, May 02-04, 2024. URL.
MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[54]
A. F. Thielmann, A. Reuter, T. Kneib, D. Rügamer and B. Säfken.
Interpretable Additive Tabular Transformer Networks.
Transactions on Machine Learning Research (May 2024). URL.
MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[53]
T. Weber, J. Dexl, D. Rügamer and M. Ingrisch.
Post-Training Network Compression for 3D Medical Image Segmentation: Reducing Computational Efforts via Tucker Decomposition.
Preprint at arXiv (Apr. 2024).
MCML Authors
Tobias Weber · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Jakob Dexl · Clinical Data Science in Radiology · C1 | Medicine
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability
Michael Ingrisch, Prof. Dr. · Clinical Data Science in Radiology · C1 | Medicine


[52]
B. X. Liew, F. Pfisterer, D. Rügamer and X. Zhai.
Strategies to optimise machine learning classification performance when using biomechanical features.
Journal of Biomechanics 165 (Mar. 2024). DOI.
MCML Authors
Florian Pfisterer, Dr. (former member) · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[51]
P. Kopper, D. Rügamer, R. Sonabend, B. Bischl and A. Bender.
Training Survival Models using Scoring Rules.
Preprint at arXiv (Mar. 2024).
MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Andreas Bender, Dr. · Statistical Learning & Data Science · Coordinator Statistical and Machine Learning Consulting · A1 | Statistical Foundations & Explainability


[50]
B. X. W. Liew, D. Rügamer and A. V. Birn-Jeffery.
Neuromechanical stabilisation of the centre of mass during running.
Gait and Posture 108 (Feb. 2024). DOI.
MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[49]
D. Schalk, B. Bischl and D. Rügamer.
Privacy-Preserving and Lossless Distributed Estimation of High-Dimensional Generalized Additive Mixed Models.
Statistics and Computing 34.31 (Feb. 2024). DOI.
Abstract

Various privacy-preserving frameworks that respect the individual’s privacy in the analysis of data have been developed in recent years. However, available model classes such as simple statistics or generalized linear models lack the flexibility required for a good approximation of the underlying data-generating process in practice. In this paper, we propose an algorithm for a distributed, privacy-preserving, and lossless estimation of generalized additive mixed models (GAMM) using component-wise gradient boosting (CWB). Making use of CWB allows us to reframe the GAMM estimation as a distributed fitting of base learners using the $L_2$-loss. In order to account for the heterogeneity of different data location sites, we propose a distributed version of a row-wise tensor product that allows the computation of site-specific (smooth) effects. Our adaptation of CWB preserves all the important properties of the original algorithm, such as an unbiased feature selection and the feasibility to fit models in high-dimensional feature spaces, and yields model estimates equivalent to those of CWB on pooled data. In addition to a derivation of the equivalence of both algorithms, we also showcase the efficacy of our algorithm on a distributed heart disease data set and compare it with state-of-the-art methods.
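The core CWB loop is compact enough to sketch. A hedged numpy version with univariate linear base learners (the paper uses smooth base learners and a distributed row-wise tensor product on top of this loop, which is omitted here): each iteration fits every base learner to the current residuals and updates only the best one, damped by a learning rate.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
X = rng.normal(size=(n, 5))
y = 2.0 * X[:, 0] - 1.0 * X[:, 3] + 0.3 * rng.normal(size=n)

# Component-wise boosting with L2 loss.
nu, n_iter = 0.1, 200          # learning rate, number of boosting iterations
coef = np.zeros(X.shape[1])
pred = np.zeros(n)
for _ in range(n_iter):
    resid = y - pred
    slopes = (X.T @ resid) / (X ** 2).sum(axis=0)    # per-feature LS slopes
    sse = [((resid - s * X[:, j]) ** 2).sum() for j, s in enumerate(slopes)]
    j = int(np.argmin(sse))                          # best base learner
    coef[j] += nu * slopes[j]
    pred += nu * slopes[j] * X[:, j]
```

The greedy selection step is what yields the unbiased, sparse feature selection the abstract refers to; stopping early regularizes the fit.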

MCML Authors
Daniel Schalk, Dr. (former member) · A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[48]
T. Weber, M. Ingrisch, B. Bischl and D. Rügamer.
Constrained Probabilistic Mask Learning for Task-specific Undersampled MRI Reconstruction.
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024). Waikoloa, Hawaii, Jan 04-08, 2024. DOI.
MCML Authors
Tobias Weber · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Michael Ingrisch, Prof. Dr. · Clinical Data Science in Radiology · C1 | Medicine
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[47]
J. Gertheiss, D. Rügamer, B. Liew and S. Greven.
Functional Data Analysis: An Introduction and Recent Developments.
Biometrical Journal (2024). To be published. Preprint at arXiv. GitHub.
MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[46]
L. Kook, P. F. M. Baumann, O. Dürr, B. Sick and D. Rügamer.
Estimating Conditional Distributions with Neural Networks using R package deeptrafo.
Journal of Statistical Software (2024). To be published. Preprint at arXiv.
MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[45]
Z. Zhang, H. Yang, B. Ma, D. Rügamer and E. Nie.
Baby's CoThought: Leveraging Large Language Models for Enhanced Reasoning in Compact Models.
BabyLM Challenge at 27th Conference on Computational Natural Language Learning (CoNLL 2023). Singapore, Dec 06-10, 2023. DOI. GitHub.
MCML Authors
Bolei Ma · Social Data Science and AI Lab · C4 | Computational Social Sciences
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability
Ercong Nie · Statistical NLP and Deep Learning · B2 | Natural Language Processing


[44]
A. T. Stüber, S. Coors, B. Schachtner, T. Weber, D. Rügamer, A. Bender, A. Mittermeier, O. Öcal, M. Seidensticker, J. Ricke, B. Bischl and M. Ingrisch.
A comprehensive machine learning benchmark study for radiomics-based survival analysis of CT imaging data in patients with hepatic metastases of CRC.
Investigative Radiology 58.12 (Dec. 2023). DOI.
MCML Authors
Theresa Stüber · Clinical Data Science in Radiology · C1 | Medicine
Stefan Coors (former member) · A1 | Statistical Foundations & Explainability
Balthasar Schachtner, Dr. · Clinical Data Science in Radiology · C1 | Medicine
Tobias Weber · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability
Andreas Bender, Dr. · Statistical Learning & Data Science · Coordinator Statistical and Machine Learning Consulting · A1 | Statistical Foundations & Explainability
Andreas Mittermeier, Dr. · Clinical Data Science in Radiology · C1 | Medicine
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Michael Ingrisch, Prof. Dr. · Clinical Data Science in Radiology · C1 | Medicine


[43]
D. Rügamer, F. Pfisterer, B. Bischl and B. Grün.
Mixture of Experts Distributional Regression: Implementation Using Robust Estimation with Adaptive First-order Methods.
Advances in Statistical Analysis (Nov. 2023). DOI.
MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability
Florian Pfisterer, Dr. (former member) · A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability


[42]
T. Weber, M. Ingrisch, B. Bischl and D. Rügamer.
Unreading Race: Purging Protected Features from Chest X-ray Embeddings.
Under review. Preprint at arXiv (Nov. 2023).
MCML Authors
Tobias Weber · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Michael Ingrisch, Prof. Dr. · Clinical Data Science in Radiology · C1 | Medicine
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[41]
L. Bothmann, S. Strickroth, G. Casalicchio, D. Rügamer, M. Lindauer, F. Scheipl and B. Bischl.
Developing Open Source Educational Resources for Machine Learning and Data Science.
3rd Teaching Machine Learning and Artificial Intelligence Workshop. Grenoble, France, Sep 19-23, 2023. URL.
MCML Authors
Ludwig Bothmann, Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Giuseppe Casalicchio, Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability
Fabian Scheipl, PD Dr. · Functional Data Analysis · A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability


[40]
J. G. Wiese, L. Wimmer, T. Papamarkou, B. Bischl, S. Günnemann and D. Rügamer.
Towards Efficient MCMC Sampling in Bayesian Neural Networks by Exploiting Symmetry.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2023). Turin, Italy, Sep 18-22, 2023. Best Paper Award. DOI.
MCML Authors
Lisa Wimmer · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Stephan Günnemann, Prof. Dr. · Data Analytics & Machine Learning · A3 | Computational Models
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[39]
B. X. W. Liew, F. M. Kovacs, D. Rügamer and A. Royuela.
Automatic variable selection algorithms in prognostic factor research in neck pain.
Journal of Clinical Medicine (Sep. 2023). DOI.
MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[38]
F. Ott, D. Rügamer, L. Heublein, B. Bischl and C. Mutschler.
Auxiliary Cross-Modal Representation Learning With Triplet Loss Functions for Online Handwriting Recognition.
IEEE Access 11 (Aug. 2023). DOI.
MCML Authors
Felix Ott · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability


[37]
D. Rügamer.
A New PHO-rmula for Improved Performance of Semi-Structured Networks.
40th International Conference on Machine Learning (ICML 2023). Honolulu, Hawaii, Jul 23-29, 2023. URL.
MCML Authors
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[36]
C. Kolb, B. Bischl, C. L. Müller and D. Rügamer.
Sparse Modality Regression.
37th International Workshop on Statistical Modelling (IWSM 2023). Dortmund, Germany, Jul 17-21, 2023. Best Paper Award. PDF.
MCML Authors
Chris Kolb · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr. · Statistical Learning & Data Science · A1 | Statistical Foundations & Explainability
Christian Müller, Prof. Dr. · Biomedical Statistics and Data Science · C2 | Biology
David Rügamer, Prof. Dr. · Data Science Group · A1 | Statistical Foundations & Explainability


[35]
B. X. W. Liew, D. Rügamer, Q. Mei, Z. Altai, X. Zhu, X. Zhai and N. Cortes.
Smooth and accurate predictions of joint contact force timeseries in gait using overparameterised deep neural networks.
Frontiers in Bioengineering and Biotechnology 11 (Jul. 2023). DOI.
MCML Authors
Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Xiaoxiang Zhu

Xiaoxiang Zhu

Prof. Dr.

Data Science in Earth Observation

C3 | Physics and Geo Sciences


[34]
C. Kolb, C. L. Müller, B. Bischl and D. Rügamer.
Smoothing the Edges: A General Framework for Smooth Optimization in Sparse Regularization using Hadamard Overparametrization.
Under Review (Jul. 2023). arXiv.
MCML Authors
Link to Chris Kolb

Chris Kolb

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Christian Müller

Christian Müller

Prof. Dr.

Biomedical Statistics and Data Science

C2 | Biology

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[33]
T. Weber, M. Ingrisch, B. Bischl and D. Rügamer.
Cascaded Latent Diffusion Models for High-Resolution Chest X-ray Synthesis.
27th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2023). Osaka, Japan, May 25-28, 2023. DOI.
Abstract

While recent advances in large-scale foundational models show promising results, their application to the medical domain has not yet been explored in detail. In this paper, we progress into the realm of large-scale modeling in medical synthesis by proposing Cheff - a foundational cascaded latent diffusion model, which generates highly realistic chest radiographs providing state-of-the-art quality on a 1-megapixel scale. We further propose MaCheX, a unified interface for public chest datasets that forms the largest open collection of chest X-rays to date. With Cheff conditioned on radiological reports, we further guide the synthesis process via text prompts and open up the research area of report-to-chest-X-ray generation.

MCML Authors
Link to Tobias Weber

Tobias Weber

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Michael Ingrisch

Michael Ingrisch

Prof. Dr.

Clinical Data Science in Radiology

C1 | Medicine

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[32]
T. Pielok, B. Bischl and D. Rügamer.
Approximate Bayesian Inference with Stein Functional Variational Gradient Descent.
11th International Conference on Learning Representations (ICLR 2023). Kigali, Rwanda, May 01-05, 2023. URL.
MCML Authors
Link to Tobias Pielok

Tobias Pielok

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[31]
K. Rath, D. Rügamer, B. Bischl, U. von Toussaint and C. Albert.
Dependent state space Student-t processes for imputation and data augmentation in plasma diagnostics.
Contributions to Plasma Physics 63.5-6 (May 2023). DOI.
MCML Authors
Link to Katharina Rath

Katharina Rath

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[30]
E. Dorigatti, B. Schubert, B. Bischl and D. Rügamer.
Frequentist Uncertainty Quantification in Semi-Structured Neural Networks.
26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023). Valencia, Spain, Apr 25-27, 2023. URL.
MCML Authors
Link to Emilio Dorigatti

Emilio Dorigatti

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[29]
D. Schalk, B. Bischl and D. Rügamer.
Accelerated Componentwise Gradient Boosting Using Efficient Data Representation and Momentum-Based Optimization.
Journal of Computational and Graphical Statistics 32.2 (Apr. 2023). DOI.
MCML Authors
Link to Daniel Schalk

Daniel Schalk

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[28]
D. Rügamer, C. Kolb and N. Klein.
Semi-Structured Distributional Regression.
American Statistician (Feb. 2023). DOI.
Abstract

Combining additive models and neural networks broadens the scope of statistical regression and extends deep learning-based approaches with interpretable structured additive predictors. Existing approaches uniting the two, however, are limited to very specific combinations and, more importantly, involve an identifiability issue. As a consequence, interpretability and stable estimation are typically lost. We propose a general framework to combine structured regression models and deep neural networks into a unifying network architecture. To overcome the inherent identifiability issue between different model parts, we construct an orthogonalization cell that projects the deep neural network into the orthogonal complement of the statistical model predictor. This enables proper estimation of the structured model parts and thereby interpretability. We demonstrate the framework's efficacy in numerical experiments and illustrate its special merits in benchmarks and real-world applications.
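The orthogonalization described in this abstract can be sketched numerically. The following is a minimal illustration, assuming a linear structured predictor with design matrix X and latent deep features U; all variable names are illustrative and not taken from the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 100, 3, 5

X = rng.normal(size=(n, p))   # structured (additive) design matrix
U = rng.normal(size=(n, q))   # latent features from a deep network head

# Project U onto the orthogonal complement of col(X):
# U_tilde = (I - X (X'X)^{-1} X') U
P = X @ np.linalg.solve(X.T @ X, X.T)
U_tilde = (np.eye(n) - P) @ U

# The deep part now carries no information that the structured
# predictor could also explain: X' @ U_tilde is (numerically) zero.
assert np.allclose(X.T @ U_tilde, 0.0, atol=1e-8)
```

After this projection, a linear effect estimated on X alongside a deep effect built from U_tilde cannot trade off against each other, which is the identifiability property the abstract refers to.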

MCML Authors
Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Chris Kolb

Chris Kolb

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[27]
D. Rügamer, P. Baumann, T. Kneib and T. Hothorn.
Probabilistic Time Series Forecasts with Autoregressive Transformation Models.
Statistics and Computing 33.2 (Feb. 2023). URL.
MCML Authors
Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[26]
C. Fritz, G. De Nicola, F. Günther, D. Rügamer, M. Rave, M. Schneble, A. Bender, M. Weigert, R. Brinks, A. Hoyer, U. Berger, H. Küchenhoff and G. Kauermann.
Challenges in Interpreting Epidemiological Surveillance Data – Experiences from Germany.
Journal of Computational and Graphical Statistics 32.3 (Dec. 2022). DOI.
MCML Authors
Link to Cornelius Fritz

Cornelius Fritz

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Andreas Bender

Andreas Bender

Dr.

Statistical Learning & Data Science

Coordinator Statistical and Machine Learning Consulting

A1 | Statistical Foundations & Explainability

Maximilian Weigert

Maximilian Weigert

* Former member

C4 | Computational Social Sciences

Link to Helmut Küchenhoff

Helmut Küchenhoff

Prof. Dr.

Statistical Consulting Unit (StaBLab)

C4 | Computational Social Sciences

Link to Göran Kauermann

Göran Kauermann

Prof. Dr.

Applied Statistics in Social Sciences, Economics and Business

A1 | Statistical Foundations & Explainability


[25]
M. Rezaei, E. Dorigatti, D. Rügamer and B. Bischl.
Learning Statistical Representation with Joint Deep Embedded Clustering.
IEEE International Conference on Data Mining Workshops (ICDMW 2022). Orlando, FL, USA, Nov 30-Dec 02, 2022. DOI.
MCML Authors
Link to Mina Rezaei

Mina Rezaei

Dr.

Statistical Learning & Data Science

Education Coordination

A1 | Statistical Foundations & Explainability

Link to Emilio Dorigatti

Emilio Dorigatti

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[24]
I. Ziegler, B. Ma, E. Nie, B. Bischl, D. Rügamer, B. Schubert and E. Dorigatti.
What cleaves? Is proteasomal cleavage prediction reaching a ceiling?
Workshop on Learning Meaningful Representations of Life (LMRL 2022) at the 36th Conference on Neural Information Processing Systems (NeurIPS 2022). New Orleans, LA, USA, Nov 28-Dec 09, 2022. URL.
MCML Authors
Link to Bolei Ma

Bolei Ma

Social Data Science and AI Lab

C4 | Computational Social Sciences

Link to Ercong Nie

Ercong Nie

Statistical NLP and Deep Learning

B2 | Natural Language Processing

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Emilio Dorigatti

Emilio Dorigatti

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[23]
F. Ott, D. Rügamer, L. Heublein, B. Bischl and C. Mutschler.
Domain Adaptation for Time-Series Classification to Mitigate Covariate Shift.
30th ACM International Conference on Multimedia (MM 2022). Lisbon, Portugal, Oct 10-14, 2022. DOI.
MCML Authors
Link to Felix Ott

Felix Ott

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[22]
K. Rath, D. Rügamer, B. Bischl, U. von Toussaint, C. Rea, A. Maris, R. Granetz and C. Albert.
Data augmentation for disruption prediction via robust surrogate models.
Journal of Plasma Physics 88.5 (Oct. 2022). DOI.
Abstract

The goal of this work is to generate large, statistically representative data sets for training machine learning models for disruption prediction, given data from only a few existing discharges. Such a comprehensive training database is important to achieve satisfying and reliable prediction results in artificial neural network classifiers. Here, we aim for a robust augmentation of the training database for multivariate time series data using Student $t$ process regression. We apply Student $t$ process regression in a state space formulation via Bayesian filtering to tackle challenges imposed by outliers and noise in the training data set and to reduce the computational complexity. Thus, the method can also be used if the time resolution is high. We use an uncorrelated model for each dimension and impose correlations afterwards via colouring transformations. We demonstrate the efficacy of our approach on plasma diagnostics data of three different disruption classes from the DIII-D tokamak. To evaluate whether the distribution of the generated data is similar to the training data, we additionally perform statistical analyses using methods from time series analysis, descriptive statistics and classic machine learning clustering algorithms.
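The colouring transformation mentioned in the abstract — sampling each dimension independently, then imposing cross-channel correlations — can be sketched as follows. This is a simplified stand-in (plain Student-t draws instead of the paper's Student-t process samples; the target correlation matrix C is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
T, d = 200, 3   # time steps, number of channels

# Step 1: draw each channel independently (stand-in for the
# per-dimension Student-t process samples used in the paper).
Z = rng.standard_t(df=5, size=(T, d))

# Step 2: impose cross-channel correlations via a colouring
# transformation Y = Z L', where C = L L' is the target correlation.
C = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.0]])
L = np.linalg.cholesky(C)
Y = Z @ L.T

# The empirical correlation of Y approaches C for large T.
print(np.round(np.corrcoef(Y, rowvar=False), 2))
```

The Cholesky factor plays the role of a "colouring" filter: white (uncorrelated) noise in, correlated noise with a prescribed covariance structure out.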

MCML Authors
Link to Katharina Rath

Katharina Rath

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[21]
D. Rügamer, A. Bender, S. Wiegrebe, D. Racek, B. Bischl, C. L. Müller and C. Stachl.
Factorized Structured Regression for Large-Scale Varying Coefficient Models.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2022). Grenoble, France, Sep 19-22, 2022. DOI.
MCML Authors
Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Andreas Bender

Andreas Bender

Dr.

Statistical Learning & Data Science

Coordinator Statistical and Machine Learning Consulting

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Christian Müller

Christian Müller

Prof. Dr.

Biomedical Statistics and Data Science

C2 | Biology


[20]
T. Weber, M. Ingrisch, B. Bischl and D. Rügamer.
Implicit Embeddings via GAN Inversion for High Resolution Chest Radiographs.
1st Workshop on Medical Applications with Disentanglements (MAD 2022) at the 25th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2022). Singapore, Sep 18-22, 2022. DOI.
MCML Authors
Link to Tobias Weber

Tobias Weber

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Michael Ingrisch

Michael Ingrisch

Prof. Dr.

Clinical Data Science in Radiology

C1 | Medicine

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[19]
F. Ott, D. Rügamer, L. Heublein, B. Bischl and C. Mutschler.
Representation Learning for Tablet and Paper Domain Adaptation in favor of Online Handwriting Recognition.
7th International Workshop on Multimodal Pattern Recognition of Social Signals in Human Computer Interaction (MPRSS 2022) at the 26th International Conference on Pattern Recognition (ICPR 2022). Montreal, Canada, Aug 21-25, 2022. arXiv.
MCML Authors
Link to Felix Ott

Felix Ott

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[18]
F. Ott, N. L. Raichur, D. Rügamer, T. Feigl, H. Neumann, B. Bischl and C. Mutschler.
Benchmarking Visual-Inertial Deep Multimodal Fusion for Relative Pose Regression and Odometry-aided Absolute Pose Regression.
Preprint at arXiv (Aug. 2022). arXiv.
MCML Authors
Link to Felix Ott

Felix Ott

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[17]
A. Klaß, S. M. Lorenz, M. W. Lauer-Schmaltz, D. Rügamer, B. Bischl, C. Mutschler and F. Ott.
Uncertainty-aware Evaluation of Time-Series Classification for Online Handwriting Recognition with Domain Shift.
Workshop on Spatio-Temporal Reasoning and Learning (STRL 2022) at the 31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence (IJCAI-ECAI 2022). Vienna, Austria, Jul 23-29, 2022. URL.
MCML Authors
Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Felix Ott

Felix Ott

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[16]
M. Mittermeier, M. Weigert, D. Rügamer, H. Küchenhoff and R. Ludwig.
A deep learning based classification of atmospheric circulation types over Europe: projection of future changes in a CMIP6 large ensemble.
Environmental Research Letters 17.8 (Jul. 2022). DOI.
MCML Authors
Maximilian Weigert

Maximilian Weigert

* Former member

C4 | Computational Social Sciences

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Helmut Küchenhoff

Helmut Küchenhoff

Prof. Dr.

Statistical Consulting Unit (StaBLab)

C4 | Computational Social Sciences


[15]
P. Kopper, S. Wiegrebe, B. Bischl, A. Bender and D. Rügamer.
DeepPAMM: Deep Piecewise Exponential Additive Mixed Models for Complex Hazard Structures in Survival Analysis.
26th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2022). Chengdu, China, May 16-19, 2022. DOI.
Abstract

Survival analysis (SA) is an active field of research concerned with time-to-event outcomes and is prevalent in many domains, particularly biomedical applications. Despite its importance, SA remains challenging due to small-scale data sets and complex outcome distributions concealed by truncation and censoring processes. The piecewise exponential additive mixed model (PAMM) is a model class addressing many of these challenges, yet PAMMs are not applicable in high-dimensional feature settings or in the case of unstructured or multimodal data. We unify existing approaches by proposing DeepPAMM, a versatile deep learning framework that is well-founded from a statistical point of view, yet flexible enough to model complex hazard structures. Through benchmark experiments and an extended case study, we illustrate that DeepPAMM is competitive with other machine learning approaches with respect to predictive performance while maintaining interpretability.
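The piecewise exponential machinery underlying PAMMs rests on a data transformation: each subject's follow-up is split at fixed cut points into interval-level records with a time-at-risk offset and an event indicator, after which hazards can be estimated by any Poisson-type learner. A minimal sketch with hypothetical data (times, events, and cut points invented for illustration):

```python
import numpy as np

# Hypothetical survival data: follow-up time and event indicator.
time  = np.array([1.3, 2.7, 0.8])
event = np.array([1, 0, 1])
cuts  = np.array([0.0, 1.0, 2.0, 3.0])   # interval borders

rows = []
for t, d in zip(time, event):
    for j in range(len(cuts) - 1):
        lo, hi = cuts[j], cuts[j + 1]
        if t <= lo:
            break                         # subject left risk set earlier
        exposure = min(t, hi) - lo        # time at risk within interval j
        status = int(d and t <= hi)       # did the event occur in interval j?
        rows.append((j, exposure, status))

# Each row (interval, offset=log(exposure), status) can be fed to a
# Poisson model -- statistical or deep -- to estimate piecewise hazards.
for r in rows:
    print(r)
```

This expansion is what lets a framework like DeepPAMM swap the Poisson regression part for a neural network while keeping the survival semantics intact.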

MCML Authors
Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Andreas Bender

Andreas Bender

Dr.

Statistical Learning & Data Science

Coordinator Statistical and Machine Learning Consulting

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[14]
D. Rügamer.
Additive Higher-Order Factorization Machines.
Preprint at arXiv (May 2022). arXiv.
MCML Authors
Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[13]
C. Fritz, E. Dorigatti and D. Rügamer.
Combining Graph Neural Networks and Spatio-temporal Disease Models to Predict COVID-19 Cases in Germany.
Scientific Reports 12.3930 (Mar. 2022). DOI.
Abstract

During 2020, the infection rate of COVID-19 was investigated by many scholars from different research fields. In this context, reliable and interpretable forecasts of disease incidents are a vital tool for policymakers to manage healthcare resources. Several experts have therefore stressed the necessity to account for human mobility to explain the spread of COVID-19. Existing approaches often apply standard models of the respective research field, frequently restricting modeling possibilities. For instance, most statistical or epidemiological models cannot directly incorporate unstructured data sources, including relational data that may encode human mobility. In contrast, machine learning approaches may yield better predictions by exploiting these data structures, yet lack intuitive interpretability as they are often categorized as black-box models. We propose a combination of both research directions and present a multimodal learning framework that amalgamates statistical regression and machine learning models for predicting local COVID-19 cases in Germany. Results and implications: the novel approach introduced enables the use of a richer collection of data types, including mobility flows and colocation probabilities, and yields the lowest mean squared error scores throughout the observational period in the reported benchmark study. The results corroborate that during most of the observational period more dispersed meeting patterns and a lower percentage of people staying put are associated with higher infection rates. Moreover, the analysis underpins the necessity of including mobility data and showcases the flexibility and interpretability of the proposed approach.

MCML Authors
Link to Cornelius Fritz

Cornelius Fritz

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Emilio Dorigatti

Emilio Dorigatti

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[12]
F. Ott, D. Rügamer, L. Heublein, B. Bischl and C. Mutschler.
Joint Classification and Trajectory Regression of Online Handwriting Using a Multi-Task Learning Approach.
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2022). Waikoloa, Hawaii, Jan 04-08, 2022. DOI.
MCML Authors
Link to Felix Ott

Felix Ott

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[11]
F. Ott, D. Rügamer, L. Heublein, T. Hamann, J. Barth, B. Bischl and C. Mutschler.
Benchmarking online sequence-to-sequence and character-based handwriting recognition from IMU-enhanced pens.
International Journal on Document Analysis and Recognition 25.4 (2022). DOI.
MCML Authors
Link to Felix Ott

Felix Ott

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[10]
T. Weber, M. Ingrisch, M. Fabritius, B. Bischl and D. Rügamer.
Survival-oriented embeddings for improving accessibility to complex data structures.
Workshop on Bridging the Gap: from Machine Learning Research to Clinical Practice at the 35th Conference on Neural Information Processing Systems (NeurIPS 2021). Virtual, Dec 06-14, 2021. arXiv.
MCML Authors
Link to Tobias Weber

Tobias Weber

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Michael Ingrisch

Michael Ingrisch

Prof. Dr.

Clinical Data Science in Radiology

C1 | Medicine

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[9]
T. Weber, M. Ingrisch, B. Bischl and D. Rügamer.
Towards modelling hazard factors in unstructured data spaces using gradient-based latent interpolation.
Workshop on Deep Generative Models and Downstream Applications at the 35th Conference on Neural Information Processing Systems (NeurIPS 2021). Virtual, Dec 06-14, 2021. PDF.
MCML Authors
Link to Tobias Weber

Tobias Weber

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Michael Ingrisch

Michael Ingrisch

Prof. Dr.

Clinical Data Science in Radiology

C1 | Medicine

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[8]
M. Mittermeier, M. Weigert and D. Rügamer.
Identifying the atmospheric drivers of drought and heat using a smoothed deep learning approach.
Workshop on Tackling Climate Change with Machine Learning at the 35th Conference on Neural Information Processing Systems (NeurIPS 2021). Virtual, Dec 06-14, 2021. PDF.
Abstract

Europe was hit by several disastrous heat and drought events in recent summers. Besides thermodynamic influences, such hot and dry extremes are driven by certain atmospheric situations, including anticyclonic conditions. Effects of climate change on atmospheric circulations are complex, and many open research questions remain in this context, e.g., on future trends of anticyclonic conditions. Based on the combination of a catalog of labeled circulation patterns and spatial atmospheric variables, we propose a smoothed convolutional neural network classifier for six types of anticyclonic circulations that are associated with drought and heat. Our work can help to identify important drivers of hot and dry extremes in climate simulations, which allows us to unveil the impact of climate change on these drivers. We address various challenges inherent to circulation pattern classification that are also present in other climate patterns, e.g., subjective labels and ambiguous transition periods.

MCML Authors
Maximilian Weigert

Maximilian Weigert

* Former member

C4 | Computational Social Sciences

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[7]
S. Coors, D. Schalk, B. Bischl and D. Rügamer.
Automatic Componentwise Boosting: An Interpretable AutoML System.
Automating Data Science Workshop (ADS 2021) at the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD 2021). Virtual, Sep 13-17, 2021. arXiv.
Abstract

In practice, machine learning (ML) workflows require various different steps, from data preprocessing, missing value imputation, and model selection to model tuning and model evaluation. Many of these steps rely on human ML experts. AutoML - the field of automating these ML pipelines - tries to help practitioners apply ML off-the-shelf without any expert knowledge. Most modern AutoML systems like auto-sklearn, H2O AutoML or TPOT aim for high predictive performance, thereby generating ensembles that consist almost exclusively of black-box models. This, in turn, makes interpretation for the layperson more intricate and adds another layer of opacity for users. We propose an AutoML system that constructs an interpretable additive model that can be fitted using a highly scalable componentwise boosting algorithm. Our system provides tools for easy model interpretation, such as visualizing partial effects and pairwise interactions, allows for a straightforward calculation of feature importance, and gives insights into the required model complexity to fit the given task. We introduce the general framework and outline its implementation, autocompboost. To demonstrate the framework's efficacy, we compare autocompboost to other existing systems based on the OpenML AutoML-Benchmark. Despite its restriction to an interpretable model space, our system is competitive in terms of predictive performance on most data sets while being more user-friendly and transparent.
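The componentwise boosting idea at the heart of this system can be sketched in a few lines: in every iteration, each univariate base learner is fit to the current residuals, and only the single best one is added with shrinkage. The example below uses simulated data and simple least-squares base learners; it is a conceptual sketch, not the autocompboost implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 300, 5
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.0 * X[:, 3] + rng.normal(scale=0.1, size=n)

coef = np.zeros(p)
pred = np.zeros(n)
nu = 0.1                         # learning rate / shrinkage

for _ in range(200):
    resid = y - pred
    # Fit each univariate (componentwise) base learner to the residuals
    # and keep the one with the largest error reduction.
    betas = X.T @ resid / (X ** 2).sum(axis=0)
    sse = ((resid[:, None] - X * betas) ** 2).sum(axis=0)
    j = int(np.argmin(sse))
    coef[j] += nu * betas[j]
    pred += nu * betas[j] * X[:, j]

print(np.round(coef, 2))   # mass concentrates on features 0 and 3
```

Because every update touches exactly one component, the final model is an additive sum of simple effects, which is what makes the resulting fit interpretable and the selected components directly readable as feature importances.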

MCML Authors
Link to Stefan Coors

Stefan Coors

* Former member

A1 | Statistical Foundations & Explainability

Link to Daniel Schalk

Daniel Schalk

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[6]
P. Kopper, S. Pölsterl, C. Wachinger, B. Bischl, A. Bender and D. Rügamer.
Semi-Structured Deep Piecewise Exponential Models.
AAAI Spring Symposium Series on Survival Prediction: Algorithms, Challenges and Applications (AAAI-SPACA 2021). Palo Alto, California, USA, Mar 21-24, 2021. PDF.
Abstract

We propose a versatile framework for survival analysis that combines advanced concepts from statistics with deep learning. The presented framework is based on piecewise exponential models and thereby supports various survival tasks, such as competing risks and multi-state modeling, and further allows for estimation of time-varying effects and time-varying features. To also include multiple data sources and higher-order interaction effects into the model, we embed the model class in a neural network and thereby enable the simultaneous estimation of both inherently interpretable structured regression inputs as well as deep neural network components which can potentially process additional unstructured data sources. A proof of concept is provided by using the framework to predict Alzheimer's disease progression based on tabular and 3D point cloud data and applying it to synthetic data.

MCML Authors
Link to Christian Wachinger

Christian Wachinger

Prof. Dr.

Artificial Intelligence in Radiology

C1 | Medicine

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Andreas Bender

Andreas Bender

Dr.

Statistical Learning & Data Science

Coordinator Statistical and Machine Learning Consulting

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[5]
J. Goschenhofer, R. Hvingelby, D. Rügamer, J. Thomas, M. Wagner and B. Bischl.
Deep Semi-Supervised Learning for Time Series Classification.
Preprint at arXiv (Feb. 2021). arXiv.
MCML Authors
Link to Jann Goschenhofer

Jann Goschenhofer

* Former member

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[4]
D. Rügamer, F. Pfisterer and P. Baumann.
deepregression: Fitting Semi-Structured Deep Distributional Regression in R.
2021. GitHub.
MCML Authors
Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability


[3]
P. F. M. Baumann, T. Hothorn and D. Rügamer.
Deep Conditional Transformation Models.
Preprint at arXiv (Oct. 2020). arXiv.
MCML Authors
Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[2]
D. Rügamer, F. Pfisterer and B. Bischl.
Neural Mixture Distributional Regression.
Preprint at arXiv (Oct. 2020). arXiv.
MCML Authors
Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[1]
A. Bender, D. Rügamer, F. Scheipl and B. Bischl.
A General Machine Learning Framework for Survival Analysis.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2020). Virtual, Sep 14-18, 2020. DOI.
MCML Authors
Link to Andreas Bender

Andreas Bender

Dr.

Statistical Learning & Data Science

Coordinator Statistical and Machine Learning Consulting

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Fabian Scheipl

Fabian Scheipl

PD Dr.

Functional Data Analysis

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability