
Research Group Bernd Bischl

Bernd Bischl, Prof. Dr.
Statistical Learning & Data Science
A1 | Statistical Foundations & Explainability

Bernd Bischl holds the Chair of Statistical Learning and Data Science at the Department of Statistics at LMU Munich.

He studied Computer Science, Artificial Intelligence and Data Sciences in Hamburg, Edinburgh and Dortmund, and obtained his PhD from TU Dortmund University in 2013 with a thesis on "Model and Algorithm Selection in Statistical Learning and Optimization". His research interests include AutoML, model selection, interpretable ML, and the development of statistical software. He is a member of ELLIS and a faculty member of the ELLIS unit Munich, an active developer of several R packages, leads the "mlr" (Machine Learning in R) engineering group, and is co-founder of the science platform "OpenML" for open and reproducible ML. Furthermore, he leads the Munich branch of the Fraunhofer ADA Lovelace Center for Analytics, Data & Applications, a new type of research infrastructure that supports businesses in Bavaria, especially in the SME sector.

Team members @MCML

All members below belong to the Statistical Learning & Data Science group in research area A1 | Statistical Foundations & Explainability:

Helen Alber
Matthias Aßenmacher, Dr.
Salem Ayadi
Marc Becker
Andreas Bender, Dr. (Coordinator Statistical and Machine Learning Consulting)
Martin Binder (Coordinator for Open Source & Open Data)
Ludwig Bothmann, Dr.
Philip Amir Boustani
Lukas Burk
Giuseppe Casalicchio, Dr.
Susanne Dandl, Dr.
Emilio Dorigatti
Fiona Ewald
Sebastian Fischer
Esteban Garces Arias
Hüseyin Anil Gündüz
Florian Karl
Chris Kolb
Yawei Li
Julia Niebisch
Felix Ott
Tobias Pielok
Katharina Rath
Mina Rezaei, Dr. (Education Coordination)
David Rundel
Lennart Schneider
Christian Scholbeck
Tobias Weber
Lisa Wimmer

Publications @MCML

[146]
T. Nagler, L. Schneider, B. Bischl and M. Feurer.
Reshuffling Resampling Splits Can Improve Generalization of Hyperparameter Optimization.
38th Conference on Neural Information Processing Systems (NeurIPS 2024). Vancouver, Canada, Dec 10-15, 2024. To be published. Preprint at arXiv. GitHub.
Abstract

Hyperparameter optimization is crucial for obtaining peak performance of machine learning models. The standard protocol evaluates various hyperparameter configurations using a resampling estimate of the generalization error to guide optimization and select a final hyperparameter configuration. Without much evidence, paired resampling splits, i.e., either a fixed train-validation split or a fixed cross-validation scheme, are often recommended. We show that, surprisingly, reshuffling the splits for every configuration often improves the final model's generalization performance on unseen data. Our theoretical analysis explains how reshuffling affects the asymptotic behavior of the validation loss surface and provides a bound on the expected regret in the limiting regime. This bound connects the potential benefits of reshuffling to the signal and noise characteristics of the underlying optimization problem. We confirm our theoretical results in a controlled simulation study and demonstrate the practical usefulness of reshuffling in a large-scale, realistic hyperparameter optimization experiment. While reshuffling leads to test performances that are competitive with using fixed splits, it drastically improves results for a single train-validation holdout protocol and can often make holdout become competitive with standard CV while being computationally cheaper.
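
As a rough illustration of the protocol difference (a sketch under our own assumptions, not the authors' experimental code), the snippet below runs a random search over tree depths twice: once scoring every configuration on one fixed holdout split, and once drawing a fresh split per configuration. Names such as `holdout_score` are ours.

```python
# Minimal sketch: fixed vs. reshuffled holdout splits during a random search.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, random_state=0)
X_trval, X_test, y_trval, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
configs = [{"max_depth": int(d)} for d in rng.integers(1, 15, size=30)]

def holdout_score(cfg, seed):
    """Holdout estimate of a configuration's validation performance."""
    X_tr, X_val, y_tr, y_val = train_test_split(
        X_trval, y_trval, test_size=0.33, random_state=seed)
    return DecisionTreeClassifier(**cfg, random_state=0).fit(X_tr, y_tr).score(X_val, y_val)

# Fixed protocol: every configuration is scored on the same split (seed 0).
best_fixed = max(configs, key=lambda c: holdout_score(c, seed=0))
# Reshuffled protocol: each configuration gets a freshly drawn split.
best_reshuffled = max(configs, key=lambda c: holdout_score(c, seed=int(rng.integers(10**6))))

for name, cfg in [("fixed", best_fixed), ("reshuffled", best_reshuffled)]:
    model = DecisionTreeClassifier(**cfg, random_state=0).fit(X_trval, y_trval)
    print(name, cfg, round(model.score(X_test, y_test), 3))
```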

MCML Authors
Thomas Nagler, Prof. Dr., Computational Statistics & Data Science, A1 | Statistical Foundations & Explainability
Lennart Schneider, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Matthias Feurer, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[145]
Y. Zhang, Y. Li, X. Wang, Q. Shen, B. Plank, B. Bischl, M. Rezaei and K. Kawaguchi.
FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models.
Workshop on Machine Learning and Compression at the 38th Conference on Neural Information Processing Systems (NeurIPS 2024). Vancouver, Canada, Dec 10-15, 2024. To be published. Preprint at arXiv.
Abstract

Overparametrized transformer networks are the state-of-the-art architecture for Large Language Models (LLMs). However, such models contain billions of parameters, making large compute a necessity, while raising environmental concerns. To address these issues, we propose FinerCut, a new form of fine-grained layer pruning, which, in contrast to prior work at the transformer block level, considers all self-attention and feed-forward network (FFN) layers within blocks as individual pruning candidates. FinerCut prunes layers whose removal causes minimal alteration to the model's output -- contributing to a new, lean, interpretable, and task-agnostic pruning method. Tested across 9 benchmarks, our approach retains 90% performance of Llama3-8B with 25% layers removed, and 95% performance of Llama3-70B with 30% layers removed, all without fine-tuning or post-pruning reconstruction. Strikingly, we observe intriguing results with FinerCut: 42% (34 out of 80) of the self-attention layers in Llama3-70B can be removed while preserving 99% of its performance -- without additional fine-tuning after removal. Moreover, FinerCut provides a tool to inspect the types and locations of pruned layers, making it possible to observe interesting pruning behaviors. For instance, we observe a preference for pruning self-attention layers, often at deeper consecutive decoder layers. We hope our insights inspire future efficient LLM architecture designs.
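
The core scoring step can be mimicked in a few lines: rank residual sub-layers by how little the network output changes when each one is skipped. The toy residual MLP below is our stand-in for transformer sub-layers, not the FinerCut implementation.

```python
# Toy output-preservation scoring for fine-grained layer pruning:
# lower score = removing that sub-layer barely changes the output.
import torch
import torch.nn as nn

torch.manual_seed(0)
blocks = nn.ModuleList(
    [nn.Sequential(nn.Linear(16, 16), nn.ReLU()) for _ in range(6)])

def forward(x, skip=None):
    # Residual composition, as in transformer blocks; optionally skip one layer.
    for i, blk in enumerate(blocks):
        if i != skip:
            x = x + blk(x)
    return x

x = torch.randn(128, 16)
with torch.no_grad():
    ref = forward(x)
    scores = [(i, (forward(x, skip=i) - ref).norm(dim=1).mean().item())
              for i in range(len(blocks))]
# Best pruning candidates (smallest output change) first.
print(sorted(scores, key=lambda s: s[1]))
```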

MCML Authors
Yawei Li, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Xinpeng Wang, Artificial Intelligence and Computational Linguistics, B2 | Natural Language Processing
Barbara Plank, Prof. Dr., Artificial Intelligence and Computational Linguistics, B2 | Natural Language Processing
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Mina Rezaei, Dr., Statistical Learning & Data Science, Education Coordination, A1 | Statistical Foundations & Explainability


[144]
H. Baniecki, G. Casalicchio, B. Bischl and P. Biecek.
On the Robustness of Global Feature Effect Explanations.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2024). Vilnius, Lithuania, Sep 09-13, 2024. DOI.
Abstract

We study the robustness of global post-hoc explanations for predictive models trained on tabular data. Effects of predictor features in black-box supervised learning are an essential diagnostic tool for model debugging and scientific discovery in applied sciences. However, how vulnerable they are to data and model perturbations remains an open research question. We introduce several theoretical bounds for evaluating the robustness of partial dependence plots and accumulated local effects. Our experimental results with synthetic and real-world datasets quantify the gap between the best and worst-case scenarios of (mis)interpreting machine learning predictions globally.
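
One empirical counterpart of the studied question can be sketched directly: compute a partial dependence curve on a fixed grid, perturb the data slightly, and measure the largest shift of the curve. This is our toy setup, not the paper's theoretical bounds.

```python
# Empirical PDP robustness check under a small data perturbation.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=400, n_features=5, noise=5.0, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)

def pdp(model, X, feature, grid):
    """Partial dependence of `model` on `feature`, evaluated on `grid`."""
    curve = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v          # intervene on the feature of interest
        curve.append(model.predict(Xv).mean())
    return np.array(curve)

grid = np.linspace(X[:, 0].min(), X[:, 0].max(), 20)
rng = np.random.default_rng(0)
pd_clean = pdp(model, X, 0, grid)
pd_pert = pdp(model, X + rng.normal(scale=0.1, size=X.shape), 0, grid)
print("max PDP shift under perturbation:", np.abs(pd_clean - pd_pert).max())
```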

MCML Authors
Giuseppe Casalicchio, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[143]
F. Stermann, I. Chalkidis, A. Vahidi, B. Bischl and M. Rezaei.
Attention-Driven Dropout: A Simple Method to Improve Self-supervised Contrastive Sentence Embeddings.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2024). Vilnius, Lithuania, Sep 09-13, 2024. DOI.
Abstract

Self-contrastive learning has proven effective for vision and natural language tasks. It aims to learn aligned data representations by encoding similar and dissimilar sentence pairs without human annotation. Therefore, data augmentation plays a crucial role in the learned embedding quality. However, in natural language processing (NLP), creating augmented samples for unsupervised contrastive learning is challenging since random editing may modify the semantic meanings of sentences and thus affect learning good representations. In this paper, we introduce a simple yet effective approach dubbed ADD (Attention-Driven Dropout) to generate better-augmented views of sentences to be used in self-contrastive learning. Given a sentence and a Pre-trained Transformer Language Model (PLM), such as RoBERTa, we use the aggregated attention scores of the PLM to remove the less “informative” tokens from the input. We consider two alternative algorithms based on NAIVEAGGREGATION across layers/heads and ATTENTIONROLLOUT [1]. Our approach significantly improves the overall performance of various self-supervised contrastive-based methods, including SIMCSE [14], DIFFCSE [10], and INFOCSE [33] by facilitating the generation of high-quality positive pairs required by these methods. Through empirical evaluations on multiple Semantic Textual Similarity (STS) and Transfer Learning tasks, we observe enhanced performance across the board.
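
A toy version of the token-removal step: aggregate the attention each token receives (here naively, over heads and query positions, with random weights standing in for a PLM's attention) and drop the least-attended tokens to form the augmented view. Sentence, scores, and cutoff are our own illustration.

```python
# Attention-driven token removal on a toy sentence.
import numpy as np

rng = np.random.default_rng(0)
tokens = ["the", "movie", "was", "surprisingly", "good", "."]
n_heads, n = 4, len(tokens)

att = rng.random((n_heads, n, n))
att /= att.sum(axis=-1, keepdims=True)   # row-normalize each head's attention

# Naive aggregation: averaging over heads and query positions gives one
# "informativeness" score per (key) token.
score = att.mean(axis=(0, 1))
keep = np.sort(np.argsort(score)[2:])    # drop the 2 least-attended tokens
print([tokens[i] for i in keep])         # augmented view of the sentence
```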

MCML Authors
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Mina Rezaei, Dr., Statistical Learning & Data Science, Education Coordination, A1 | Statistical Foundations & Explainability


[142]
A. Vahidi, L. Wimmer, H. A. Gündüz, B. Bischl, E. Hüllermeier and M. Rezaei.
Diversified Ensemble of Independent Sub-Networks for Robust Self-Supervised Representation Learning.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2024). Vilnius, Lithuania, Sep 09-13, 2024. DOI.
Abstract

Ensembling a neural network is a widely recognized approach to enhance model performance, estimate uncertainty, and improve robustness in deep supervised learning. However, deep ensembles often come with high computational costs and memory demands. In addition, the efficiency of a deep ensemble is related to diversity among the ensemble members, which is challenging for large, over-parameterized deep neural networks. Moreover, ensemble learning has not yet seen such widespread adoption for unsupervised learning and it remains a challenging endeavor for self-supervised or unsupervised representation learning. Motivated by these challenges, we present a novel self-supervised training regime that leverages an ensemble of independent sub-networks, complemented by a new loss function designed to encourage diversity. Our method efficiently builds a sub-model ensemble with high diversity, leading to well-calibrated estimates of model uncertainty, all achieved with minimal computational overhead compared to traditional deep self-supervised ensembles. To evaluate the effectiveness of our approach, we conducted extensive experiments across various tasks, including in-distribution generalization, out-of-distribution detection, dataset corruption, and semi-supervised settings. The results demonstrate that our method significantly improves prediction reliability. Our approach not only achieves excellent accuracy but also enhances calibration, improving on important baseline performance across a wide range of self-supervised architectures in computer vision, natural language processing, and genomics data.
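
The key ingredient, a term that rewards diverse sub-network representations, can be sketched as a mean pairwise cosine-similarity penalty added to the usual objective. This is a hedged toy version, not the paper's loss function.

```python
# Diversity penalty across independent sub-networks (toy sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
subnets = nn.ModuleList(
    [nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 8)) for _ in range(4)])

x = torch.randn(32, 16)
z = [F.normalize(net(x), dim=1) for net in subnets]  # one embedding per sub-network

# Mean pairwise cosine similarity between sub-network embeddings; adding this
# term to the task loss pushes the members apart (lower = more diverse).
pairs = [(z[i] * z[j]).sum(dim=1).mean()
         for i in range(len(z)) for j in range(i + 1, len(z))]
diversity_penalty = torch.stack(pairs).mean()
print(float(diversity_penalty))
```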

MCML Authors
Lisa Wimmer, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Hüseyin Anil Gündüz, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Eyke Hüllermeier, Prof. Dr., Artificial Intelligence & Machine Learning, A3 | Computational Models
Mina Rezaei, Dr., Statistical Learning & Data Science, Education Coordination, A1 | Statistical Foundations & Explainability


[141]
M. Aßenmacher, A. Stephan, L. Weissweiler, E. Çano, I. Ziegler, M. Härttrich, B. Bischl, B. Roth, C. Heumann and H. Schütze.
Collaborative Development of Modular Open Source Educational Resources for Natural Language Processing.
6th Workshop on Teaching NLP (TeachingNLP 2024) at the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024). Bangkok, Thailand, Aug 11-16, 2024. URL.
Abstract

In this work, we present a collaboratively and continuously developed open-source educational resource (OSER) for teaching natural language processing at two different universities. We shed light on the principles we followed for the initial design of the course and the rationale for ongoing developments, followed by a reflection on the inter-university collaboration for designing and maintaining teaching material. When reflecting on the latter, we explicitly emphasize the considerations that need to be made when facing heterogeneous groups and when having to accommodate multiple examination regulations within one single course framework. Relying on the fundamental principles of OSER developments as defined by Bothmann et al. (2023) proved to be an important guideline during this process. The final part pertains to open-sourcing our teaching material, coping with the increasing speed of developments in the field, and integrating the course digitally, also addressing conflicting priorities and challenges we are currently facing.

MCML Authors
Matthias Aßenmacher, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Leonie Weissweiler, Dr. (former member), B2 | Natural Language Processing
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Hinrich Schütze, Prof. Dr., Statistical NLP and Deep Learning, B2 | Natural Language Processing


[140]
J. G. Wiese, L. Wimmer, T. Papamarkou, B. Bischl, S. Günnemann and D. Rügamer.
Towards Efficient Posterior Sampling in Deep Neural Networks via Symmetry Removal (Extended Abstract).
33rd International Joint Conference on Artificial Intelligence (IJCAI 2024). Jeju, Korea, Aug 03-09, 2024. DOI.
Abstract

Bayesian inference in deep neural networks is challenging due to the high-dimensional, strongly multi-modal parameter posterior density landscape. Markov chain Monte Carlo approaches asymptotically recover the true posterior but are considered prohibitively expensive for large modern architectures. Local methods, which have emerged as a popular alternative, focus on specific parameter regions that can be approximated by functions with tractable integrals. While these often yield satisfactory empirical results, they fail, by definition, to account for the multi-modality of the parameter posterior. In this work, we argue that the dilemma between exact-but-unaffordable and cheap-but-inexact approaches can be mitigated by exploiting symmetries in the posterior landscape. Such symmetries, induced by neuron interchangeability and certain activation functions, manifest in different parameter values leading to the same functional output value. We show theoretically that the posterior predictive density in Bayesian neural networks can be restricted to a symmetry-free parameter reference set. By further deriving an upper bound on the number of Monte Carlo chains required to capture the functional diversity, we propose a straightforward approach for feasible Bayesian inference. Our experiments suggest that efficient sampling is indeed possible, opening up a promising path to accurate uncertainty quantification in deep learning.
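
The symmetry the paper exploits is easy to verify numerically: permuting the hidden neurons of a one-layer tanh network (rows of the first weight matrix together with the matching biases and output weights) leaves the function unchanged, so distinct points in parameter space are functionally identical. A minimal check, not the authors' sampler:

```python
# Numerical check of neuron-permutation symmetry in a small MLP.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)
w2 = rng.normal(size=8)

def f(x, W1, b1, w2):
    return w2 @ np.tanh(W1 @ x + b1)

perm = rng.permutation(8)                # relabel the 8 hidden neurons
x = rng.normal(size=4)
print(np.allclose(f(x, W1, b1, w2),
                  f(x, W1[perm], b1[perm], w2[perm])))   # True
```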

MCML Authors
Lisa Wimmer, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Stephan Günnemann, Prof. Dr., Data Analytics & Machine Learning, A3 | Computational Models
David Rügamer, Prof. Dr., Data Science Group, A1 | Statistical Foundations & Explainability


[139]
D. Schalk, R. Rehms, V. S. Hoffmann, B. Bischl and U. Mansmann.
Distributed non-disclosive validation of predictive models by a modified ROC-GLM.
BMC Medical Research Methodology 24.190 (Aug. 2024). DOI.
Abstract

Distributed statistical analyses provide a promising approach for privacy protection when analyzing data distributed over several databases. Instead of directly operating on data, the analyst receives anonymous summary statistics, which are combined into an aggregated result. Further, in discrimination model (prognosis, diagnosis, etc.) development, it is key to evaluate a trained model w.r.t. its prognostic or predictive performance on new independent data. For binary classification, discrimination is quantified via the receiver operating characteristic (ROC) and its area under the curve (AUC) as an aggregation measure. We are interested in calculating both, as well as basic indicators of calibration-in-the-large, for a binary classification task using a distributed and privacy-preserving approach...
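
For readers unfamiliar with the ROC-GLM building block: the plain (non-distributed) version fits the binormal ROC curve ROC(t) = Φ(a + b Φ⁻¹(t)) via a probit GLM on threshold-exceedance indicators, after which the AUC follows as Φ(a/√(1+b²)). A minimal sketch on simulated scores; this is our illustration of the classical ROC-GLM, not the paper's modified, privacy-preserving variant.

```python
# Plain ROC-GLM: probit regression of exceedance indicators on the FPR grid.
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(0)
controls = rng.normal(0.0, 1.0, 200)            # scores of the negative class
cases = rng.normal(1.0, 1.0, 200)               # scores of the positive class

t = np.linspace(0.01, 0.99, 25)                 # false-positive-rate grid
thresholds = np.quantile(controls, 1 - t)       # control survival quantiles
U = (cases[:, None] >= thresholds[None, :]).astype(float).ravel()
x = np.tile(norm.ppf(t), len(cases))            # probit of the FPR grid

glm = sm.GLM(U, sm.add_constant(x),
             family=sm.families.Binomial(link=sm.families.links.Probit()))
a, b = glm.fit().params
print("AUC:", norm.cdf(a / np.sqrt(1 + b**2)))  # ~0.76 for these score distributions
```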

MCML Authors
Daniel Schalk, Dr. (former member), A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[138]
F. Drost, E. Dorigatti, A. Straub, P. Hilgendorf, K. I. Wagner, K. Heyer, M. López Montes, B. Bischl, D. H. Busch, K. Schober and B. Schubert.
Predicting T cell receptor functionality against mutant epitopes.
Cell Genomics 4.9 (Aug. 2024). DOI.
Abstract

Cancer cells and pathogens can evade T cell receptors (TCRs) via mutations in immunogenic epitopes. TCR cross-reactivity (i.e., recognition of multiple epitopes with sequence similarities) can counteract such escape but may cause severe side effects in cell-based immunotherapies through targeting self-antigens. To predict the effect of epitope point mutations on T cell functionality, we here present the random forest-based model Predicting T Cell Epitope-Specific Activation against Mutant Versions (P-TEAM). P-TEAM was trained and tested on three datasets with TCR responses to single-amino-acid mutations of the model epitope SIINFEKL, the tumor neo-epitope VPSVWRSSL, and the human cytomegalovirus antigen NLVPMVATV, totaling 9,690 unique TCR-epitope interactions. P-TEAM was able to accurately classify T cell reactivities and quantitatively predict T cell functionalities for unobserved single-point mutations and unseen TCRs. Overall, P-TEAM provides an effective computational tool to study T cell responses against mutated epitopes.

MCML Authors
Emilio Dorigatti, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[137]
F. Ott, L. Heublein, D. Rügamer, B. Bischl and C. Mutschler.
Fusing structure from motion and simulation-augmented pose regression from optical flow for challenging indoor environments.
Journal of Visual Communication and Image Representation 103 (Aug. 2024). DOI.
Abstract

The localization of objects is essential in many applications, such as robotics, virtual and augmented reality, and warehouse logistics. Recent advancements in deep learning have enabled localization using monocular cameras. Traditionally, structure from motion (SfM) techniques predict an object’s absolute position from a point cloud, while absolute pose regression (APR) methods use neural networks to understand the environment semantically. However, both approaches face challenges from environmental factors like motion blur, lighting changes, repetitive patterns, and featureless areas. This study addresses these challenges by incorporating additional information and refining absolute pose estimates with relative pose regression (RPR) methods. RPR also struggles with issues like motion blur. To overcome this, we compute the optical flow between consecutive images using the Lucas–Kanade algorithm and use a small recurrent convolutional network to predict relative poses. Combining absolute and relative poses is difficult due to differences between global and local coordinate systems. Current methods use pose graph optimization (PGO) to align these poses. In this work, we propose recurrent fusion networks to better integrate absolute and relative pose predictions, enhancing the accuracy of absolute pose estimates. We evaluate eight different recurrent units and create a simulation environment to pre-train the APR and RPR networks for improved generalization. Additionally, we record a large dataset of various scenarios in a challenging indoor environment resembling a warehouse with transportation robots. Through hyperparameter searches and experiments, we demonstrate that our recurrent fusion method outperforms PGO in effectiveness.
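
The pipeline's first step, optical flow between consecutive frames, is easy to demonstrate; the sketch below uses OpenCV's dense Farnebäck flow on two synthetic frames as a stand-in for the paper's Lucas–Kanade setup on real warehouse imagery.

```python
# Dense optical flow between two toy frames with a known 3-pixel shift.
import cv2
import numpy as np

frame1 = np.zeros((64, 64), np.uint8)
frame1[20:30, 20:30] = 255                       # a bright square
frame2 = np.roll(frame1, shift=3, axis=1)        # square moved 3 px to the right

flow = cv2.calcOpticalFlowFarneback(
    frame1, frame2, None, pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
print(flow[20:30, 20:30, 0].mean())              # horizontal flow, roughly 3
```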

MCML Authors
Felix Ott, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr., Data Science Group, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[136]
M. Herrmann, F. J. D. Lange, K. Eggensperger, G. Casalicchio, M. Wever, M. Feurer, D. Rügamer, E. Hüllermeier, A.-L. Boulesteix and B. Bischl.
Position: Why We Must Rethink Empirical Research in Machine Learning.
41st International Conference on Machine Learning (ICML 2024). Vienna, Austria, Jul 21-27, 2024. URL.
Abstract

We warn against a common but incomplete understanding of empirical research in machine learning (ML) that leads to non-replicable results, makes findings unreliable, and threatens to undermine progress in the field. To overcome this alarming situation, we call for more awareness of the plurality of ways of gaining knowledge experimentally, but also of some epistemic limitations. In particular, we argue that most current empirical ML research is fashioned as confirmatory research, while it should rather be considered exploratory.

MCML Authors
Moritz Herrmann, Dr., Biometry in Molecular Medicine, Coordinator for Reproducibility & Open Science, A1 | Statistical Foundations & Explainability
Giuseppe Casalicchio, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Marcel Wever, Dr. (former member), A3 | Computational Models
Matthias Feurer, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr., Data Science Group, A1 | Statistical Foundations & Explainability
Eyke Hüllermeier, Prof. Dr., Artificial Intelligence & Machine Learning, A3 | Computational Models
Anne-Laure Boulesteix, Prof. Dr., Biometry in Molecular Medicine, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[135]
M. Lindauer, F. Karl, A. Klier, J. Moosbauer, A. Tornede, A. C. Mueller, F. Hutter, M. Feurer and B. Bischl.
Position: A Call to Action for a Human-Centered AutoML Paradigm.
41st International Conference on Machine Learning (ICML 2024). Vienna, Austria, Jul 21-27, 2024. URL.
Abstract

Automated machine learning (AutoML) was formed around the fundamental objectives of automatically and efficiently configuring machine learning (ML) workflows, aiding the research of new ML algorithms, and contributing to the democratization of ML by making it accessible to a broader audience. Over the past decade, commendable achievements in AutoML have primarily focused on optimizing predictive performance. This focused progress, while substantial, raises questions about how well AutoML has met its broader, original goals. In this position paper, we argue that a key to unlocking AutoML’s full potential lies in addressing the currently underexplored aspect of user interaction with AutoML systems, including their diverse roles, expectations, and expertise. We envision a more human-centered approach in future AutoML research, promoting the collaborative design of ML systems that tightly integrates the complementary strengths of human expertise and AutoML methodologies.

MCML Authors
Florian Karl, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Julia Moosbauer, Dr. (former member), A1 | Statistical Foundations & Explainability
Matthias Feurer, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[134]
E. Sommer, L. Wimmer, T. Papamarkou, L. Bothmann, B. Bischl and D. Rügamer.
Connecting the Dots: Is Mode Connectedness the Key to Feasible Sample-Based Inference in Bayesian Neural Networks?
41st International Conference on Machine Learning (ICML 2024). Vienna, Austria, Jul 21-27, 2024. URL.
Abstract

A major challenge in sample-based inference (SBI) for Bayesian neural networks is the size and structure of the networks' parameter space. Our work shows that successful SBI is possible by embracing the characteristic relationship between weight and function space, uncovering a systematic link between overparameterization and the difficulty of the sampling problem. Through extensive experiments, we establish practical guidelines for sampling and convergence diagnosis. As a result, we present a Bayesian deep ensemble approach as an effective solution with competitive performance and uncertainty quantification.
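
The proposed endpoint, a Bayesian deep ensemble whose averaged predictive distribution quantifies uncertainty, can be caricatured with a few independently initialized networks. A minimal sketch, not the paper's sampling procedure:

```python
# Deep-ensemble predictive uncertainty from independently initialized nets.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=400, noise=0.25, random_state=0)
ensemble = [MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                          random_state=seed).fit(X, y) for seed in range(5)]

# Average the members' predictive distributions, then read off uncertainty
# as the entropy of the averaged distribution.
probs = np.mean([m.predict_proba(X[:5]) for m in ensemble], axis=0)
entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
print(np.round(probs, 3), np.round(entropy, 3))
```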

MCML Authors
Emanuel Sommer, Data Science Group, A1 | Statistical Foundations & Explainability
Lisa Wimmer, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Ludwig Bothmann, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr., Data Science Group, A1 | Statistical Foundations & Explainability


[133]
S. Dandl, K. Blesch, T. Freiesleben, G. König, J. Kapar, B. Bischl and M. Wright.
CountARFactuals -- Generating plausible model-agnostic counterfactual explanations with adversarial random forests.
2nd World Conference on Explainable Artificial Intelligence (xAI 2024). Valletta, Malta, Jul 17-19, 2024. DOI.
MCML Authors
Susanne Dandl, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Gunnar König, Dr. (former member), A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[132]
F. K. Ewald, L. Bothmann, M. N. Wright, B. Bischl, G. Casalicchio and G. König.
A Guide to Feature Importance Methods for Scientific Inference.
2nd World Conference on Explainable Artificial Intelligence (xAI 2024). Valletta, Malta, Jul 17-19, 2024. DOI.
MCML Authors
Fiona Ewald, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Ludwig Bothmann, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Giuseppe Casalicchio, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Gunnar König, Dr. (former member), A1 | Statistical Foundations & Explainability


[131]
S. Dandl, M. Becker, B. Bischl, G. Casalicchio and L. Bothmann.
mlr3summary: Concise and interpretable summaries for machine learning models.
Demo Track of the 2nd World Conference on Explainable Artificial Intelligence (xAI 2024). Valletta, Malta, Jul 17-19, 2024. arXiv.
Abstract

This work introduces a novel R package for concise, informative summaries of machine learning models. We take inspiration from the summary function for (generalized) linear models in R, but extend it in several directions: First, our summary function is model-agnostic and provides a unified summary output also for non-parametric machine learning models; Second, the summary output is more extensive and customizable -- it comprises information on the dataset, model performance, model complexity, the model's estimated feature importances, feature effects, and fairness metrics; Third, models are evaluated based on resampling strategies for unbiased estimates of model performances, feature importances, etc. Overall, the clear, structured output should help to enhance and expedite the model selection process, making it a helpful tool for practitioners and researchers alike.
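
mlr3summary itself is an R package; as a rough Python analogue of what such a summary assembles (resampled performance estimates plus model-agnostic feature importances), consider the hedged sketch below. It mirrors the described ingredients, not the mlr3summary API.

```python
# Ingredients of a model-agnostic model summary: resampled performance and
# permutation feature importance (Python stand-in, not the mlr3summary API).
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)
model = RandomForestRegressor(random_state=0)

cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2")   # unbiased via resampling
imp = permutation_importance(model.fit(X, y), X, y, n_repeats=5, random_state=0)

print(f"R^2 (5-fold CV): {cv_r2.mean():.3f} +/- {cv_r2.std():.3f}")
print("top features by permutation importance:",
      imp.importances_mean.argsort()[::-1][:3])
```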

MCML Authors
Susanne Dandl, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Marc Becker, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Giuseppe Casalicchio, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Ludwig Bothmann, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[130]
S. Dandl, M. Becker, B. Bischl, G. Casalicchio and L. Bothmann.
mlr3summary: Concise and interpretable summaries for machine learning models.
International R User Conference (useR! 2024). Salzburg, Austria, Jul 08-22, 2024. arXiv. GitHub.
Abstract

This work introduces a novel R package for concise, informative summaries of machine learning models. We take inspiration from the summary function for (generalized) linear models in R, but extend it in several directions: First, our summary function is model-agnostic and provides a unified summary output also for non-parametric machine learning models; Second, the summary output is more extensive and customizable -- it comprises information on the dataset, model performance, model complexity, the model's estimated feature importances, feature effects, and fairness metrics; Third, models are evaluated based on resampling strategies for unbiased estimates of model performances, feature importances, etc. Overall, the clear, structured output should help to enhance and expedite the model selection process, making it a helpful tool for practitioners and researchers alike.

MCML Authors
Susanne Dandl, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Marc Becker, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Giuseppe Casalicchio, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Ludwig Bothmann, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[129]
F. Karl, J. Thomas, J. Elstner, R. Gross and B. Bischl.
Automated Machine Learning.
Unlocking Artificial Intelligence (Jul. 2024). DOI.
MCML Authors
Florian Karl, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[128]
L. Burk, J. Zobolas, B. Bischl, A. Bender, M. N. Wright and R. Sonabend.
A Large-Scale Neutral Comparison Study of Survival Models on Low-Dimensional Data.
Preprint at arXiv (Jun. 2024). arXiv.
Abstract

This work presents the first large-scale neutral benchmark experiment focused on single-event, right-censored, low-dimensional survival data. Benchmark experiments are essential in methodological research to scientifically compare new and existing model classes through proper empirical evaluation. Existing benchmarks in the survival literature are often narrow in scope, focusing, for example, on high-dimensional data. Additionally, they may lack appropriate tuning or evaluation procedures, or are qualitative reviews, rather than quantitative comparisons. This comprehensive study aims to fill the gap by neutrally evaluating a broad range of methods and providing generalizable conclusions. We benchmark 18 models, ranging from classical statistical approaches to many common machine learning methods, on 32 publicly available datasets. The benchmark tunes for both a discrimination measure and a proper scoring rule to assess performance in different settings. Evaluating on 8 survival metrics, we assess discrimination, calibration, and overall predictive performance of the tested models. Using discrimination measures, we find that no method significantly outperforms the Cox model. However, (tuned) Accelerated Failure Time models were able to achieve significantly better results with respect to overall predictive performance as measured by the right-censored log-likelihood. Machine learning methods that performed comparably well include Oblique Random Survival Forests under discrimination, and Cox-based likelihood-boosting under overall predictive performance. We conclude that for predictive purposes in the standard survival analysis setting of low-dimensional, right-censored data, the Cox Proportional Hazards model remains a simple and robust method, sufficient for practitioners.
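
The headline finding, that a plain Cox model remains a strong default, is easy to reproduce in miniature with the lifelines package (our illustration on a classic dataset, not the benchmark code):

```python
# Cox proportional hazards baseline; discrimination via the concordance index.
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

rossi = load_rossi()   # classic right-censored recidivism dataset
cph = CoxPHFitter().fit(rossi, duration_col="week", event_col="arrest")
print(f"concordance index: {cph.concordance_index_:.3f}")
```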

MCML Authors
Lukas Burk, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Andreas Bender, Dr., Statistical Learning & Data Science, Coordinator Statistical and Machine Learning Consulting, A1 | Statistical Foundations & Explainability


[127]
A. Vahidi, S. Schoßer, L. Wimmer, Y. Li, B. Bischl, E. Hüllermeier and M. Rezaei.
Probabilistic Self-supervised Learning via Scoring Rules Minimization.
12th International Conference on Learning Representations (ICLR 2024). Vienna, Austria, May 07-11, 2024. URL. GitHub.
Abstract

In this paper, we propose a novel probabilistic self-supervised learning method via scoring rule minimization (ProSMIN), which leverages the power of probabilistic models to enhance representation quality and mitigate collapsing representations. Our proposed approach involves two neural networks: the online network and the target network, which collaborate and learn the diverse distribution of representations from each other through knowledge distillation. By presenting the input samples in two augmented formats, the online network is trained to predict the target network representation of the same sample under a different augmented view. The two networks are trained via our new loss function based on proper scoring rules. We provide a theoretical justification for ProSMIN's convergence, demonstrating the strict propriety of its modified scoring rule. This insight validates the method's optimization process and contributes to its robustness and effectiveness in improving representation quality. We evaluate our probabilistic model on various downstream tasks, such as in-distribution generalization, out-of-distribution detection, dataset corruption, low-shot learning, and transfer learning. Our method achieves superior accuracy and calibration, surpassing the self-supervised baseline in a wide range of experiments on large-scale datasets like ImageNet-O and ImageNet-C, demonstrating its scalability and real-world applicability.

MCML Authors
Lisa Wimmer, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Yawei Li, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Eyke Hüllermeier, Prof. Dr., Artificial Intelligence & Machine Learning, A3 | Computational Models
Mina Rezaei, Dr., Statistical Learning & Data Science, Education Coordination, A1 | Statistical Foundations & Explainability


[126]
R. Kohli, M. Feurer, B. Bischl, K. Eggensperger and F. Hutter.
Towards Quantifying the Effect of Datasets for Benchmarking: A Look at Tabular Machine Learning.
Workshop on Data-centric Machine Learning Research (DMLR 2024) at the 12th International Conference on Learning Representations (ICLR 2024). Vienna, Austria, May 07-11, 2024. URL.
Abstract

Data in tabular form makes up a large part of real-world ML applications, and thus, there has been a strong interest in developing novel deep learning (DL) architectures for supervised learning on tabular data in recent years. As a result, there is a debate as to whether DL methods are superior to the ubiquitous ensembles of boosted decision trees. Typically, the advantage of one model class over the other is claimed based on an empirical evaluation, where different variations of both model classes are compared on a set of benchmark datasets that supposedly resemble relevant real-world tabular data. While the landscape of state-of-the-art models for tabular data has changed, one factor has remained largely constant over the years: the datasets. Here, we examine 30 recent publications and 187 different datasets they use, in terms of age, study size and relevance. We found that the average study used fewer than 10 datasets and that half of the datasets are older than 20 years. Our insights raise questions about the conclusions drawn from previous studies and urge the research community to develop and publish additional recent, challenging and relevant datasets and ML tasks for supervised learning on tabular data.

MCML Authors
Matthias Feurer, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[125]
H. A. Gündüz, R. Mreches, J. Moosbauer, G. Robertson, X.-Y. To, E. A. Franzosa, C. Huttenhower, M. Rezaei, A. C. McHardy, B. Bischl, P. C. Münch and M. Binder.
Optimized model architectures for deep learning on genomic data.
Communications Biology 7.1 (Apr. 2024). DOI.
MCML Authors
Hüseyin Anil Gündüz, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Julia Moosbauer, Dr. (former member), A1 | Statistical Foundations & Explainability
Mina Rezaei, Dr., Statistical Learning & Data Science, Education Coordination, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Martin Binder, Statistical Learning & Data Science, Coordinator for Open Source & Open Data, A1 | Statistical Foundations & Explainability


[124]
P. Kopper, D. Rügamer, R. Sonabend, B. Bischl and A. Bender.
Training Survival Models using Scoring Rules.
Preprint at arXiv (Mar. 2024). arXiv.
MCML Authors
David Rügamer, Prof. Dr., Data Science Group, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Andreas Bender, Dr., Statistical Learning & Data Science, Coordinator Statistical and Machine Learning Consulting, A1 | Statistical Foundations & Explainability


[123]
H. Weerts, F. Pfisterer, M. Feurer, K. Eggensperger, E. Bergman, N. Awad, J. Vanschoren, M. Pechenizkiy, B. Bischl and F. Hutter.
Can Fairness be Automated? Guidelines and Opportunities for Fairness-aware AutoML.
Journal of Artificial Intelligence Research 79 (Feb. 2024). DOI.
MCML Authors
Florian Pfisterer, Dr. (former member), A1 | Statistical Foundations & Explainability
Matthias Feurer, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[122]
S. Wiegrebe, P. Kopper, R. Sonabend, B. Bischl and A. Bender.
Deep learning for survival analysis: a review.
Artificial Intelligence Review 57.65 (Feb. 2024). DOI.
Abstract

The influx of deep learning (DL) techniques into the field of survival analysis in recent years has led to substantial methodological progress; for instance, learning from unstructured or high-dimensional data such as images, text or omics data. In this work, we conduct a comprehensive systematic review of DL-based methods for time-to-event analysis, characterizing them according to both survival- and DL-related attributes. In summary, the reviewed methods often address only a small subset of tasks relevant to time-to-event data—e.g., single-risk right-censored data—and neglect to incorporate more complex settings.

MCML Authors
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Andreas Bender, Dr., Statistical Learning & Data Science, Coordinator Statistical and Machine Learning Consulting, A1 | Statistical Foundations & Explainability


[121]
P. Gijsbers, M. L. P. Bueno, S. Coors, E. LeDell, S. Poirier, J. Thomas, B. Bischl and J. Vanschoren.
AMLB: an AutoML Benchmark.
Journal of Machine Learning Research 25.101 (Feb. 2024). URL.
Abstract

Comparing different AutoML frameworks is notoriously challenging and often done incorrectly. We introduce an open and extensible benchmark that follows best practices and avoids common mistakes when comparing AutoML frameworks. We conduct a thorough comparison of 9 well-known AutoML frameworks across 71 classification and 33 regression tasks. The differences between the AutoML frameworks are explored with a multi-faceted analysis, evaluating model accuracy, its trade-offs with inference time, and framework failures. We also use Bradley-Terry trees to discover subsets of tasks where the relative AutoML framework rankings differ. The benchmark comes with an open-source tool that integrates with many AutoML frameworks and automates the empirical evaluation process end-to-end: from framework installation and resource allocation to in-depth evaluation. The benchmark uses public data sets, can be easily extended with other AutoML frameworks and tasks, and has a website with up-to-date results.

MCML Authors
Stefan Coors (former member), A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[120]
D. Schalk, B. Bischl and D. Rügamer.
Privacy-Preserving and Lossless Distributed Estimation of High-Dimensional Generalized Additive Mixed Models.
Statistics and Computing 34.31 (Feb. 2024). DOI.
Abstract

Various privacy-preserving frameworks that respect the individual’s privacy in the analysis of data have been developed in recent years. However, available model classes such as simple statistics or generalized linear models lack the flexibility required for a good approximation of the underlying data-generating process in practice. In this paper, we propose an algorithm for a distributed, privacy-preserving, and lossless estimation of generalized additive mixed models (GAMM) using component-wise gradient boosting (CWB). Making use of CWB allows us to reframe the GAMM estimation as a distributed fitting of base learners using the $L_2$-loss. In order to account for the heterogeneity of different data location sites, we propose a distributed version of a row-wise tensor product that allows the computation of site-specific (smooth) effects. Our adaption of CWB preserves all the important properties of the original algorithm, such as an unbiased feature selection and the feasibility to fit models in high-dimensional feature spaces, and yields equivalent model estimates as CWB on pooled data. Next to a derivation of the equivalence of both algorithms, we also showcase the efficacy of our algorithm on a distributed heart disease data set and compare it with state-of-the-art methods.
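
The centerpiece, component-wise boosting (CWB) with the L2 loss, fits every base learner to the current residuals in each iteration but updates only the best one, which is what makes the distributed reformulation possible. A minimal single-site sketch with univariate linear base learners (our illustration, not the paper's distributed algorithm):

```python
# Component-wise gradient boosting (CWB) with the L2 loss.
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
y = 2 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.5, size=n)

coefs, pred, lr = np.zeros(p), np.zeros(n), 0.1
for _ in range(300):
    resid = y - pred
    # Fit one univariate least-squares base learner per feature ...
    betas = X.T @ resid / (X ** 2).sum(axis=0)
    sse = ((resid[:, None] - X * betas) ** 2).sum(axis=0)
    j = sse.argmin()                 # ... but update only the best-fitting one.
    coefs[j] += lr * betas[j]
    pred += lr * betas[j] * X[:, j]
print(np.round(coefs, 2))            # approximately recovers [2, 0, 0, -1.5, 0]
```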

MCML Authors
Daniel Schalk, Dr. (former member), A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr., Data Science Group, A1 | Statistical Foundations & Explainability


[119]
T. Weber, M. Ingrisch, B. Bischl and D. Rügamer.
Constrained Probabilistic Mask Learning for Task-specific Undersampled MRI Reconstruction.
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024). Waikoloa, Hawaii, Jan 04-08, 2024. DOI.
MCML Authors
Tobias Weber, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Michael Ingrisch, Prof. Dr., Clinical Data Science in Radiology, C1 | Medicine
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
David Rügamer, Prof. Dr., Data Science Group, A1 | Statistical Foundations & Explainability


[118]
B. Bischl, R. Sonabend, L. Kotthoff and M. Lang.
Applied Machine Learning Using mlr3 in R.
Chapman and Hall/CRC (Jan. 2024). DOI.
Abstract

mlr3 is an award-winning ecosystem of R packages that have been developed to enable state-of-the-art machine learning capabilities in R. Applied Machine Learning Using mlr3 in R gives an overview of flexible and robust machine learning methods, with an emphasis on how to implement them using mlr3 in R. It covers various key topics, including basic machine learning tasks, such as building and evaluating a predictive model; hyperparameter tuning of machine learning approaches to obtain peak performance; building machine learning pipelines that perform complex operations such as pre-processing followed by modelling followed by aggregation of predictions; and extending the mlr3 ecosystem with custom learners, measures, or pipeline components. The book is primarily aimed at researchers, practitioners, and graduate students who use machine learning or who are interested in using it. It can be used as a textbook for an introductory or advanced machine learning class that uses R, as a reference for people who work with machine learning methods, and in industry for exploratory experiments in machine learning.

MCML Authors
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Michel Lang, Dr. (former member), A1 | Statistical Foundations & Explainability


[117]
L. Bothmann, K. Peters and B. Bischl.
What Is Fairness? On the Role of Protected Attributes and Fictitious Worlds.
Preprint at arXiv (Jan. 2024). arXiv.
Abstract

A growing body of literature in fairness-aware machine learning (fairML) aims to mitigate machine learning (ML)-related unfairness in automated decision-making (ADM) by defining metrics that measure fairness of an ML model and by proposing methods to ensure that trained ML models achieve low scores on these metrics. However, the underlying concept of fairness, i.e., the question of what fairness is, is rarely discussed, leaving a significant gap between centuries of philosophical discussion and the recent adoption of the concept in the ML community. In this work, we try to bridge this gap by formalizing a consistent concept of fairness and by translating the philosophical considerations into a formal framework for the training and evaluation of ML models in ADM systems. We argue that fairness problems can arise even without the presence of protected attributes (PAs), and point out that fairness and predictive performance are not irreconcilable opposites, but that the latter is necessary to achieve the former. Furthermore, we argue why and how causal considerations are necessary when assessing fairness in the presence of PAs by proposing a fictitious, normatively desired (FiND) world in which PAs have no causal effects. In practice, this FiND world must be approximated by a warped world in which the causal effects of the PAs are removed from the real-world data. Finally, we achieve greater linguistic clarity in the discussion of fairML. We outline algorithms for practical applications and present illustrative experiments on COMPAS data.

MCML Authors
Ludwig Bothmann, Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability


[116]
H. A. Gündüz, S. Giri, M. Binder, B. Bischl and M. Rezaei.
Uncertainty Quantification for Deep Learning Models Predicting the Regulatory Activity of DNA Sequences.
22nd IEEE International Conference on Machine Learning and Applications (ICMLA 2023). Jacksonville, Florida, USA, Dec 15-17, 2023. DOI.
Abstract

The field of computational biology has been enhanced by deep learning models, which hold great promise for revolutionizing domains such as protein folding and drug discovery. Recent studies have underscored the tremendous potential of these models, particularly in the realm of gene regulation and the more profound understanding of the non-coding regions of the genome. On the other hand, this raises significant concerns about the reliability and efficacy of such models, which have their own biases by design, along with those learned from the data. Uncertainty quantification allows us to measure where the system is confident and know when it can be trusted. In this paper, we study several uncertainty quantification methods with respect to a multi-target regression task, specifically predicting regulatory activity profiles using DNA sequence data. Using the Basenji model, we investigate how such methods can improve in-domain generalization, out-of-distribution detection, and provide coverage guarantees on prediction intervals.
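
The coverage guarantees mentioned at the end are the hallmark of split conformal prediction; a minimal regression sketch on synthetic data (our illustration, not the paper's Basenji/genomics setup):

```python
# Split conformal prediction intervals with finite-sample coverage.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(600, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=600)

train, cal, test = np.split(rng.permutation(600), [300, 450])
model = RandomForestRegressor(random_state=0).fit(X[train], y[train])

alpha = 0.1
resid = np.abs(y[cal] - model.predict(X[cal]))        # calibration scores
q = np.quantile(resid, np.ceil((len(cal) + 1) * (1 - alpha)) / len(cal))

pred = model.predict(X[test])
covered = (y[test] >= pred - q) & (y[test] <= pred + q)
print(f"empirical coverage: {covered.mean():.2f} (target {1 - alpha:.2f})")
```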

MCML Authors
Hüseyin Anil Gündüz, Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Martin Binder, Statistical Learning & Data Science, Coordinator for Open Source & Open Data, A1 | Statistical Foundations & Explainability
Bernd Bischl, Prof. Dr., Statistical Learning & Data Science, A1 | Statistical Foundations & Explainability
Mina Rezaei, Dr., Statistical Learning & Data Science, Education Coordination, A1 | Statistical Foundations & Explainability


[115]
F. Karl, T. Pielok, J. Moosbauer, F. Pfisterer, S. Coors, M. Binder, L. Schneider, J. Thomas, J. Richter, M. Lang, E. C. Garrido-Merchán, J. Branke and B. Bischl.
Multi-Objective Hyperparameter Optimization in Machine Learning—An Overview.
ACM Transactions on Evolutionary Learning and Optimization 3.4 (Dec. 2023). DOI.
MCML Authors
Link to Florian Karl

Florian Karl

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Tobias Pielok

Tobias Pielok

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Julia Moosbauer

Julia Moosbauer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Stefan Coors

Stefan Coors

* Former member

A1 | Statistical Foundations & Explainability

Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Link to Lennart Schneider

Lennart Schneider

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Michel Lang

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[114]
A. T. Stüber, S. Coors, B. Schachtner, T. Weber, D. Rügamer, A. Bender, A. Mittermeier, O. Öcal, M. Seidensticker, J. Ricke, B. Bischl and M. Ingrisch.
A comprehensive machine learning benchmark study for radiomics-based survival analysis of CT imaging data in patients with hepatic metastases of CRC.
Investigative Radiology 58.12 (Dec. 2023). DOI.
MCML Authors
Link to Theresa Stüber

Theresa Stüber

Clinical Data Science in Radiology

C1 | Medicine

Link to Stefan Coors

Stefan Coors

* Former member

A1 | Statistical Foundations & Explainability

Link to Balthasar Schachtner

Balthasar Schachtner

Dr.

Clinical Data Science in Radiology

C1 | Medicine

Link to Tobias Weber

Tobias Weber

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Andreas Bender

Andreas Bender

Dr.

Statistical Learning & Data Science

Coordinator Statistical and Machine Learning Consulting

A1 | Statistical Foundations & Explainability

Link to Andreas Mittermeier

Andreas Mittermeier

Dr.

Clinical Data Science in Radiology

C1 | Medicine

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Michael Ingrisch

Michael Ingrisch

Prof. Dr.

Clinical Data Science in Radiology

C1 | Medicine


[113]
C. A. Scholbeck, J. Moosbauer, G. Casalicchio, H. Gupta, B. Bischl and C. Heumann.
Position Paper: Bridging the Gap Between Machine Learning and Sensitivity Analysis.
Preprint at arXiv (Dec. 2023). arXiv.
MCML Authors
Link to Christian Scholbeck

Christian Scholbeck

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Julia Moosbauer

Julia Moosbauer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[112]
D. Rügamer, F. Pfisterer, B. Bischl and B. Grün.
Mixture of Experts Distributional Regression: Implementation Using Robust Estimation with Adaptive First-order Methods.
Advances in Statistical Analysis (Nov. 2023). DOI.
MCML Authors
Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[111]
T. Weber, M. Ingrisch, B. Bischl and D. Rügamer.
Unreading Race: Purging Protected Features from Chest X-ray Embeddings.
Under review. Preprint at arXiv (Nov. 2023). arXiv.
MCML Authors
Link to Tobias Weber

Tobias Weber

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Michael Ingrisch

Michael Ingrisch

Prof. Dr.

Clinical Data Science in Radiology

C1 | Medicine

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[110]
R. Hornung, M. Nalenz, L. Schneider, A. Bender, L. Bothmann, B. Bischl, T. Augustin and A.-L. Boulesteix.
Evaluating machine learning models in non-standard settings: An overview and new findings.
Preprint at arXiv (Oct. 2023). arXiv.
MCML Authors
Link to Roman Hornung

Roman Hornung

Dr.

Biometry in Molecular Medicine

A1 | Statistical Foundations & Explainability

Link to Lennart Schneider

Lennart Schneider

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Andreas Bender

Andreas Bender

Dr.

Statistical Learning & Data Science

Coordinator Statistical and Machine Learning Consulting

A1 | Statistical Foundations & Explainability

Link to Ludwig Bothmann

Ludwig Bothmann

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Anne-Laure Boulesteix

Anne-Laure Boulesteix

Prof. Dr.

Biometry in Molecular Medicine

A1 | Statistical Foundations & Explainability


[109]
H. Löwe, C. A. Scholbeck, C. Heumann, B. Bischl and G. Casalicchio.
fmeffects: An R Package for Forward Marginal Effects.
Preprint at arXiv (Oct. 2023). arXiv.
MCML Authors
Link to Christian Scholbeck

Christian Scholbeck

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[108]
L. Bothmann, S. Strickroth, G. Casalicchio, D. Rügamer, M. Lindauer, F. Scheipl and B. Bischl.
Developing Open Source Educational Resources for Machine Learning and Data Science.
3rd Teaching Machine Learning and Artificial Intelligence Workshop. Grenoble, France, Sep 19-23, 2023. URL.
MCML Authors
Link to Ludwig Bothmann

Ludwig Bothmann

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Fabian Scheipl

Fabian Scheipl

PD Dr.

Functional Data Analysis

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[107]
I. T. Öztürk, R. Nedelchev, C. Heumann, E. Garces Arias, M. Roger, B. Bischl and M. Aßenmacher.
How Different Is Stereotypical Bias Across Languages?
3rd Workshop on Bias and Fairness in AI (BIAS 2023) co-located with European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2023). Turin, Italy, Sep 18-22, 2023. arXiv.
MCML Authors
Link to Esteban Garces Arias

Esteban Garces Arias

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Matthias Aßenmacher

Matthias Aßenmacher

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[106]
M. Aßenmacher, L. Rauch, J. Goschenhofer, A. Stephan, B. Bischl, B. Roth and B. Sick.
Towards Enhancing Deep Active Learning with Weak Supervision and Constrained Clustering.
7th International Workshop on Interactive Adaptive Learning (IAL 2023) co-located with the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2023). Turin, Italy, Sep 18-22, 2023. URL.
MCML Authors
Link to Matthias Aßenmacher

Matthias Aßenmacher

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Jann Goschenhofer

Jann Goschenhofer

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[105]
S. Dandl, G. Casalicchio, B. Bischl and L. Bothmann.
Interpretable Regional Descriptors: Hyperbox-Based Local Explanations.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2023). Turin, Italy, Sep 18-22, 2023. DOI.
MCML Authors
Link to Susanne Dandl

Susanne Dandl

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Ludwig Bothmann

Ludwig Bothmann

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[104]
L. Rauch, M. Aßenmacher, D. Huseljic, M. Wirth, B. Bischl and B. Sick.
ActiveGLAE: A Benchmark for Deep Active Learning with Transformers.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2023). Turin, Italy, Sep 18-22, 2023. DOI.
MCML Authors
Link to Matthias Aßenmacher

Matthias Aßenmacher

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[103]
J. G. Wiese, L. Wimmer, T. Papamarkou, B. Bischl, S. Günnemann and D. Rügamer.
Towards Efficient MCMC Sampling in Bayesian Neural Networks by Exploiting Symmetry.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2023). Turin, Italy, Sep 18-22, 2023. Best paper award. DOI.
MCML Authors
Link to Lisa Wimmer

Lisa Wimmer

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Stephan Günnemann

Stephan Günnemann

Prof. Dr.

Data Analytics & Machine Learning

A3 | Computational Models

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[102]
S. F. Fischer, L. Harutyunyan, M. Feurer and B. Bischl.
OpenML-CTR23 - A curated tabular regression benchmarking suite.
International Conference on Automated Machine Learning (AutoML 2023) - Workshop Track. Berlin, Germany, Sep 12-15, 2023. URL.
MCML Authors
Link to Matthias Feurer

Matthias Feurer

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[101]
L. O. Purucker, L. Schneider, M. Anastacio, J. Beel, B. Bischl and H. Hoos.
Q(D)O-ES: Population-based Quality (Diversity) Optimisation for Post Hoc Ensemble Selection in AutoML.
International Conference on Automated Machine Learning (AutoML 2023). Berlin, Germany, Sep 12-15, 2023. URL.
MCML Authors
Link to Lennart Schneider

Lennart Schneider

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[100]
S. Segel, H. Graf, A. Tornede, B. Bischl and M. Lindauer.
Symbolic Explanations for Hyperparameter Optimization.
International Conference on Automated Machine Learning (AutoML 2023). Berlin, Germany, Sep 12-15, 2023. URL.
Abstract

Hyperparameter optimization (HPO) methods can determine well-performing hyperparameter configurations efficiently but often lack insights and transparency. We propose to apply symbolic regression to meta-data collected with Bayesian optimization (BO) during HPO. In contrast to prior approaches explaining the effects of hyperparameters on model performance, symbolic regression allows for obtaining explicit formulas quantifying the relation between hyperparameter values and model performance. Overall, our approach aims to make the HPO process more explainable and human-centered, addressing the needs of multiple user groups: First, providing insights into the HPO process can support data scientists and machine learning practitioners in their decisions when using and interacting with HPO tools. Second, obtaining explicit formulas and inspecting their properties could help researchers understand the HPO loss landscape better. In an experimental evaluation, we find that naively applying symbolic regression directly to meta-data collected during HPO is affected by the sampling bias introduced by BO. However, the true underlying loss landscape can be approximated by fitting the symbolic regression on the surrogate model trained during BO. By penalizing longer formulas, symbolic regression furthermore allows the user to decide how to balance the accuracy and explainability of the resulting formulas.
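
The central trick, fitting the symbolic regression on surrogate predictions over a uniform grid rather than on the BO-biased archive, can be sketched as follows (assuming the third-party gplearn package as one possible symbolic regressor; the paper's tooling may differ).

# Sketch: distill an explicit formula from the BO surrogate, not the raw meta-data.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from gplearn.genetic import SymbolicRegressor

rng = np.random.default_rng(0)
# Pretend these are (hyperparameter, validation loss) pairs collected by BO;
# BO oversamples the region around the optimum, hence the skewed sampling.
lam = rng.beta(2, 8, size=60).reshape(-1, 1)
loss = (lam.ravel() - 0.3) ** 2 + rng.normal(scale=0.01, size=60)

surrogate = GaussianProcessRegressor().fit(lam, loss)

# Evaluate the surrogate on a uniform grid to undo the sampling bias ...
grid = np.linspace(0, 1, 200).reshape(-1, 1)
y_grid = surrogate.predict(grid)

# ... and fit the symbolic regression on those debiased predictions.
sr = SymbolicRegressor(population_size=500, generations=10,
                       function_set=("add", "sub", "mul"), random_state=0)
sr.fit(grid, y_grid)
print(sr._program)                           # a formula close to (lam - 0.3)^2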

MCML Authors
Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[99]
H. A. Gündüz, M. Binder, X.-Y. To, R. Mreches, B. Bischl, A. C. McHardy, P. C. Münch and M. Rezaei.
A self-supervised deep learning method for data-efficient training in genomics.
Communications Biology 6.928 (Sep. 2023). DOI.
MCML Authors
Link to Hüseyin Anil Gündüz

Hüseyin Anil Gündüz

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Mina Rezaei

Mina Rezaei

Dr.

Statistical Learning & Data Science

Education Coordination

A1 | Statistical Foundations & Explainability


[98]
R. P. Prager, K. Dietrich, L. Schneider, L. Schäpermeier, B. Bischl, P. Kerschke, H. Trautmann and O. Mersmann.
Neural Networks as Black-Box Benchmark Functions Optimized for Exploratory Landscape Features.
17th ACM/SIGEVO Conference on Foundations of Genetic Algorithms (FOGA 2023). Potsdam, Germany, Aug 30-Sep 01, 2023. DOI.
Abstract

Artificial benchmark functions are commonly used in optimization research because of their ability to rapidly evaluate potential solutions, making them a preferred substitute for real-world problems. However, these benchmark functions have faced criticism for their limited resemblance to real-world problems. In response, recent research has focused on automatically generating new benchmark functions for areas where established test suites are inadequate. These approaches have limitations, such as the difficulty of generating new benchmark functions that exhibit exploratory landscape analysis (ELA) features beyond those of existing benchmarks. The objective of this work is to develop a method for generating benchmark functions for single-objective continuous optimization with user-specified structural properties. Specifically, we aim to demonstrate a proof of concept for a method that uses an ELA feature vector to specify these properties in advance. To achieve this, we begin by generating a random sample of decision space variables and objective values. We then adjust the objective values using CMA-ES until the corresponding features of our new problem match the predefined ELA features within a specified threshold. By iteratively transforming the landscape in this way, we ensure that the resulting function exhibits the desired properties. To create the final function, we use the resulting point cloud as training data for a simple neural network that produces a function exhibiting the target ELA features. We demonstrate the effectiveness of this approach by replicating the existing functions of the well-known BBOB suite and creating new functions with ELA feature values that are not present in BBOB.
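
The generation loop can be sketched schematically as follows, with simple summary statistics standing in for real ELA features and naive hill climbing standing in for CMA-ES; the final step (not shown) would train a small neural network on the resulting point cloud.

# Toy version of the generation loop: perturb objective values until simple
# summary statistics (stand-ins for real ELA features) match a target vector.
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(0)
X = rng.uniform(-5, 5, size=(200, 2))        # decision-space sample
y = rng.normal(size=200)                     # random initial objective values

def features(y):
    # Placeholder "ELA" vector: mean, spread, skewness of the objective values.
    return np.array([y.mean(), y.std(), skew(y)])

target = np.array([1.0, 2.0, 0.8])           # user-specified landscape properties

best = features(y)
for _ in range(20000):                       # naive hill climbing instead of CMA-ES
    cand = y + rng.normal(scale=0.05, size=y.shape)
    f = features(cand)
    if np.linalg.norm(f - target) < np.linalg.norm(best - target):
        y, best = cand, f

print(best)                                  # close to target; (X, y) trains the net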

MCML Authors
Link to Lennart Schneider

Lennart Schneider

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[97]
A. Scheppach, H. A. Gündüz, E. Dorigatti, P. C. Münch, A. C. McHardy, B. Bischl, M. Rezaei and M. Binder.
Neural Architecture Search for Genomic Sequence Data.
20th IEEE Symposium on Computational Intelligence and Bioinformatics and Computational Biology (CIBCB 2023). Eindhoven, The Netherlands, Aug 29-31, 2023. DOI.
Abstract

Deep learning has enabled outstanding progress on bioinformatics datasets and a variety of tasks, such as protein structure prediction, identification of regulatory regions, genome annotation, and interpretation of the noncoding genome. The layout and configuration of neural networks used for these tasks have mostly been developed manually by human experts, which is a time-consuming and error-prone process. Therefore, there is growing interest in automated neural architecture search (NAS) methods in bioinformatics. In this paper, we present a novel search space for NAS algorithms that operate on genome data, thus creating extensions for existing NAS algorithms for sequence data that we name Genome-DARTS, Genome-P-DARTS, Genome-BONAS, Genome-SH, and Genome-RS. Moreover, we introduce two novel NAS algorithms, CWP-DARTS and EDP-DARTS, that build on and extend the idea of P-DARTS. We evaluate the presented methods and compare them to manually designed neural architectures on a widely used genome sequence machine learning task to show that NAS methods can be adapted well for bioinformatics sequence datasets. Our experiments show that architectures optimized by our NAS methods outperform manually developed architectures while having significantly fewer parameters.

MCML Authors
Link to Hüseyin Anil Gündüz

Hüseyin Anil Gündüz

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Emilio Dorigatti

Emilio Dorigatti

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Mina Rezaei

Mina Rezaei

Dr.

Statistical Learning & Data Science

Education Coordination

A1 | Statistical Foundations & Explainability

Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability


[96]
L. Wimmer, Y. Sale, P. Hofman, B. Bischl and E. Hüllermeier.
Quantifying Aleatoric and Epistemic Uncertainty in Machine Learning: Are Conditional Entropy and Mutual Information Appropriate Measures?
39th Conference on Uncertainty in Artificial Intelligence (UAI 2023). Pittsburgh, PA, USA, Aug 01-03, 2023. URL.
MCML Authors
Link to Lisa Wimmer

Lisa Wimmer

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Paul Hofman

Paul Hofman

Artificial Intelligence & Machine Learning

A3 | Computational Models

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Eyke Hüllermeier

Eyke Hüllermeier

Prof. Dr.

Artificial Intelligence & Machine Learning

A3 | Computational Models


[95]
F. Ott, D. Rügamer, L. Heublein, B. Bischl and C. Mutschler.
Auxiliary Cross-Modal Representation Learning With Triplet Loss Functions for Online Handwriting Recognition.
IEEE Access 11 (Aug. 2023). DOI.
MCML Authors
Link to Felix Ott

Felix Ott

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[94]
Y. Li, Y. Zhang, K. Kawaguchi, A. Khakzar, B. Bischl and M. Rezaei.
A Dual-Perspective Approach to Evaluating Feature Attribution Methods.
Preprint at arXiv (Aug. 2023). arXiv.
Abstract

Feature attribution methods attempt to explain neural network predictions by identifying relevant features. However, establishing a cohesive framework for assessing feature attribution remains a challenge. There are several views through which we can evaluate attributions. One principal lens is to observe the effect of perturbing attributed features on the model's behavior (i.e., faithfulness). While providing useful insights, existing faithfulness evaluations suffer from shortcomings that we reveal in this paper. In this work, we propose two new perspectives within the faithfulness paradigm that reveal intuitive properties: soundness and completeness. Soundness assesses the degree to which attributed features are truly predictive features, while completeness examines how well the resulting attribution reveals all the predictive features. The two perspectives are based on a firm mathematical foundation and provide quantitative metrics that are computable through efficient algorithms. We apply these metrics to mainstream attribution methods, offering a novel lens through which to analyze and compare feature attribution methods.
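
A generic perturbation check in this spirit, illustrating the faithfulness lens rather than the paper's exact soundness and completeness metrics, could look like this: masking highly attributed features should hurt the model more than masking random ones.

# Generic faithfulness-style check; not the paper's proposed metrics.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)      # only features 0 and 1 matter
clf = GradientBoostingClassifier(random_state=0).fit(X, y)

# Use permutation importance as the attribution method under evaluation.
attr = permutation_importance(clf, X, y, n_repeats=5, random_state=0).importances_mean
top2 = np.argsort(attr)[-2:]

def acc_after_masking(cols):
    Xm = X.copy()
    Xm[:, cols] = 0.0                        # mask by mean imputation (features ~N(0,1))
    return clf.score(Xm, y)

print(acc_after_masking(top2))               # large drop: attributed features predictive
print(acc_after_masking(rng.choice(8, 2) + 2))  # masking irrelevant features barely hurts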

MCML Authors
Link to Yawei Li

Yawei Li

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Ashkan Khakzar

Ashkan Khakzar

Dr.

* Former member

C1 | Medicine

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Mina Rezaei

Mina Rezaei

Dr.

Statistical Learning & Data Science

Education Coordination

A1 | Statistical Foundations & Explainability


[93]
C. Molnar, T. Freiesleben, G. König, J. Herbinger, T. Reisinger, G. Casalicchio, M. N. Wright and B. Bischl.
Relating the Partial Dependence Plot and Permutation Feature Importance to the Data Generating Process.
1st World Conference on eXplainable Artificial Intelligence (xAI 2023). Lisbon, Portugal, Jul 26-28, 2023. DOI.
Abstract

Scientists and practitioners increasingly rely on machine learning to model data and draw conclusions. Compared to statistical modeling approaches, machine learning makes fewer explicit assumptions about data structures, such as linearity. However, the resulting model parameters usually cannot be easily related to the data generating process. To learn about the modeled relationships, partial dependence (PD) plots and permutation feature importance (PFI) are often used as interpretation methods. However, PD and PFI lack a theory that relates them to the data generating process. We formalize PD and PFI as statistical estimators of ground truth estimands rooted in the data generating process. We show that PD and PFI estimates deviate from this ground truth due to statistical biases, model variance and Monte Carlo approximation errors. To account for model variance in PD and PFI estimation, we propose the learner-PD and the learner-PFI based on model refits, together with corrected variance and confidence interval estimators.
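
The refit idea behind the learner-PFI can be sketched as follows; this is a simplified version, not the paper's corrected estimators.

# Re-estimate permutation feature importance across model refits to capture
# learner variance and form a naive confidence interval.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(800, 5))
y = 2 * X[:, 0] + rng.normal(size=800)

pfis = []
for seed in range(15):                       # refit on bootstrap samples
    idx = rng.integers(0, len(X), size=len(X))
    X_tr, X_te, y_tr, y_te = train_test_split(X[idx], y[idx], random_state=seed)
    model = RandomForestRegressor(n_estimators=100, random_state=seed).fit(X_tr, y_tr)
    pfi = permutation_importance(model, X_te, y_te, n_repeats=5, random_state=seed)
    pfis.append(pfi.importances_mean[0])     # importance of the relevant feature

m, s = np.mean(pfis), np.std(pfis, ddof=1)
print(f"learner-PFI ~ {m:.3f} +/- {1.96 * s / np.sqrt(len(pfis)):.3f}")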

MCML Authors
Link to Gunnar König

Gunnar König

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Julia Herbinger

Julia Herbinger

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[92]
J. Goschenhofer, B. Bischl and Z. Kira.
ConstraintMatch for Semi-constrained Clustering.
International Joint Conference on Neural Networks (IJCNN 2023). Gold Coast Convention and Exhibition Centre, Queensland, Australia, Jul 18-23, 2023. DOI.
MCML Authors
Link to Jann Goschenhofer

Jann Goschenhofer

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[91]
C. Kolb, B. Bischl, C. L. Müller and D. Rügamer.
Sparse Modality Regression.
37th International Workshop on Statistical Modelling (IWSM 2023). Dortmund, Germany, Jul 17-21, 2023. Best Paper Award. PDF.
MCML Authors
Link to Chris Kolb

Chris Kolb

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Christian Müller

Christian Müller

Prof. Dr.

Biomedical Statistics and Data Science

C2 | Biology

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[90]
L. Schneider, B. Bischl and J. Thomas.
Multi-Objective Optimization of Performance and Interpretability of Tabular Supervised Machine Learning Models.
Genetic and Evolutionary Computation Conference (GECCO 2023). Lisbon, Portugal, Jul 15-19, 2023. DOI.
MCML Authors
Link to Lennart Schneider

Lennart Schneider

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[89]
C. Kolb, C. L. Müller, B. Bischl and D. Rügamer.
Smoothing the Edges: A General Framework for Smooth Optimization in Sparse Regularization using Hadamard Overparametrization.
Under review. Preprint at arXiv (Jul. 2023). arXiv.
MCML Authors
Link to Chris Kolb

Chris Kolb

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Christian Müller

Christian Müller

Prof. Dr.

Biomedical Statistics and Data Science

C2 | Biology

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[88]
M. Rezaei, A. Vahidi, T. Elze, B. Bischl and M. Eslami.
Self-supervised Learning and Self-labeling Framework for Glaucoma Detection.
Investigative Ophthalmology and Visual Science 64.8 (Jun. 2023). URL.
MCML Authors
Link to Mina Rezaei

Mina Rezaei

Dr.

Statistical Learning & Data Science

Education Coordination

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[87]
J. Herbinger, B. Bischl and G. Casalicchio.
Decomposing Global Feature Effects Based on Feature Interactions.
Preprint at arXiv (Jun. 2023). arXiv.
MCML Authors
Link to Julia Herbinger

Julia Herbinger

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[86]
T. Weber, M. Ingrisch, B. Bischl and D. Rügamer.
Cascaded Latent Diffusion Models for High-Resolution Chest X-ray Synthesis.
27th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2023). Osaka, Japan, May 25-28, 2023. DOI.
Abstract

While recent advances in large-scale foundational models show promising results, their application to the medical domain has not yet been explored in detail. In this paper, we progress into the realms of large-scale modeling in medical synthesis by proposing Cheff - a foundational cascaded latent diffusion model, which generates highly realistic chest radiographs providing state-of-the-art quality on a 1-megapixel scale. We further propose MaCheX, which is a unified interface for public chest datasets and forms the largest open collection of chest X-rays to date. With Cheff conditioned on radiological reports, we further guide the synthesis process over text prompts and unveil the research area of report-to-chest-X-ray generation.

MCML Authors
Link to Tobias Weber

Tobias Weber

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Michael Ingrisch

Michael Ingrisch

Prof. Dr.

Clinical Data Science in Radiology

C1 | Medicine

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[85]
T. Pielok, B. Bischl and D. Rügamer.
Approximate Bayesian Inference with Stein Functional Variational Gradient Descent.
11th International Conference on Learning Representations (ICLR 2023). Kigali, Rwanda, May 01-05, 2023. URL.
MCML Authors
Link to Tobias Pielok

Tobias Pielok

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[84]
K. Rath, D. Rügamer, B. Bischl, U. von Toussaint and C. Albert.
Dependent state space Student-t processes for imputation and data augmentation in plasma diagnostics.
Contributions to Plasma Physics 63.5-6 (May 2023). DOI.
MCML Authors
Link to Katharina Rath

Katharina Rath

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[83]
E. Dorigatti, B. Schubert, B. Bischl and D. Rügamer.
Frequentist Uncertainty Quantification in Semi-Structured Neural Networks.
26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023). Valencia, Spain, Apr 25-27, 2023. URL.
MCML Authors
Link to Emilio Dorigatti

Emilio Dorigatti

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[82]
M. Feurer, K. Eggensperger, E. Bergman, F. Pfisterer, B. Bischl and F. Hutter.
Mind the Gap: Measuring Generalization Performance Across Multiple Objectives.
21st International Symposium on Intelligent Data Analysis (IDA 2023). Louvain-la-Neuve, Belgium, Apr 12-14, 2023. DOI.
MCML Authors
Link to Matthias Feurer

Matthias Feurer

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[81]
D. Schalk, B. Bischl and D. Rügamer.
Accelerated Componentwise Gradient Boosting Using Efficient Data Representation and Momentum-Based Optimization.
Journal of Computational and Graphical Statistics 32.2 (Apr. 2023). DOI.
MCML Authors
Link to Daniel Schalk

Daniel Schalk

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[80]
S. Dandl, A. Hofheinz, M. Binder, B. Bischl and G. Casalicchio.
counterfactuals: An R Package for Counterfactual Explanation Methods.
Preprint at arXiv (Apr. 2023). arXiv.
MCML Authors
Link to Susanne Dandl

Susanne Dandl

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[79]
J. Moosbauer, G. Casalicchio, M. Lindauer and B. Bischl.
Improving Accuracy of Interpretability Measures in Hyperparameter Optimization via Bayesian Algorithm Execution.
Workshop on Configuration and Selection of Algorithms (COSEAL 2023). Paris, France, Mar 06-08, 2023. arXiv.
Abstract

Despite all the benefits of automated hyperparameter optimization (HPO), most modern HPO algorithms are black boxes themselves. This makes it difficult to understand the decision process which leads to the selected configuration, reduces trust in HPO, and thus hinders its broad adoption. Here, we study the combination of HPO with interpretable machine learning (IML) methods such as partial dependence plots. These techniques are increasingly used to explain the marginal effect of hyperparameters on the black-box cost function or to quantify the importance of hyperparameters. However, if such methods are naively applied to the experimental data of the HPO process in a post-hoc manner, the underlying sampling bias of the optimizer can distort interpretations. We propose a modified HPO method which efficiently balances the search for the global optimum w.r.t. predictive performance and the reliable estimation of IML explanations of an underlying black-box function by coupling Bayesian optimization and Bayesian Algorithm Execution. On benchmark cases of both synthetic objectives and HPO of a neural network, we demonstrate that our method returns more reliable explanations of the underlying black-box without a loss of optimization performance.
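
The key difference to naive post-hoc analysis, estimating partial dependence from the surrogate rather than from the raw HPO archive, can be sketched as follows.

# Partial dependence of one hyperparameter, computed on surrogate predictions.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
conf = rng.uniform(size=(40, 2))             # evaluated configurations
cost = (conf[:, 0] - 0.3) ** 2 + 0.5 * conf[:, 1] + rng.normal(scale=0.01, size=40)
surrogate = GaussianProcessRegressor().fit(conf, cost)

# Marginalize hyperparameter 1 over a background sample, on surrogate output.
grid = np.linspace(0, 1, 50)
background = rng.uniform(size=200)
pd_curve = [surrogate.predict(np.column_stack([np.full(200, g), background])).mean()
            for g in grid]
print(round(grid[int(np.argmin(pd_curve))], 2))  # ~0.3, the marginal optimum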

MCML Authors
Link to Julia Moosbauer

Julia Moosbauer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[78]
B. Bischl, M. Binder, M. Lang, T. Pielok, J. Richter, S. Coors, J. Thomas, T. Ullmann, M. Becker, A.-L. Boulesteix, D. Deng and M. Lindauer.
Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges.
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 13.2 (Mar. 2023). DOI.
Abstract

Most machine learning algorithms are configured by a set of hyperparameters whose values must be carefully chosen and which often considerably impact performance. To avoid a time-consuming and irreproducible manual process of trial-and-error to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods—for example, based on resampling error estimation for supervised machine learning—can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods, from simple techniques such as grid or random search to more advanced methods like evolution strategies, Bayesian optimization, Hyperband, and racing. This work gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with machine learning pipelines, runtime improvements, and parallelization.
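
The simplest baseline covered by the review, random search with cross-validated resampling error as the objective, fits in a few lines.

# Random search HPO: sample configurations, score each by cross-validation.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)

best_cfg, best_score = None, -np.inf
for _ in range(20):                          # HPO budget: 20 random configurations
    cfg = {"n_estimators": int(rng.integers(50, 300)),
           "max_depth": int(rng.integers(2, 12))}
    score = cross_val_score(RandomForestClassifier(**cfg, random_state=0),
                            X, y, cv=5).mean()
    if score > best_score:
        best_cfg, best_score = cfg, score

print(best_cfg, round(best_score, 4))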

MCML Authors
Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Michel Lang

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Tobias Pielok

Tobias Pielok

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Stefan Coors

Stefan Coors

* Former member

A1 | Statistical Foundations & Explainability

Link to Marc Becker

Marc Becker

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Anne-Laure Boulesteix

Anne-Laure Boulesteix

Prof. Dr.

Biometry in Molecular Medicine

A1 | Statistical Foundations & Explainability


[77]
D. Schalk, V. Hoffmann, B. Bischl and U. Mansmann.
dsBinVal: Conducting distributed ROC analysis using DataSHIELD.
The Journal of Open Source Software 8.82 (Feb. 2023). DOI.
Abstract

Our R (R Core Team, 2021) package dsBinVal implements the methodology explained by Schalk et al. (2022). It extends the ROC-GLM (Pepe, 2000) to distributed data by using techniques of differential privacy (Dwork et al., 2006) and the idea of sharing highly aggregated values only. The package also exports functionality to calculate distributed calibration curves and assess the calibration. Using the package allows us to evaluate a prognostic model based on a binary outcome using the DataSHIELD (Gaye et al., 2014) framework. Therefore, the main functionality makes it possible to 1) compute the receiver operating characteristic (ROC) curve using the ROC-GLM, from which 2) the area under the curve (AUC) and confidence intervals (CI) are derived to conduct hypothesis testing according to DeLong et al. (1988). Furthermore, 3) the calibration can be assessed distributively via calibration curves and the Brier score. Visualizing the approximated ROC curve, the AUC with confidence intervals, and the calibration curves using ggplot2 is also supported. Examples can be found in the README file of the repository.
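
For orientation, the quantities the package computes look as follows in a plain, non-distributed setting; the package's actual contribution, running this across sites via DataSHIELD with privacy safeguards, is not shown here.

# Plain ROC/AUC and Brier score on pooled data (what dsBinVal distributes).
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score, brier_score_loss

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)                       # binary outcomes
p = np.clip(0.4 * y + 0.8 * rng.uniform(size=500), 0, 1)  # imperfect model scores

fpr, tpr, _ = roc_curve(y, p)                          # the ROC curve itself
print("AUC:", round(roc_auc_score(y, p), 3))
print("Brier score:", round(brier_score_loss(y, p), 3))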

MCML Authors
Link to Daniel Schalk

Daniel Schalk

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[76]
C. Molnar, G. König, B. Bischl and G. Casalicchio.
Model-agnostic feature importance and effects with dependent features: a conditional subgroup approach.
Data Mining and Knowledge Discovery (Jan. 2023). DOI.
Abstract

The interpretation of feature importance in machine learning models is challenging when features are dependent. Permutation feature importance (PFI) ignores such dependencies, which can cause misleading interpretations due to extrapolation. A possible remedy is more advanced conditional PFI approaches that enable the assessment of feature importance conditional on all other features. Due to this shift in perspective and in order to enable correct interpretations, it is beneficial if the conditioning is transparent and comprehensible. In this paper, we propose a new sampling mechanism for the conditional distribution based on permutations in conditional subgroups. As these subgroups are constructed using tree-based methods such as transformation trees, the conditioning becomes inherently interpretable. This not only provides a simple and effective estimator of conditional PFI, but also local PFI estimates within the subgroups. In addition, we apply the conditional subgroups approach to partial dependence plots, a popular method for describing feature effects that can also suffer from extrapolation when features are dependent and interactions are present in the model. In simulations and a real-world application, we demonstrate the advantages of the conditional subgroup approach over existing methods: It allows the computation of conditional PFI that is more true to the data than existing proposals and enables a fine-grained interpretation of feature effects and importance within the conditional subgroups.
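
The subgroup permutation scheme can be sketched as follows, using a plain regression tree as a simplified stand-in for the transformation trees used in the paper.

# Conditional-subgroup PFI sketch: permute a feature only within subgroups
# defined by a tree on the conditioning feature, preserving dependence.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
x2 = rng.normal(size=1000)
x1 = x2 + 0.3 * rng.normal(size=1000)        # x1 strongly depends on x2
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(scale=0.5, size=1000)
model = RandomForestRegressor(random_state=0).fit(X, y)

# Subgroups: a shallow tree predicting x1 from x2 defines the conditioning.
tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(x2.reshape(-1, 1), x1)
groups = tree.apply(x2.reshape(-1, 1))

Xp = X.copy()
for g in np.unique(groups):                  # permute x1 within each subgroup
    idx = np.where(groups == g)[0]
    Xp[idx, 0] = rng.permutation(X[idx, 0])

base = np.mean((y - model.predict(X)) ** 2)
perm = np.mean((y - model.predict(Xp)) ** 2)
print("conditional PFI of x1:", round(perm - base, 3))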

MCML Authors
Link to Gunnar König

Gunnar König

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[75]
I. Ziegler, B. Ma, B. Bischl, E. Dorigatti and B. Schubert.
Proteasomal cleavage prediction: state-of-the-art and future directions.
Preprint at bioRxiv (2023). DOI. GitHub.
Abstract

Epitope vaccines are a promising approach for precision treatment of pathogens, cancer, autoimmune diseases, and allergies. Effectively designing such vaccines requires accurate proteasomal cleavage prediction to ensure that the epitopes included in the vaccine trigger an immune response. The performance of proteasomal cleavage predictors has been steadily improving over the past decades owing to increasing data availability and methodological advances. In this review, we summarize the current proteasomal cleavage prediction landscape and, in light of recent progress in the field of deep learning, develop and compare a wide range of recent architectures and techniques, including long short-term memory (LSTM), transformers, and convolutional neural networks (CNN), as well as four different denoising techniques. All open-source cleavage predictors re-trained on our dataset performed within two AUC percentage points. Our comprehensive deep learning architecture benchmark improved performance by 1.7 AUC percentage points, while closed-source predictors performed considerably worse. We found that a wide range of architectures and training regimes all result in very similar performance, suggesting that the specific modeling approach employed has a limited impact on predictive performance compared to the specifics of the dataset employed. We speculate that the noise and implicit nature of data acquisition techniques used for training proteasomal cleavage prediction models and the complexity of biological processes of the antigen processing pathway are the major limiting factors. While biological complexity can be tackled by more data and, to a lesser extent, better models, noise and randomness inherently limit the maximum achievable predictive performance.

MCML Authors
Link to Bolei Ma

Bolei Ma

Social Data Science and AI Lab

C4 | Computational Social Sciences

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Emilio Dorigatti

Emilio Dorigatti

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[74]
J. Goschenhofer, P. Ragupathy, C. Heumann, B. Bischl and M. Aßenmacher.
CC-Top: Constrained Clustering for Dynamic Topic Discovery.
1st Workshop on Ever Evolving NLP (EvoNLP 2022). Abu Dhabi, United Arab Emirates, Dec 07, 2022. URL.
MCML Authors
Link to Jann Goschenhofer

Jann Goschenhofer

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Matthias Aßenmacher

Matthias Aßenmacher

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[73]
M. Rezaei, E. Dorigatti, D. Rügamer and B. Bischl.
Learning Statistical Representation with Joint Deep Embedded Clustering.
IEEE International Conference on Data Mining Workshops (ICDMW 2022). Orlando, FL, USA, Nov 30-Dec 02, 2022. DOI.
MCML Authors
Link to Mina Rezaei

Mina Rezaei

Dr.

Statistical Learning & Data Science

Education Coordination

A1 | Statistical Foundations & Explainability

Link to Emilio Dorigatti

Emilio Dorigatti

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[72]
N. Hurmer, X.-Y. To, M. Binder, H. A. Gündüz, P. C. Münch, R. Mreches, A. C. McHardy, B. Bischl and M. Rezaei.
Transformer Model for Genome Sequence Analysis.
Workshop on Learning Meaningful Representations of Life (LMRL 2022) at the 36th Conference on Neural Information Processing Systems (NeurIPS 2022). New Orleans, LA, USA, Nov 28-Dec 09, 2022. URL.
MCML Authors
Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Link to Hüseyin Anil Gündüz

Hüseyin Anil Gündüz

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Mina Rezaei

Mina Rezaei

Dr.

Statistical Learning & Data Science

Education Coordination

A1 | Statistical Foundations & Explainability


[71]
I. Ziegler, B. Ma, E. Nie, B. Bischl, D. Rügamer, B. Schubert and E. Dorigatti.
What cleaves? Is proteasomal cleavage prediction reaching a ceiling?
Workshop on Learning Meaningful Representations of Life (LMRL 2022) at the 36th Conference on Neural Information Processing Systems (NeurIPS 2022). New Orleans, LA, USA, Nov 28-Dec 09, 2022. URL.
MCML Authors
Link to Bolei Ma

Bolei Ma

Social Data Science and AI Lab

C4 | Computational Social Sciences

Link to Ercong Nie

Ercong Nie

Statistical NLP and Deep Learning

B2 | Natural Language Processing

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Emilio Dorigatti

Emilio Dorigatti

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[70]
F. Ott, D. Rügamer, L. Heublein, B. Bischl and C. Mutschler.
Domain Adaptation for Time-Series Classification to Mitigate Covariate Shift.
30th ACM International Conference on Multimedia (MM 2022). Lisbon, Portugal, Oct 10-14, 2022. DOI.
MCML Authors
Link to Felix Ott

Felix Ott

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[69]
J. Moosbauer, M. Binder, L. Schneider, F. Pfisterer, M. Becker, M. Lang, L. Kotthoff and B. Bischl.
Automated Benchmark-Driven Design and Explanation of Hyperparameter Optimizers.
IEEE Transactions on Evolutionary Computation 26.6 (Oct. 2022). DOI.
Abstract

Automated hyperparameter optimization (HPO) has gained great popularity and is an important component of most automated machine learning frameworks. However, the process of designing HPO algorithms is still an unsystematic and manual process: new algorithms are often built on top of prior work, where limitations are identified and improvements are proposed. Even though this approach is guided by expert knowledge, it is still somewhat arbitrary. The process rarely allows for gaining a holistic understanding of which algorithmic components drive performance and carries the risk of overlooking good algorithmic design choices. We present a principled approach to automated benchmark-driven algorithm design applied to multifidelity HPO (MF-HPO). First, we formalize a rich space of MF-HPO candidates that includes, but is not limited to, common existing HPO algorithms and then present a configurable framework covering this space. To find the best candidate automatically and systematically, we follow a programming-by-optimization approach and search over the space of algorithm candidates via Bayesian optimization. We challenge whether the found design choices are necessary or could be replaced by more naive and simpler ones by performing an ablation analysis. We observe that using a relatively simple configuration (in some ways, simpler than established methods) performs very well as long as some critical configuration parameters are set to the right value.

MCML Authors
Link to Julia Moosbauer

Julia Moosbauer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Link to Lennart Schneider

Lennart Schneider

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Marc Becker

Marc Becker

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Michel Lang

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[68]
K. Rath, D. Rügamer, B. Bischl, U. von Toussaint, C. Rea, A. Maris, R. Granetz and C. Albert.
Data augmentation for disruption prediction via robust surrogate models.
Journal of Plasma Physics 88.5 (Oct. 2022). DOI.
Abstract

The goal of this work is to generate large statistically representative data sets to train machine learning models for disruption prediction provided by data from a few existing discharges. Such a comprehensive training database is important to achieve satisfying and reliable prediction results in artificial neural network classifiers. Here, we aim for a robust augmentation of the training database for multivariate time series data using Student-t process regression. We apply Student-t process regression in a state space formulation via Bayesian filtering to tackle challenges imposed by outliers and noise in the training data set and to reduce the computational complexity. Thus, the method can also be used if the time resolution is high. We use an uncorrelated model for each dimension and impose correlations afterwards via colouring transformations. We demonstrate the efficacy of our approach on plasma diagnostics data of three different disruption classes from the DIII-D tokamak. To evaluate if the distribution of the generated data is similar to the training data, we additionally perform statistical analyses using methods from time series analysis, descriptive statistics and classic machine learning clustering algorithms.
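
The colouring step, imposing correlations on independently generated series, reduces to a Cholesky transformation; a generic sketch detached from the Student-t state-space machinery:

# Colouring transformation: generate each dimension independently, then impose
# the desired cross-channel correlation via a Cholesky factor.
import numpy as np

rng = np.random.default_rng(0)
target_corr = np.array([[1.0, 0.8, 0.2],
                        [0.8, 1.0, 0.5],
                        [0.2, 0.5, 1.0]])    # desired correlation across 3 signals

white = rng.normal(size=(3, 500))            # uncorrelated augmented series
L = np.linalg.cholesky(target_corr)
colored = L @ white                          # now correlated as specified

print(np.round(np.corrcoef(colored), 2))     # ~ target_corr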

MCML Authors
Link to Katharina Rath

Katharina Rath

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[67]
D. Deng, F. Karl, F. Hutter, B. Bischl and M. Lindauer.
Efficient Automated Deep Learning for Time Series Forecasting.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2022). Grenoble, France, Sep 19-22, 2022. DOI.
Abstract

Recent years have witnessed tremendously improved efficiency of Automated Machine Learning (AutoML), especially Automated Deep Learning (AutoDL) systems, but recent work focuses on tabular, image, or NLP tasks. So far, little attention has been paid to general AutoDL frameworks for time series forecasting, despite the enormous success in applying different novel architectures to such tasks. In this paper, we propose an efficient approach for the joint optimization of neural architecture and hyperparameters of the entire data processing pipeline for time series forecasting. In contrast to common NAS search spaces, we designed a novel neural architecture search space covering various state-of-the-art architectures, allowing for an efficient macro-search over different DL approaches. To efficiently search in such a large configuration space, we use Bayesian optimization with multi-fidelity optimization. We empirically study several different budget types enabling efficient multi-fidelity optimization on different forecasting datasets. Furthermore, we compare our resulting system against several established baselines and show that it significantly outperforms all of them across several datasets.

MCML Authors
Link to Florian Karl

Florian Karl

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[66]
D. Rügamer, A. Bender, S. Wiegrebe, D. Racek, B. Bischl, C. L. Müller and C. Stachl.
Factorized Structured Regression for Large-Scale Varying Coefficient Models.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2022). Grenoble, France, Sep 19-22, 2022. DOI.
MCML Authors
Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Andreas Bender

Andreas Bender

Dr.

Statistical Learning & Data Science

Coordinator Statistical and Machine Learning Consulting

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Christian Müller

Christian Müller

Prof. Dr.

Biomedical Statistics and Data Science

C2 | Biology


[65]
T. Weber, M. Ingrisch, B. Bischl and D. Rügamer.
Implicit Embeddings via GAN Inversion for High Resolution Chest Radiographs.
1st Workshop on Medical Applications with Disentanglements (MAD 2022) at the 25th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2022). Singapore, Sep 18-22, 2022. DOI.
MCML Authors
Link to Tobias Weber

Tobias Weber

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Michael Ingrisch

Michael Ingrisch

Prof. Dr.

Clinical Data Science in Radiology

C1 | Medicine

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[64]
E. Dorigatti, B. Bischl and B. Schubert.
Improved proteasomal cleavage prediction with positive-unlabeled learning.
Preprint at arXiv (Sep. 2022). arXiv.
MCML Authors
Link to Emilio Dorigatti

Emilio Dorigatti

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[63]
E. Dorigatti, J. Schweisthal, B. Bischl and M. Rezaei.
Robust and Efficient Imbalanced Positive-Unlabeled Learning with Self-supervision.
Preprint at arXiv (Sep. 2022). arXiv.
MCML Authors
Link to Emilio Dorigatti

Emilio Dorigatti

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Jonas Schweisthal

Jonas Schweisthal

Artificial Intelligence in Management

C4 | Computational Social Sciences

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Mina Rezaei

Mina Rezaei

Dr.

Statistical Learning & Data Science

Education Coordination

A1 | Statistical Foundations & Explainability


[62]
S.-F. Zheng, J. E. Nam, E. Dorigatti, B. Bischl, S. Azizi and M. Rezaei.
Joint Debiased Representation and Image Clustering Learning with Self-Supervision.
Preprint at arXiv (Sep. 2022). arXiv.
MCML Authors
Link to Emilio Dorigatti

Emilio Dorigatti

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Mina Rezaei

Mina Rezaei

Dr.

Statistical Learning & Data Science

Education Coordination

A1 | Statistical Foundations & Explainability


[61]
F. Ott, D. Rügamer, L. Heublein, B. Bischl and C. Mutschler.
Representation Learning for Tablet and Paper Domain Adaptation in favor of Online Handwriting Recognition.
7th International Workshop on Multimodal pattern recognition of social signals in human computer interaction (MPRSS 2022) at the 26th International Conference on Pattern Recognition (ICPR 2022). Montreal, Canada, Aug 21-25, 2022. arXiv.
MCML Authors
Link to Felix Ott

Felix Ott

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[60]
F. Ott, N. L. Raichur, D. Rügamer, T. Feigl, H. Neumann, B. Bischl and C. Mutschler.
Benchmarking Visual-Inertial Deep Multimodal Fusion for Relative Pose Regression and Odometry-aided Absolute Pose Regression.
Preprint at arXiv (Aug. 2022). arXiv.
MCML Authors
Link to Felix Ott

Felix Ott

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[59]
L. Schneider, L. Schäpermeier, R. P. Prager, B. Bischl, H. Trautmann and P. Kerschke.
HPO X ELA: Investigating Hyperparameter Optimization Landscapes by Means of Exploratory Landscape Analysis.
Preprint at arXiv (Aug. 2022). arXiv.
MCML Authors
Link to Lennart Schneider

Lennart Schneider

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[58]
F. Pfisterer, L. Schneider, J. Moosbauer, M. Binder and B. Bischl.
YAHPO Gym - Design Criteria and a new Multifidelity Benchmark for Hyperparameter Optimization.
1st International Conference on Automated Machine Learning (AutoML 2022) co-located with the 39th International Conference on Machine Learning (ICML 2022). Baltimore, MD, USA, Jul 25-27, 2022. URL. GitHub.
Abstract

When developing and analyzing new hyperparameter optimization (HPO) methods, it is vital to empirically evaluate and compare them on well-curated benchmark suites. In this work, we list desirable properties and requirements for such benchmarks and propose a new set of challenging and relevant multifidelity HPO benchmark problems motivated by these requirements. For this, we revisit the concept of surrogate-based benchmarks and empirically compare them to more widely-used tabular benchmarks, showing that the latter ones may induce bias in performance estimation and ranking of HPO methods. We present a new surrogate-based benchmark suite for multifidelity HPO methods consisting of 9 benchmark collections that constitute over 700 multifidelity HPO problems in total. All our benchmarks also allow for querying of multiple optimization targets, enabling the benchmarking of multi-objective HPO. We examine and compare our benchmark suite with respect to the defined requirements and show that our benchmarks provide viable additions to existing suites.
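
To make the tabular-versus-surrogate distinction concrete, here is a minimal hypothetical sketch in R; the function names and table layout are illustrative, not the YAHPO Gym API. A tabular benchmark can only answer queries at pre-evaluated grid points, whereas a surrogate benchmark predicts performance for arbitrary configurations.

```r
# Hypothetical sketch (names illustrative, not the YAHPO Gym API).
# Tabular benchmark: look up a pre-evaluated configuration; off-grid
# queries cannot be answered.
tabular_eval <- function(results_table, config) {
  hit <- merge(results_table, config)      # join on shared hyperparameter columns
  if (nrow(hit) == 0) NA_real_ else hit$y  # assumes a performance column `y`
}

# Surrogate benchmark: a regression model fitted on prior evaluations
# predicts performance for any configuration, including off-grid ones.
surrogate_eval <- function(surrogate, config) {
  predict(surrogate, newdata = config)
}
```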

MCML Authors
Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Lennart Schneider

Lennart Schneider

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Julia Moosbauer

Julia Moosbauer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[57]
L. Schneider, F. Pfisterer, P. Kent, J. Branke, B. Bischl and J. Thomas.
Tackling neural architecture search with quality diversity optimization.
1st International Conference on Automated Machine Learning (AutoML 2022) co-located with the 39th International Conference on Machine Learning (ICML 2022). Baltimore, MD, USA, Jul 25-27, 2022. URL.
Abstract

Neural architecture search (NAS) has been studied extensively and has grown to become a research field with substantial impact. While classical single-objective NAS searches for the architecture with the best performance, multi-objective NAS considers multiple objectives that should be optimized simultaneously, e.g., minimizing resource usage along with the validation error. Although considerable progress has been made in the field of multi-objective NAS, we argue that there is some discrepancy between the actual optimization problem of practical interest and the optimization problem that multi-objective NAS tries to solve. We resolve this discrepancy by formulating the multi-objective NAS problem as a quality diversity optimization (QDO) problem and introduce three quality diversity NAS optimizers (two of them belonging to the group of multifidelity optimizers), which search for high-performing yet diverse architectures that are optimal for application-specific niches, e.g., hardware constraints. By comparing these optimizers to their multi-objective counterparts, we demonstrate that quality diversity NAS in general outperforms multi-objective NAS with respect to quality of solutions and efficiency. We further show how applications and future NAS research can thrive on QDO.
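
The core quality diversity mechanism can be sketched in a few lines of R; this illustrates the general QDO idea, not the paper's three optimizers. An archive keeps the best-performing solution per application-specific niche (for example, a hardware budget), so diversity is maintained across niches while quality improves within each.

```r
# Illustrative QD archive update: a candidate replaces the incumbent elite
# of its niche only if it performs better; other niches are untouched.
qd_update <- function(archive, candidate) {
  incumbent <- archive[[candidate$niche]]
  if (is.null(incumbent) || candidate$score > incumbent$score) {
    archive[[candidate$niche]] <- candidate
  }
  archive
}

archive <- list()
archive <- qd_update(archive, list(niche = "under_1M_params", score = 0.91, arch = "A"))
archive <- qd_update(archive, list(niche = "under_1M_params", score = 0.89, arch = "B"))  # rejected
archive <- qd_update(archive, list(niche = "under_10M_params", score = 0.95, arch = "C"))
names(archive)  # one elite per niche
```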

MCML Authors
Link to Lennart Schneider

Lennart Schneider

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[56]
A. Klaß, S. M. Lorenz, M. W. Lauer-Schmaltz, D. Rügamer, B. Bischl, C. Mutschler and F. Ott.
Uncertainty-aware Evaluation of Time-Series Classification for Online Handwriting Recognition with Domain Shift.
Workshop on Spatio-Temporal Reasoning and Learning (STRL 2022) at the 31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence (IJCAI-ECAI 2022). Vienna, Austria, Jul 23-29, 2022. URL.
MCML Authors
Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Felix Ott

Felix Ott

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[55]
A. Khakzar, Y. Li, Y. Zhang, M. Sanisoglu, S. T. Kim, M. Rezaei, B. Bischl and N. Navab.
Analyzing the Effects of Handling Data Imbalance on Learned Features from Medical Images by Looking Into the Models.
2nd Workshop on Interpretable Machine Learning in Healthcare (IMLH 2022) at the 39th International Conference on Machine Learning (ICML 2022). Baltimore, MD, USA, Jul 17-23, 2022. arXiv.
MCML Authors
Link to Ashkan Khakzar

Ashkan Khakzar

Dr.

* Former member

C1 | Medicine

Link to Yawei Li

Yawei Li

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Mina Rezaei

Mina Rezaei

Dr.

Statistical Learning & Data Science

Education Coordination

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Nassir Navab

Nassir Navab

Prof. Dr.

Computer Aided Medical Procedures & Augmented Reality

C1 | Medicine


[54]
S. Dandl, F. Pfisterer and B. Bischl.
Multi-Objective Counterfactual Fairness.
Genetic and Evolutionary Computation Conference (GECCO 2022). Boston, MA, USA, Jul 09-13, 2022. DOI.
MCML Authors
Link to Susanne Dandl

Susanne Dandl

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[53]
L. Schneider, F. Pfisterer, J. Thomas and B. Bischl.
A Collection of Quality Diversity Optimization Problems Derived from Hyperparameter Optimization of Machine Learning Models.
Genetic and Evolutionary Computation Conference (GECCO 2022). Boston, MA, USA, Jul 09-13, 2022. DOI.
MCML Authors
Link to Lennart Schneider

Lennart Schneider

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[52]
Q. Au, J. Herbinger, C. Stachl, B. Bischl and G. Casalicchio.
Grouped Feature Importance and Combined Features Effect Plot.
Data Mining and Knowledge Discovery 36 (Jun. 2022). DOI.
Abstract

Interpretable machine learning has become a very active area of research due to the rising popularity of machine learning algorithms and their inherently challenging interpretability. Most work in this area has been focused on the interpretation of single features in a model. However, for researchers and practitioners, it is often equally important to quantify the importance or visualize the effect of feature groups. To address this research gap, we provide a comprehensive overview of how existing model-agnostic techniques can be defined for feature groups to assess the grouped feature importance, focusing on permutation-based, refitting, and Shapley-based methods. We also introduce an importance-based sequential procedure that identifies a stable and well-performing combination of features in the grouped feature space. Furthermore, we introduce the combined features effect plot, which is a technique to visualize the effect of a group of features based on a sparse, interpretable linear combination of features. We used simulation studies and real data examples to analyze, compare, and discuss these methods.
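
A minimal R sketch of the permutation-based variant described above, under the assumption of a generic fitted model with a predict() method and a pointwise loss; the helper name is ours. Permuting the grouped columns jointly preserves the dependency structure within the group.

```r
# Grouped permutation feature importance (sketch): permute all columns of
# a feature group with one shared permutation and measure the loss increase.
grouped_permutation_importance <- function(model, X, y, group, loss, n_rep = 10) {
  base_loss <- mean(loss(y, predict(model, X)))
  perm_losses <- replicate(n_rep, {
    X_perm <- X
    idx <- sample(nrow(X))            # one joint permutation for the whole group
    X_perm[, group] <- X[idx, group]  # keeps within-group dependencies intact
    mean(loss(y, predict(model, X_perm)))
  })
  mean(perm_losses) - base_loss       # importance = average increase in loss
}

# Example with a linear model and squared error loss:
fit <- lm(mpg ~ ., data = mtcars)
grouped_permutation_importance(fit, mtcars, mtcars$mpg,
                               group = c("cyl", "disp"),
                               loss = function(y, p) (y - p)^2)
```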

MCML Authors
Link to Julia Herbinger

Julia Herbinger

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[51]
J. Moosbauer, G. Casalicchio, M. Lindauer and B. Bischl.
Enhancing Explainability of Hyperparameter Optimization via Bayesian Algorithm Execution.
Preprint at arXiv (Jun. 2022). arXiv.
Abstract

Despite all the benefits of automated hyperparameter optimization (HPO), most modern HPO algorithms are black-boxes themselves. This makes it difficult to understand the decision process which leads to the selected configuration, reduces trust in HPO, and thus hinders its broad adoption. Here, we study the combination of HPO with interpretable machine learning (IML) methods such as partial dependence plots. These techniques are more and more used to explain the marginal effect of hyperparameters on the black-box cost function or to quantify the importance of hyperparameters. However, if such methods are naively applied to the experimental data of the HPO process in a post-hoc manner, the underlying sampling bias of the optimizer can distort interpretations. We propose a modified HPO method which efficiently balances the search for the global optimum w.r.t. predictive performance and the reliable estimation of IML explanations of an underlying black-box function by coupling Bayesian optimization and Bayesian Algorithm Execution. On benchmark cases of both synthetic objectives and HPO of a neural network, we demonstrate that our method returns more reliable explanations of the underlying black-box without a loss of optimization performance.

MCML Authors
Link to Julia Moosbauer

Julia Moosbauer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[50]
P. Kopper, S. Wiegrebe, B. Bischl, A. Bender and D. Rügamer.
DeepPAMM: Deep Piecewise Exponential Additive Mixed Models for Complex Hazard Structures in Survival Analysis.
26th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2022). Chengdu, China, May 16-19, 2022. DOI.
Abstract

Survival analysis (SA) is an active field of research that is concerned with time-to-event outcomes and is prevalent in many domains, particularly biomedical applications. Despite its importance, SA remains challenging due to small-scale data sets and complex outcome distributions, concealed by truncation and censoring processes. The piecewise exponential additive mixed model (PAMM) is a model class addressing many of these challenges, yet PAMMs are not applicable in high-dimensional feature settings or in the case of unstructured or multimodal data. We unify existing approaches by proposing DeepPAMM, a versatile deep learning framework that is well-founded from a statistical point of view, yet with enough flexibility for modeling complex hazard structures. We illustrate that DeepPAMM is competitive with other machine learning approaches with respect to predictive performance while maintaining interpretability through benchmark experiments and an extended case study.
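
For readers unfamiliar with the model class, the piecewise exponential backbone can be written compactly; this is standard PAMM notation, not specific to the paper.

```latex
% Follow-up time is partitioned into intervals (\kappa_{j-1}, \kappa_j];
% within interval j the hazard is constant in t and the log-hazard is an
% additive function of a representative time point t_j and features x:
\lambda(t \mid \mathbf{x}) = \exp\!\bigl( f(t_j, \mathbf{x}) \bigr),
\qquad t \in (\kappa_{j-1}, \kappa_j].
```

DeepPAMM embeds the additive predictor f in a neural network, so structured, interpretable regression terms and deep components (for unstructured or multimodal inputs) are estimated jointly.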

MCML Authors
Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Andreas Bender

Andreas Bender

Dr.

Statistical Learning & Data Science

Coordinator Statistical and Machine Learning Consulting

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[49]
L. Bothmann, K. Peters and B. Bischl.
What Is Fairness? Implications For FairML.
Preprint at arXiv (May 2022). arXiv.
MCML Authors
Link to Ludwig Bothmann

Ludwig Bothmann

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[48]
J. Herbinger, B. Bischl and G. Casalicchio.
REPID: Regional Effect Plots with implicit Interaction Detection.
25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022). Virtual, Mar 28-30, 2022. URL.
Abstract

Machine learning models can automatically learn complex relationships, such as non-linear and interaction effects. Interpretable machine learning methods such as partial dependence plots visualize marginal feature effects but may lead to misleading interpretations when feature interactions are present. Hence, employing additional methods that can detect and measure the strength of interactions is paramount to better understand the inner workings of machine learning models. We demonstrate several drawbacks of existing global interaction detection approaches, characterize them theoretically, and evaluate them empirically. Furthermore, we introduce regional effect plots with implicit interaction detection, a novel framework to detect interactions between a feature of interest and other features. The framework also quantifies the strength of interactions and provides interpretable and distinct regions in which feature effects can be interpreted more reliably, as they are less confounded by interactions. We prove the theoretical eligibility of our method and show its applicability on various simulation and real-world examples.
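
A minimal R sketch of the individual conditional expectation (ICE) curves on which such regional effect methods build; this is illustrative, not the REPID implementation. Heterogeneity across ICE curves signals interactions, and the framework searches for regions in which the curves align.

```r
# ICE curves (sketch): for each observation, vary the feature of interest
# over a grid while holding the remaining features fixed.
ice_curves <- function(model, X, feature, grid) {
  sapply(grid, function(g) {
    X_mod <- X
    X_mod[[feature]] <- g
    predict(model, X_mod)
  })  # n x length(grid) matrix: row i is the ICE curve of observation i
}

fit <- lm(mpg ~ hp * wt, data = mtcars)                    # model with an interaction
curves <- ice_curves(fit, mtcars, "hp", seq(50, 330, 20))  # heterogeneous rows signal the hp:wt interaction
```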

MCML Authors
Link to Julia Herbinger

Julia Herbinger

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[47]
F. Pargent, F. Pfisterer, J. Thomas and B. Bischl.
Regularized target encoding outperforms traditional methods in supervised machine learning with high cardinality features.
Computational Statistics 37 (Mar. 2022). DOI.
MCML Authors
Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[46]
F. Ott, D. Rügamer, L. Heublein, B. Bischl and C. Mutschler.
Joint Classification and Trajectory Regression of Online Handwriting Using a Multi-Task Learning Approach.
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2022). Waikoloa, Hawaii, Jan 04-08, 2022. DOI.
MCML Authors
Link to Felix Ott

Felix Ott

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[45]
F. Ott, D. Rügamer, L. Heublein, T. Hamann, J. Barth, B. Bischl and C. Mutschler.
Benchmarking online sequence-to-sequence and character-based handwriting recognition from IMU-enhanced pens.
International Journal on Document Analysis and Recognition 25.4 (2022). DOI.
MCML Authors
Link to Felix Ott

Felix Ott

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[44]
E. Dorigatti, J. Goschenhofer, B. Schubert, M. Rezaei and B. Bischl.
Positive-Unlabeled Learning with Uncertainty-aware Pseudo-label Selection.
Preprint at arXiv (Jan. 2022). arXiv.
MCML Authors
Link to Emilio Dorigatti

Emilio Dorigatti

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Jann Goschenhofer

Jann Goschenhofer

* Former member

A1 | Statistical Foundations & Explainability

Link to Mina Rezaei

Mina Rezaei

Dr.

Statistical Learning & Data Science

Education Coordination

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[43]
C. A. Scholbeck, G. Casalicchio, C. Molnar, B. Bischl and C. Heumann.
Marginal Effects for Non-Linear Prediction Functions.
Under review (Jan. 2022). arXiv.
Abstract

Beta coefficients for linear regression models represent the ideal form of an interpretable feature effect. However, for non-linear models and especially generalized linear models, the estimated coefficients cannot be interpreted as a direct feature effect on the predicted outcome. Hence, marginal effects are typically used as approximations for feature effects, either in the shape of derivatives of the prediction function or forward differences in prediction due to a change in a feature value. While marginal effects are commonly used in many scientific fields, they have not yet been adopted as a model-agnostic interpretation method for machine learning models. This may stem from their inflexibility as a univariate feature effect and their inability to deal with the non-linearities found in black box models. We introduce a new class of marginal effects termed forward marginal effects. We argue to abandon derivatives in favor of better-interpretable forward differences. Furthermore, we generalize marginal effects based on forward differences to multivariate changes in feature values. To account for the non-linearity of prediction functions, we introduce a non-linearity measure for marginal effects. We argue against summarizing feature effects of a non-linear prediction function in a single metric such as the average marginal effect. Instead, we propose to partition the feature space to compute conditional average marginal effects on feature subspaces, which serve as conditional feature effect estimates.
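
The central quantity can be stated in one line, following the description above: a forward marginal effect is a forward difference of the prediction function, generalized to multivariate step sizes.

```latex
% Forward marginal effect of a (possibly multivariate) change h in the
% feature values of an observation x, for prediction function \hat{f}:
\mathrm{FME}_{\mathbf{x}, \mathbf{h}}
  = \hat{f}(\mathbf{x} + \mathbf{h}) - \hat{f}(\mathbf{x}).
```

Conditional average marginal effects then average this quantity over observations within a feature subspace rather than over the entire data, avoiding a single global summary.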

MCML Authors
Link to Christian Scholbeck

Christian Scholbeck

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[42]
J. Moosbauer, J. Herbinger, G. Casalicchio, M. Lindauer and B. Bischl.
Explaining Hyperparameter Optimization via Partial Dependence Plots.
35th Conference on Neural Information Processing Systems (NeurIPS 2021). Virtual, Dec 06-14, 2021. URL. GitHub.
Abstract

Automated hyperparameter optimization (HPO) can support practitioners to obtain peak performance in machine learning models. However, there is often a lack of valuable insights into the effects of different hyperparameters on the final model performance. This lack of explainability makes it difficult to trust and understand the automated HPO process and its results. We suggest using interpretable machine learning (IML) to gain insights from the experimental data obtained during HPO with Bayesian optimization (BO). BO tends to focus on promising regions with potential high-performance configurations and thus induces a sampling bias. Hence, many IML techniques, such as the partial dependence plot (PDP), carry the risk of generating biased interpretations. By leveraging the posterior uncertainty of the BO surrogate model, we introduce a variant of the PDP with estimated confidence bands. We propose to partition the hyperparameter space to obtain more confident and reliable PDPs in relevant sub-regions. In an experimental study, we provide quantitative evidence for the increased quality of the PDPs within sub-regions.
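
In standard PDP notation, the estimator reads as follows (a sketch consistent with the abstract; the symbols are ours): for hyperparameters of interest λ_S with complement λ_C, the marginal effect is estimated by averaging the surrogate model over sampled complement values, and because a BO surrogate also carries a posterior variance, the same averaging yields confidence bands.

```latex
% PDP of the estimated cost over hyperparameters \lambda_S, averaging the
% surrogate's posterior mean \hat{m} over n sampled complements \lambda_C:
\hat{c}_S(\boldsymbol{\lambda}_S)
  = \frac{1}{n} \sum_{i=1}^{n}
    \hat{m}\bigl(\boldsymbol{\lambda}_S, \boldsymbol{\lambda}_C^{(i)}\bigr).
```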

MCML Authors
Link to Julia Moosbauer

Julia Moosbauer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Julia Herbinger

Julia Herbinger

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[41]
T. Weber, M. Ingrisch, M. Fabritius, B. Bischl and D. Rügamer.
Survival-oriented embeddings for improving accessibility to complex data structures.
Workshop on Bridging the Gap: from Machine Learning Research to Clinical Practice at the 35th Conference on Neural Information Processing Systems (NeurIPS 2021). Virtual, Dec 06-14, 2021. arXiv.
MCML Authors
Link to Tobias Weber

Tobias Weber

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Michael Ingrisch

Michael Ingrisch

Prof. Dr.

Clinical Data Science in Radiology

C1 | Medicine

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[40]
T. Weber, M. Ingrisch, B. Bischl and D. Rügamer.
Towards modelling hazard factors in unstructured data spaces using gradient-based latent interpolation.
Workshop on Deep Generative Models and Downstream Applications at the 35th Conference on Neural Information Processing Systems (NeurIPS 2021). Virtual, Dec 06-14, 2021. PDF.
MCML Authors
Link to Tobias Weber

Tobias Weber

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Michael Ingrisch

Michael Ingrisch

Prof. Dr.

Clinical Data Science in Radiology

C1 | Medicine

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[39]
S. Coors, D. Schalk, B. Bischl and D. Rügamer.
Automatic Componentwise Boosting: An Interpretable AutoML System.
Automating Data Science Workshop (ADS 2021) at the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD 2021). Virtual, Sep 13-17, 2021. arXiv.
Abstract

In practice, machine learning (ML) workflows require various different steps, from data preprocessing, missing value imputation, model selection, to model tuning as well as model evaluation. Many of these steps rely on human ML experts. AutoML - the field of automating these ML pipelines - tries to help practitioners to apply ML off-the-shelf without any expert knowledge. Most modern AutoML systems like auto-sklearn, H2O AutoML or TPOT aim for high predictive performance, thereby generating ensembles that consist almost exclusively of black-box models. This, in turn, makes the interpretation for the layperson more intricate and adds another layer of opacity for users. We propose an AutoML system that constructs an interpretable additive model that can be fitted using a highly scalable componentwise boosting algorithm. Our system provides tools for easy model interpretation such as visualizing partial effects and pairwise interactions, allows for a straightforward calculation of feature importance, and gives insights into the required model complexity to fit the given task. We introduce the general framework and outline its implementation autocompboost. To demonstrate the framework's efficacy, we compare autocompboost to other existing systems based on the OpenML AutoML-Benchmark. Despite its restriction to an interpretable model space, our system is competitive in terms of predictive performance on most data sets while being more user-friendly and transparent.
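
As an illustration of componentwise boosting, here is a short fit with the established R package mboost rather than autocompboost itself; dataset and formula are arbitrary. Each base learner is a small interpretable component, and each boosting iteration updates only the best-fitting one, which yields sparse additive models.

```r
# Componentwise boosting with interpretable base learners (illustrative,
# not the autocompboost API).
library(mboost)

fit <- gamboost(
  mpg ~ bols(cyl) + bbs(disp) + bbs(hp),  # linear (bols) and smooth (bbs) components
  data = mtcars,
  control = boost_control(mstop = 100)    # 100 boosting iterations
)

# How often each component was selected across iterations, a direct
# sparsity/importance readout:
table(fit$xselect())
```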

MCML Authors
Link to Stefan Coors

Stefan Coors

* Former member

A1 | Statistical Foundations & Explainability

Link to Daniel Schalk

Daniel Schalk

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[38]
R. Sonabend, F. J. Király, A. Bender, B. Bischl and M. Lang.
mlr3proba: An R Package for Machine Learning in Survival Analysis.
Bioinformatics 37.17 (Sep. 2021). DOI.
MCML Authors
Link to Andreas Bender

Andreas Bender

Dr.

Statistical Learning & Data Science

Coordinator Statistical and Machine Learning Consulting

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Michel Lang

Dr.

* Former member

A1 | Statistical Foundations & Explainability


[37]
F. Soleymani, M. Eslami, T. Elze, B. Bischl and M. Rezaei.
Deep Variational Clustering Framework for Self-labeling of Large-scale Medical Images.
Preprint at arXiv (Sep. 2021). arXiv.
MCML Authors
Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Mina Rezaei

Mina Rezaei

Dr.

Statistical Learning & Data Science

Education Coordination

A1 | Statistical Foundations & Explainability


[36]
F. Pfisterer, C. Kern, S. Dandl, M. Sun, M. P. Kim and B. Bischl.
mcboost: Multi-Calibration Boosting for R.
The Journal of Open Source Software 6.64 (Aug. 2021). DOI.
MCML Authors
Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Christoph Kern

Christoph Kern

Prof. Dr.

Social Data Science and AI Lab

C4 | Computational Social Sciences

Link to Susanne Dandl

Susanne Dandl

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[35]
P. Gijsbers, F. Pfisterer, J. van Rijn, B. Bischl and J. Vanschoren.
Meta-Learning for Symbolic Hyperparameter Defaults.
Genetic and Evolutionary Computation Conference (GECCO 2021). Lille, France, Jul 10-14, 2021. DOI.
Abstract

Hyperparameter optimization in machine learning (ML) deals with the problem of empirically learning an optimal algorithm configuration from data, usually formulated as a black-box optimization problem. In this work, we propose a zero-shot method to meta-learn symbolic default hyperparameter configurations that are expressed in terms of the properties of the dataset. This enables a much faster, but still data-dependent, configuration of the ML algorithm, compared to standard hyperparameter optimization approaches. In the past, symbolic and static default values have usually been obtained as hand-crafted heuristics. We propose an approach of learning such symbolic configurations as formulas of dataset properties from a large set of prior evaluations on multiple datasets by optimizing over a grammar of expressions using an evolutionary algorithm. We evaluate our method on surrogate empirical performance models as well as on real data across 6 ML algorithms on more than 100 datasets and demonstrate that our method indeed finds viable symbolic defaults.
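
A symbolic default is a formula in dataset properties rather than a constant. A familiar hand-crafted example of exactly this shape is the random forest heuristic mtry = floor(sqrt(p)); the paper learns such formulas from prior experiments instead of fixing them by hand. A trivial R illustration:

```r
# A symbolic default maps dataset properties to a hyperparameter value
# (illustrative sketch; the hand-crafted sqrt heuristic stands in for a
# learned formula).
symbolic_default_mtry <- function(n_features) floor(sqrt(n_features))

symbolic_default_mtry(100)  # 10: adapts to the dataset, unlike a static default
```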

MCML Authors
Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[34]
F. Pfisterer, J. van Rijn, P. Probst, A. Müller and B. Bischl.
Learning Multiple Defaults for Machine Learning Algorithms.
Genetic and Evolutionary Computation Conference (GECCO 2021). Lille, France, Jul 10-14, 2021. DOI.
MCML Authors
Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[33]
M. Binder, F. Pfisterer, M. Lang, L. Schneider, L. Kotthoff and B. Bischl.
mlr3pipelines - Flexible Machine Learning Pipelines in R.
Journal of Machine Learning Research 22.184 (Jun. 2021). URL.
MCML Authors
Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Michel Lang

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Lennart Schneider

Lennart Schneider

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[32]
P. Kopper, S. Pölsterl, C. Wachinger, B. Bischl, A. Bender and D. Rügamer.
Semi-Structured Deep Piecewise Exponential Models.
AAAI Spring Symposium Series on Survival Prediction: Algorithms, Challenges and Applications (AAAI-SPACA 2021). Palo Alto, California, USA, Mar 21-24, 2021. PDF.
Abstract

We propose a versatile framework for survival analysis that combines advanced concepts from statistics with deep learning. The presented framework is based on piecewise exponential models and thereby supports various survival tasks, such as competing risks and multi-state modeling, and further allows for estimation of time-varying effects and time-varying features. To also include multiple data sources and higher-order interaction effects into the model, we embed the model class in a neural network and thereby enable the simultaneous estimation of both inherently interpretable structured regression inputs as well as deep neural network components which can potentially process additional unstructured data sources. A proof of concept is provided by using the framework to predict Alzheimer’s disease progression based on tabular and 3D point cloud data and applying it to synthetic data.

MCML Authors
Link to Christian Wachinger

Christian Wachinger

Prof. Dr.

Artificial Intelligence in Radiology

C1 | Medicine

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Andreas Bender

Andreas Bender

Dr.

Statistical Learning & Data Science

Coordinator Statistical and Machine Learning Consulting

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability


[31]
J. Goschenhofer, R. Hvingelby, D. Rügamer, J. Thomas, M. Wagner and B. Bischl.
Deep Semi-Supervised Learning for Time Series Classification.
Preprint at arXiv (Feb. 2021). arXiv.
MCML Authors
Link to Jann Goschenhofer

Jann Goschenhofer

* Former member

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[30]
G. König, C. Molnar, B. Bischl and M. Grosse-Wentrup.
Relative Feature Importance.
25th International Conference on Pattern Recognition (ICPR 2020). Virtual - Milano, Italy, Jan 10-15, 2021. DOI.
Abstract

Interpretable Machine Learning (IML) methods are used to gain insight into the relevance of a feature of interest for the performance of a model. Commonly used IML methods differ in whether they consider features of interest in isolation, e.g., Permutation Feature Importance (PFI), or in relation to all remaining feature variables, e.g., Conditional Feature Importance (CFI). As such, the perturbation mechanisms inherent to PFI and CFI represent extreme reference points. We introduce Relative Feature Importance (RFI), a generalization of PFI and CFI that allows for a more nuanced feature importance computation beyond the PFI versus CFI dichotomy. With RFI, the importance of a feature relative to any other subset of features can be assessed, including variables that were not available at training time. We derive general interpretation rules for RFI based on a detailed theoretical analysis of the implications of relative feature relevance, and demonstrate the method's usefulness on simulated examples.
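
As a rough illustration of the idea, not the paper's estimator (which requires sampling from a conditional distribution): relative to a single discrete conditioning variable, the conditional perturbation can be approximated by permuting the feature of interest within strata.

```r
# Crude RFI-style approximation: shuffle `feature` within strata of a
# discrete conditioning variable and measure the increase in loss.
rfi_within_strata <- function(model, X, y, feature, strata, loss) {
  base_loss <- mean(loss(y, predict(model, X)))
  X_pert <- X
  for (idx in split(seq_len(nrow(X)), X[[strata]])) {
    X_pert[idx, feature] <- X[sample(idx), feature]  # permute within stratum
  }
  mean(loss(y, predict(model, X_pert))) - base_loss
}

fit <- lm(mpg ~ ., data = mtcars)
rfi_within_strata(fit, mtcars, mtcars$mpg, feature = "hp", strata = "cyl",
                  loss = function(y, p) (y - p)^2)
```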

MCML Authors
Link to Gunnar König

Gunnar König

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Moritz Grosse-Wentrup

Prof. Dr.

* Former member

A1 | Statistical Foundations & Explainability


[29]
M. Becker, S. Gruber, J. Richter, J. Moosbauer and B. Bischl.
mlr3hyperband: Hyperband for 'mlr3'.
2021. URL. GitHub.
MCML Authors
Link to Marc Becker

Marc Becker

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Julia Moosbauer

Julia Moosbauer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[28]
M. Becker, M. Lang, J. Richter, B. Bischl and D. Schalk.
mlr3tuning: Tuning for 'mlr3'.
2021. URL. GitHub.
MCML Authors
Link to Marc Becker

Marc Becker

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Michel Lang

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Daniel Schalk

Daniel Schalk

Dr.

* Former member

A1 | Statistical Foundations & Explainability


[27]
M. Becker, J. Richter, M. Lang, B. Bischl and M. Binder.
bbotk: Black-Box Optimization Toolkit.
2021. URL. GitHub.
MCML Authors
Link to Marc Becker

Marc Becker

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Michel Lang

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability


[26]
M. Lang, B. Bischl, J. Richter, X. Sun and M. Binder.
paradox: Define and Work with Parameter Spaces for Complex Algorithms.
2021. URL. GitHub.
MCML Authors
Michel Lang

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability


[25]
I. Gerostathopoulos, F. Plášil, C. Prehofer, J. Thomas and B. Bischl.
Automated Online Experiment-Driven Adaptation--Mechanics and Cost Aspects.
IEEE Access 9 (2021). DOI.
MCML Authors
Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[24]
A. Agrawal, F. Pfisterer, B. Bischl, J. Chen, S. Sood, S. Shah, F. Buet-Golfouse, B. A. Mateen and S. Vollmer.
Debiasing classifiers: is reality at variance with expectation?
Preprint at arXiv (Nov. 2020). arXiv.
MCML Authors
Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[23]
D. Rügamer, F. Pfisterer and B. Bischl.
Neural Mixture Distributional Regression.
Preprint at arXiv (Oct. 2020). arXiv.
MCML Authors
Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[22]
A. Bender, D. Rügamer, F. Scheipl and B. Bischl.
A General Machine Learning Framework for Survival Analysis.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2020). Virtual, Sep 14-18, 2020. DOI.
MCML Authors
Link to Andreas Bender

Andreas Bender

Dr.

Statistical Learning & Data Science

Coordinator Statistical and Machine Learning Consulting

A1 | Statistical Foundations & Explainability

Link to David Rügamer

David Rügamer

Prof. Dr.

Data Science Group

A1 | Statistical Foundations & Explainability

Link to Fabian Scheipl

Fabian Scheipl

PD Dr.

Functional Data Analysis

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[21]
C. Molnar, G. Casalicchio and B. Bischl.
Interpretable Machine Learning -- A Brief History, State-of-the-Art and Challenges.
Workshops of the European Conference on Machine Learning and Knowledge Discovery in Databases (Workshops ECML-PKDD 2020). Virtual, Sep 14-18, 2020. DOI.
MCML Authors
Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[20]
S. Dandl, C. Molnar, M. Binder and B. Bischl.
Multi-Objective Counterfactual Explanations.
16th International Conference on Parallel Problem Solving from Nature (PPSN 2020). Leiden, Netherlands, Sep 05-09, 2020. DOI.
MCML Authors
Link to Susanne Dandl

Susanne Dandl

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[19]
R. Sonabend, F. J. Király, A. Bender, B. Bischl and M. Lang.
mlr3proba: Machine Learning Survival Analysis in R.
Preprint at arXiv (Aug. 2020). arXiv.
MCML Authors
Link to Andreas Bender

Andreas Bender

Dr.

Statistical Learning & Data Science

Coordinator Statistical and Machine Learning Consulting

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Michel Lang

Dr.

* Former member

A1 | Statistical Foundations & Explainability


[18]
M. Binder, F. Pfisterer and B. Bischl.
Collecting empirical data about hyperparameters for data driven AutoML.
7th Workshop on Automated Machine Learning (AutoML 2020) co-located with the 37th International Conference on Machine Learning (ICML 2020). Virtual, Jul 18, 2020. PDF.
Abstract

All optimization needs some kind of prior over the functions it is optimizing over. We used a large computing cluster to collect empirical data about the behavior of ML performance, by randomly sampling hyperparameter values and performing cross-validation. We also collected information about cross-validation error by performing some evaluations multiple times, and information about progression of performance with respect to training data size by performing some evaluations on data subsets. We present how we collected data, make some preliminary analyses on the surrogate models that can be built with them, and give an outlook over interesting analyses this should enable.
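
The described collection loop maps directly onto the group's own mlr3 stack; the following sketch assumes current mlr3/paradox APIs and a small decision tree search space. Each iteration draws a random configuration and records its cross-validated error.

```r
# Sketch of the data-collection loop: random hyperparameter configurations,
# each evaluated by cross-validation (assumes mlr3 and paradox).
library(mlr3)
library(paradox)

task <- tsk("sonar")
search_space <- ps(
  cp       = p_dbl(lower = 0.001, upper = 0.1),
  minsplit = p_int(lower = 2, upper = 64)
)

results <- lapply(seq_len(20), function(i) {
  config <- generate_design_random(search_space, 1)$data  # one random draw
  learner <- lrn("classif.rpart")
  learner$param_set$values <- as.list(config)
  rr <- resample(task, learner, rsmp("cv", folds = 5))
  cbind(config, ce = rr$aggregate(msr("classif.ce")))     # config + CV error
})
do.call(rbind, results)  # the empirical (configuration, performance) data set
```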

MCML Authors
Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[17]
C. Molnar, G. König, J. Herbinger, T. Freiesleben, S. Dandl, C. A. Scholbeck, G. Casalicchio, M. Grosse-Wentrup and B. Bischl.
General Pitfalls of Model-Agnostic Interpretation Methods for Machine Learning Models.
Workshop on Extending Explainable AI Beyond Deep Models and Classifiers (XXAI 2020) at the 37th International Conference on Machine Learning (ICML 2020). Virtual, Jul 12-18, 2020. DOI.
MCML Authors
Link to Gunnar König

Gunnar König

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Julia Herbinger

Julia Herbinger

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Susanne Dandl

Susanne Dandl

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Christian Scholbeck

Christian Scholbeck

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Moritz Grosse-Wentrup

Prof. Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[16]
M. Binder, J. Moosbauer, J. Thomas and B. Bischl.
Multi-Objective Hyperparameter Tuning and Feature Selection Using Filter Ensembles.
Genetic and Evolutionary Computation Conference (GECCO 2020). Cancun, Mexico, Jul 08-12, 2020. DOI.
Abstract

Both feature selection and hyperparameter tuning are key tasks in machine learning. Hyperparameter tuning is often useful to increase model performance, while feature selection is undertaken to attain sparse models. Sparsity may yield better model interpretability and lower cost of data acquisition, data handling and model inference. While sparsity may have a beneficial or detrimental effect on predictive performance, a small drop in performance may be acceptable in return for a substantial gain in sparseness. We therefore treat feature selection as a multi-objective optimization task. We perform hyperparameter tuning and feature selection simultaneously because the choice of features of a model may influence what hyperparameters perform well. We present, benchmark, and compare two different approaches for multi-objective joint hyperparameter optimization and feature selection: The first uses multi-objective model-based optimization. The second is an evolutionary NSGA-II-based wrapper approach to feature selection which incorporates specialized sampling, mutation and recombination operators. Both methods make use of parameterized filter ensembles. While model-based optimization needs fewer objective evaluations to achieve good performance, it incurs computational overhead compared to the NSGA-II, so the preferred choice depends on the cost of evaluating a model on given data.
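
The multi-objective view can be made concrete with a small generic sketch (not the paper's code): each candidate configuration is scored on predictive error and on sparsity, and the non-dominated candidates form the Pareto front over which both presented methods search.

```r
# Pareto front for (error, number of selected features), both minimized:
# a candidate is dominated if another one is at least as good in both
# objectives and strictly better in one.
pareto_front <- function(error, n_features) {
  dominated <- sapply(seq_along(error), function(i) {
    any(error <= error[i] & n_features <= n_features[i] &
        (error < error[i] | n_features < n_features[i]))
  })
  which(!dominated)
}

pareto_front(error = c(0.10, 0.12, 0.09, 0.15),
             n_features = c(8, 3, 20, 10))  # candidate 4 is dominated by 1
```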

MCML Authors
Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Link to Julia Moosbauer

Julia Moosbauer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[15]
N. Ellenbach, A.-L. Boulesteix, B. Bischl, K. Unger and R. Hornung.
Improved outcome prediction across data sources through robust parameter tuning.
Journal of Classification (Jul. 2020). DOI.
Abstract

In many application areas, prediction rules trained based on high-dimensional data are subsequently applied to make predictions for observations from other sources, but they do not always perform well in this setting. This is because data sets from different sources can feature (slightly) differing distributions, even if they come from similar populations. In the context of high-dimensional data and beyond, most prediction methods involve one or several tuning parameters. Their values are commonly chosen by maximizing the cross-validated prediction performance on the training data. This procedure, however, implicitly presumes that the data to which the prediction rule will be ultimately applied, follow the same distribution as the training data. If this is not the case, less complex prediction rules that slightly underfit the training data may be preferable. Indeed, a tuning parameter does not only control the degree of adjustment of a prediction rule to the training data, but also, more generally, the degree of adjustment to the distribution of the training data. On the basis of this idea, in this paper we compare various approaches including new procedures for choosing tuning parameter values that lead to better generalizing prediction rules than those obtained based on cross-validation. Most of these approaches use an external validation data set. In our extensive comparison study based on a large collection of 15 transcriptomic data sets, tuning on external data and robust tuning with a tuned robustness parameter are the two approaches leading to better generalizing prediction rules.

MCML Authors
Link to Anne-Laure Boulesteix

Anne-Laure Boulesteix

Prof. Dr.

Biometry in Molecular Medicine

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Roman Hornung

Roman Hornung

Dr.

Biometry in Molecular Medicine

A1 | Statistical Foundations & Explainability


[14]
M. Becker, P. Schratz, M. Lang and B. Bischl.
mlr3fselect: Feature Selection for 'mlr3'.
2020. URL.
MCML Authors
Link to Marc Becker

Marc Becker

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Michel Lang

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[13]
M. Binder, F. Pfisterer, L. Schneider, B. Bischl, M. Lang and S. Dandl.
mlr3pipelines: Preprocessing Operators and Pipelines for 'mlr3'.
2020. URL. GitHub.
MCML Authors
Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Lennart Schneider

Lennart Schneider

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Michel Lang

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Susanne Dandl

Susanne Dandl

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[12]
P. Schratz, M. Lang, B. Bischl and M. Binder.
mlr3filters: Filter Based Feature Selection for 'mlr3'.
2020. URL. GitHub.
MCML Authors
Michel Lang

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability


[11]
M. Binder, J. Moosbauer, J. Thomas and B. Bischl.
Multi-Objective Hyperparameter Tuning and Feature Selection using Filter Ensembles.
Preprint at arXiv (Dec. 2019). arXiv.
MCML Authors
Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Link to Julia Moosbauer

Julia Moosbauer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[10]
M. Lang, M. Binder, J. Richter, P. Schratz, F. Pfisterer, S. Coors, Q. Au, G. Casalicchio, L. Kotthoff and B. Bischl.
mlr3: A modern object-oriented machine learning framework in R.
The Journal of Open Source Software 4.44 (Dec. 2019). DOI.
MCML Authors
Michel Lang

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Martin Binder

Martin Binder

Statistical Learning & Data Science

Coordinator for Open Source & Open Data

A1 | Statistical Foundations & Explainability

Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Stefan Coors

Stefan Coors

* Former member

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[9]
F. Pfisterer, L. Beggel, X. Sun, F. Scheipl and B. Bischl.
Benchmarking time series classification -- Functional data vs machine learning approaches.
Preprint at arXiv (Nov. 2019). arXiv.
Abstract

Time series classification problems have drawn increasing attention in the machine learning and statistical community. Closely related is the field of functional data analysis (FDA): it refers to the range of problems that deal with the analysis of data that is continuously indexed over some domain. While often employing different methods, both fields strive to answer similar questions, a common example being classification or regression problems with functional covariates. We study methods from functional data analysis, such as functional generalized additive models, as well as functionality to concatenate (functional-) feature extraction or basis representations with traditional machine learning algorithms like support vector machines or classification trees. In order to assess the methods and implementations, we run a benchmark on a wide variety of representative (time series) data sets, with in-depth analysis of empirical results, and strive to provide a reference ranking for which method(s) to use for non-expert practitioners. Additionally, we provide a software framework in R for functional data analysis for supervised learning, including machine learning and more linear approaches from statistics. This allows convenient access, and in connection with the machine-learning toolbox mlr, those methods can now also be tuned and benchmarked.

MCML Authors
Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Fabian Scheipl

Fabian Scheipl

PD Dr.

Functional Data Analysis

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[8]
F. Pfisterer, J. Thomas and B. Bischl.
Towards Human Centered AutoML.
Preprint at arXiv (Nov. 2019). arXiv.
MCML Authors
Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[7]
L. Beggel, M. Pfeiffer and B. Bischl.
Robust Anomaly Detection in Images Using Adversarial Autoencoders.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2019). Würzburg, Germany, Sep 16-20, 2019. DOI.
MCML Authors
Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[6]
J. Goschenhofer, F. M. J. Pfister, K. A. Yuksel, B. Bischl, U. Fietzek and J. Thomas.
Wearable-based Parkinson's Disease Severity Monitoring using Deep Learning.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2019). Würzburg, Germany, Sep 16-20, 2019. DOI.
MCML Authors
Link to Jann Goschenhofer

Jann Goschenhofer

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[5]
C. Molnar, G. Casalicchio and B. Bischl.
Quantifying Model Complexity via Functional Decomposition for Better Post-hoc Interpretability.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2019). Würzburg, Germany, Sep 16-20, 2019. DOI.
Abstract

Post-hoc model-agnostic interpretation methods such as partial dependence plots can be employed to interpret complex machine learning models. While these interpretation methods can be applied regardless of model complexity, they can produce misleading and verbose results if the model is too complex, especially w.r.t. feature interactions. To quantify the complexity of arbitrary machine learning models, we propose model-agnostic complexity measures based on functional decomposition: number of features used, interaction strength and main effect complexity. We show that post-hoc interpretation of models that minimize the three measures is more reliable and compact. Furthermore, we demonstrate the application of these measures in a multi-objective optimization approach which simultaneously minimizes loss and complexity.
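
One of the three measures, interaction strength, can be sketched as follows, paraphrasing the functional-decomposition idea in our own notation: it measures the share of the prediction function's variation left unexplained by an intercept-plus-main-effects approximation.

```latex
% Interaction strength (sketch): approximation error of a main-effects-only
% decomposition, normalized by the total variation of the prediction f:
\mathrm{IAS} =
  \frac{\mathbb{E}\Bigl[\bigl(f(\mathbf{x}) - f_0 - \textstyle\sum_j f_j(x_j)\bigr)^2\Bigr]}
       {\mathbb{E}\Bigl[\bigl(f(\mathbf{x}) - f_0\bigr)^2\Bigr]},
```

so IAS is zero for a purely additive model and grows as interactions dominate.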

MCML Authors
Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[4]
C. A. Scholbeck, C. Molnar, C. Heumann, B. Bischl and G. Casalicchio.
Sampling, Intervention, Prediction, Aggregation: A Generalized Framework for Model Agnostic Interpretations.
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2019). Würzburg, Germany, Sep 16-20, 2019. DOI.
MCML Authors
Link to Christian Scholbeck

Christian Scholbeck

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[3]
F. Pfisterer, S. Coors, J. Thomas and B. Bischl.
Multi-Objective Automatic Machine Learning with AutoxgboostMC.
Workshops of the European Conference on Machine Learning and Knowledge Discovery in Databases (Workshops ECML-PKDD 2019). Würzburg, Germany, Sep 16-20, 2019. arXiv.
Abstract

AutoML systems are currently rising in popularity, as they can build powerful models without human oversight. They often combine techniques from many different sub-fields of machine learning in order to find a model or set of models that optimize a user-supplied criterion, such as predictive performance. The ultimate goal of such systems is to reduce the amount of time spent on menial tasks, or tasks that can be solved better by algorithms while leaving decisions that require human intelligence to the end-user. In recent years, the importance of other criteria, such as fairness and interpretability, has become more and more apparent. Current AutoML frameworks either do not allow optimizing such secondary criteria or only do so by limiting the system's choice of models and preprocessing steps. We propose to optimize additional criteria defined by the user directly to guide the search towards an optimal machine learning pipeline. In order to demonstrate the need and usefulness of our approach, we provide a simple multi-criteria AutoML system and showcase an exemplary application.

MCML Authors
Link to Florian Pfisterer

Florian Pfisterer

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Stefan Coors

Stefan Coors

* Former member

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[2]
Q. Au, D. Schalk, G. Casalicchio, R. Schoedel, C. Stachl and B. Bischl.
Component-Wise Boosting of Targets for Multi-Output Prediction.
Preprint at arXiv (Apr. 2019). arXiv.
MCML Authors
Link to Daniel Schalk

Daniel Schalk

Dr.

* Former member

A1 | Statistical Foundations & Explainability

Link to Giuseppe Casalicchio

Giuseppe Casalicchio

Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability


[1]
P. Probst, A.-L. Boulesteix and B. Bischl.
Tunability: Importance of Hyperparameters of Machine Learning Algorithms.
Journal of Machine Learning Research 20 (Mar. 2019). PDF.
MCML Authors
Link to Anne-Laure Boulesteix

Anne-Laure Boulesteix

Prof. Dr.

Biometry in Molecular Medicine

A1 | Statistical Foundations & Explainability

Link to Bernd Bischl

Bernd Bischl

Prof. Dr.

Statistical Learning & Data Science

A1 | Statistical Foundations & Explainability