04.09.2024

MCML Researchers With Ten Papers at ECML-PKDD 2024

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2024). Vilnius, Lithuania, 09.09.2024–13.09.2024

We are happy to announce that MCML researchers are represented with ten papers at ECML-PKDD 2024. Congrats to our researchers!

Main Track (8 papers)

H. Baniecki, G. Casalicchio, B. Bischl and P. Biecek.
On the Robustness of Global Feature Effect Explanations.
ECML-PKDD 2024 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Vilnius, Lithuania, Sep 09-13, 2024. DOI

Abstract

We study the robustness of global post-hoc explanations for predictive models trained on tabular data. Effects of predictor features in black-box supervised learning are an essential diagnostic tool for model debugging and scientific discovery in applied sciences. However, how vulnerable they are to data and model perturbations remains an open research question. We introduce several theoretical bounds for evaluating the robustness of partial dependence plots and accumulated local effects. Our experimental results with synthetic and real-world datasets quantify the gap between the best and worst-case scenarios of (mis)interpreting machine learning predictions globally.
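For readers unfamiliar with partial dependence, the quantity whose robustness is studied above can be sketched as follows. This is a minimal illustration of the standard partial dependence estimate, not the paper's code; the function name `partial_dependence` and the toy model are assumptions made for this example.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value.

    predict: callable taking one row (list of floats) and returning a float
    rows:    list of rows (the background data)
    feature: index of the feature of interest
    grid:    values at which to evaluate the curve
    """
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)       # copy so the data stay untouched
            modified[feature] = value  # intervene on one feature
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


# Toy model (hypothetical): f(x) = 2 * x0 + x1
toy_model = lambda r: 2 * r[0] + r[1]
toy_data = [[0.0, 1.0], [5.0, 3.0]]
pd_curve = partial_dependence(toy_model, toy_data, feature=0, grid=[0.0, 1.0, 2.0])
```

Because the curve is an average over the background data, perturbing either the data or the model shifts it, which is the kind of sensitivity the abstract refers to.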

MCML Authors

Giuseppe Casalicchio

Dr.

A1 | Statistical Foundations & Explainability
→ Group Bernd Bischl

Statistical Learning and Data Science

Bernd Bischl

Prof. Dr.

A1 | Statistical Foundations & Explainability

Statistical Learning and Data Science

C. Damke and E. Hüllermeier.
CUQ-GNN: Committee-Based Graph Uncertainty Quantification Using Posterior Networks.
ECML-PKDD 2024 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Vilnius, Lithuania, Sep 09-13, 2024. DOI

Abstract

In this work, we study the influence of domain-specific characteristics when defining a meaningful notion of predictive uncertainty on graph data. Previously, the so-called Graph Posterior Network (GPN) model has been proposed to quantify uncertainty in node classification tasks. Given a graph, it uses Normalizing Flows (NFs) to estimate class densities for each node independently and converts those densities into Dirichlet pseudo-counts, which are then dispersed through the graph using the personalized PageRank (PPR) algorithm. The architecture of GPNs is motivated by a set of three axioms on the properties of its uncertainty estimates. We show that those axioms are not always satisfied in practice and therefore propose the family of Committee-based Uncertainty Quantification Graph Neural Networks (CUQ-GNNs), which combine standard Graph Neural Networks (GNNs) with the NF-based uncertainty estimation of Posterior Networks (PostNets). This approach adapts more flexibly to domain-specific demands on the properties of uncertainty estimates. We compare CUQ-GNN against GPN and other uncertainty quantification approaches on common node classification benchmarks and show that it is effective at producing useful uncertainty estimates.
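The PPR-based dispersion of pseudo-counts mentioned above can be illustrated with a short power-iteration sketch. It is an assumption-laden toy, not the GPN or CUQ-GNN implementation: the symmetric normalization with self-loops, the teleport probability `alpha`, and the function name `ppr_diffuse` are choices made only for this example.

```python
import math


def ppr_diffuse(adj, pseudo_counts, alpha=0.1, iters=200):
    """Spread per-node pseudo-counts over a graph by approximate personalized PageRank.

    adj:           dense adjacency matrix (list of lists, 0/1 entries)
    pseudo_counts: per-node, per-class evidence before diffusion
    alpha:         teleport probability (assumed value)
    """
    n = len(adj)
    n_classes = len(pseudo_counts[0])
    # Add self-loops and normalize symmetrically: D^{-1/2} (A + I) D^{-1/2}
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    a_hat = [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

    h = [row[:] for row in pseudo_counts]
    for _ in range(iters):
        # One step: propagate along edges, then mix the original evidence back in
        h = [
            [
                (1 - alpha) * sum(a_hat[i][k] * h[k][c] for k in range(n))
                + alpha * pseudo_counts[i][c]
                for c in range(n_classes)
            ]
            for i in range(n)
        ]
    return h


# Two connected nodes; only node 0 has evidence, and only for class 0.
diffused = ppr_diffuse([[0, 1], [1, 0]], [[4.0, 0.0], [0.0, 0.0]])
```

In this toy graph, part of node 0's class-0 evidence flows to node 1, so a node with no evidence of its own inherits some from its neighbor.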

MCML Authors

Clemens Damke

A3 | Computational Models
→ Group Eyke Hüllermeier

Artificial Intelligence and Machine Learning

Eyke Hüllermeier

Prof. Dr.

A3 | Computational Models

Artificial Intelligence and Machine Learning

R. Fischer, M. Wever, S. Buschjäger and T. Liebig.
MetaQuRe: Meta-learning from Model Quality and Resource Consumption.
ECML-PKDD 2024 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Vilnius, Lithuania, Sep 09-13, 2024. DOI

Abstract

Automated machine learning (AutoML) allows for selecting, parametrizing, and composing learning algorithms for a given data set. While resources play a pivotal role in neural architecture search, they receive less attention in classical AutoML approaches. In fact, these approaches generally focus only on maximizing predictive quality and disregard the importance of finding resource-efficient solutions. To push resource awareness further, our work explicitly explores how measures such as running time or energy consumption can be better considered in AutoML. Firstly, we propose a novel method for algorithm selection that balances multiple performance aspects (including resource demand) as prioritized by the user with the help of compositional meta-learning. Secondly, to foster research on green meta-learning and AutoML, we release the MetaQuRe data set, which contains information on predictive (Qu)ality and (Re)source consumption of models evaluated across hundreds of data sets and four execution environments. We use this data to put our methodology into practice and conduct an in-depth analysis of how our approach and data set can help in making AutoML more resource-aware, which represents our third contribution. Lastly, we publish MetaQuRe alongside an extensive code base, allowing for reproducing all results, expanding our data with results from custom environments, and exploring MetaQuRe interactively. In short, our work demonstrates both the importance and the benefits of rethinking AutoML and meta-learning in a resource-aware way, thus paving the path for making future ML solutions more sustainable.

MCML Authors

Marcel Wever

Dr.

A3 | Computational Models
→ Group Eyke Hüllermeier

* Former Member

S. Gilhuber, A. Beer, Y. Ma and T. Seidl.
FALCUN: A Simple and Efficient Deep Active Learning Strategy.
ECML-PKDD 2024 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Vilnius, Lithuania, Sep 09-13, 2024. DOI

Abstract

We propose FALCUN, a novel deep batch active learning method that is label- and time-efficient. Our proposed acquisition uses a natural, self-adjusting balance of uncertainty and diversity: It slowly transitions from emphasizing uncertain instances at the decision boundary to emphasizing batch diversity. In contrast, established deep active learning methods often have a fixed weighting of uncertainty and diversity, limiting their effectiveness over diverse data sets exhibiting different characteristics. Moreover, to increase diversity, most methods demand intensive search through a deep neural network’s high-dimensional latent embedding space. This leads to high acquisition times when experts are idle while waiting for the next batch for annotation. We overcome this structural problem by exclusively operating on the low-dimensional probability space, yielding much faster acquisition times without sacrificing label efficiency. In extensive experiments, we show FALCUN’s suitability for diverse use cases, including medical images and tabular data. Compared to state-of-the-art methods like BADGE, CLUE, and AlfaMix, FALCUN consistently excels in quality and speed: while FALCUN is among the fastest methods, it has the highest average label efficiency.

MCML Authors

Sandra Gilhuber (née Obermeier)

A3 | Computational Models
→ Group Thomas Seidl

Database Systems and Data Mining AI Lab

Yunpu Ma

Dr.

A3 | Computational Models
→ Group Volker Tresp

Database Systems and Data Mining AI Lab

Thomas Seidl

Prof. Dr.

A3 | Computational Models

Database Systems and Data Mining AI Lab

P. Jahn, C. M. M. Frey, A. Beer, C. Leiber and T. Seidl.
Data with Density-Based Clusters: A Generator for Systematic Evaluation of Clustering Algorithms.
ECML-PKDD 2024 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Vilnius, Lithuania, Sep 09-13, 2024. DOI GitHub

Abstract

Mining data containing density-based clusters is well-established and widespread but faces problems when it comes to systematic and reproducible comparison and evaluation. Although the success of clustering methods hinges on data quality and availability, reproducibly generating suitable data for this setting is not easy, leading to mostly low-dimensional toy datasets being used. To resolve this issue, we propose DENSIRED (DENSIty-based Reproducible Experimental Data), a novel data generator for data containing density-based clusters. It is highly flexible w.r.t. a large variety of properties of the data and produces reproducible datasets in a two-step approach. First, skeletons of the clusters are constructed following a random walk. In the second step, these skeletons are enriched with data samples. DENSIRED enables the systematic generation of data for a robust and reliable analysis of methods aimed toward examining data containing density-connected clusters. In extensive experiments, we analyze the impact of user-defined properties on the generated datasets and the intrinsic dimensionalities of synthesized clusters.
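The two-step idea described above (a random-walk skeleton that is then enriched with samples) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the DENSIRED generator itself: the function names, the fixed step length, and the Gaussian noise around skeleton points are all hypothetical choices for this example.

```python
import math
import random


def random_walk_skeleton(rng, dim, n_core, step):
    """Step 1: place core points along a random walk with fixed step length."""
    point = [0.0] * dim
    skeleton = [point[:]]
    for _ in range(n_core - 1):
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(d * d for d in direction))
        point = [p + step * d / norm for p, d in zip(point, direction)]
        skeleton.append(point[:])
    return skeleton


def enrich(rng, skeleton, n_samples, radius):
    """Step 2: scatter data samples around randomly chosen core points."""
    samples = []
    for _ in range(n_samples):
        core = rng.choice(skeleton)
        samples.append([c + rng.gauss(0.0, radius) for c in core])
    return samples


def generate_cluster(seed, dim=5, n_core=20, n_samples=200, step=1.0, radius=0.2):
    """Seeded generator, so the same seed reproduces the same cluster."""
    rng = random.Random(seed)
    skeleton = random_walk_skeleton(rng, dim, n_core, step)
    return skeleton, enrich(rng, skeleton, n_samples, radius)


skeleton, samples = generate_cluster(seed=7)
```

Seeding a private `random.Random` instance rather than the global generator is what makes the output reproducible regardless of other random calls in the program.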

MCML Authors

Philipp Jahn

A3 | Computational Models
→ Group Thomas Seidl

Database Systems and Data Mining AI Lab

Collin Leiber

Dr.

A3 | Computational Models
→ Group Thomas Seidl

* Former Member

Thomas Seidl

Prof. Dr.

A3 | Computational Models

Database Systems and Data Mining AI Lab

Y. Liu, E. Nie, S. Feng, Z. Hua, Z. Ding, D. Wang, Y. Zhang and H. Schütze.
A Unified Data Augmentation Framework for Low-Resource Multi-Domain Dialogue Generation.
ECML-PKDD 2024 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Vilnius, Lithuania, Sep 09-13, 2024. DOI GitHub

Abstract

Current state-of-the-art dialogue systems heavily rely on extensive training datasets. However, challenges arise in domains where domain-specific training datasets are insufficient or entirely absent. To tackle this challenge, we propose a novel data Augmentation framework for Multi-Domain Dialogue Generation, referred to as AMDG. The AMDG framework consists of a data augmentation process and a two-stage training approach: domain-agnostic training and domain adaptation training. We posit that domain corpora are a blend of domain-agnostic and domain-specific features, with certain representation patterns shared among diverse domains. Domain-agnostic training aims to enable models to learn these common expressive patterns. To construct domain-agnostic dialogue corpora, we employ a de-domaining data processing technique used to remove domain-specific features. By mitigating the effects of domain-specific features, the model trained on the de-domained corpora can effectively learn common expression patterns in different domains. Subsequently, we adapt the learned domain-agnostic features to the target domain through domain adaptation training. We conduct experiments on Chinese dialogue datasets from five different domains and show that AMDG achieves superior performance compared to both direct training on the target domain corpus and collective training on all five domain corpora. Our work underscores AMDG as a viable alternative solution for low-resource multi-domain dialogue generation.

MCML Authors

Yongkang Liu

Dr.

B2 | Natural Language Processing
→ Group Hinrich Schütze

* Former Member

Ercong Nie

B2 | Natural Language Processing
→ Group Hinrich Schütze

Computational Linguistics

Zifeng Ding

A3 | Computational Models
→ Group Volker Tresp

Database Systems and Data Mining AI Lab

Hinrich Schütze

Prof. Dr.

B2 | Natural Language Processing

Computational Linguistics

F. Stermann, I. Chalkidis, A. Vahidi, B. Bischl and M. Rezaei.
Attention-Driven Dropout: A Simple Method to Improve Self-supervised Contrastive Sentence Embeddings.
ECML-PKDD 2024 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Vilnius, Lithuania, Sep 09-13, 2024. DOI

Abstract

Self-contrastive learning has proven effective for vision and natural language tasks. It aims to learn aligned data representations by encoding similar and dissimilar sentence pairs without human annotation. Therefore, data augmentation plays a crucial role in the learned embedding quality. However, in natural language processing (NLP), creating augmented samples for unsupervised contrastive learning is challenging since random editing may modify the semantic meanings of sentences and thus affect learning good representations. In this paper, we introduce a simple yet effective approach dubbed ADD (Attention-Driven Dropout) to generate better-augmented views of sentences to be used in self-contrastive learning. Given a sentence and a Pre-trained Transformer Language Model (PLM), such as RoBERTa, we use the aggregated attention scores of the PLM to remove the less “informative” tokens from the input. We consider two alternative algorithms based on NAIVEAGGREGATION across layers/heads and ATTENTIONROLLOUT [1]. Our approach significantly improves the overall performance of various self-supervised contrastive-based methods, including SIMCSE [14], DIFFCSE [10], and INFOCSE [33] by facilitating the generation of high-quality positive pairs required by these methods. Through empirical evaluations on multiple Semantic Textual Similarity (STS) and Transfer Learning tasks, we observe enhanced performance across the board.
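To make the token-removal idea concrete, the sketch below averages attention weights over all layers, heads, and query positions and drops the tokens that receive the least attention. It is one plausible reading of a naive aggregation, written for illustration only; the function names, the nested-list attention layout, and the choice to score tokens by attention received are assumptions, not the authors' implementation.

```python
def token_scores(attention):
    """Mean attention each token receives.

    attention[layer][head][query][key] holds attention weights (nested lists).
    """
    n_layers = len(attention)
    n_heads = len(attention[0])
    seq_len = len(attention[0][0])
    scores = [0.0] * seq_len
    for layer in attention:
        for head in layer:
            for query_row in head:
                for key, weight in enumerate(query_row):
                    scores[key] += weight
    denom = n_layers * n_heads * seq_len
    return [s / denom for s in scores]


def drop_least_attended(tokens, attention, n_drop):
    """Return the sentence with the n_drop lowest-scoring tokens removed."""
    scores = token_scores(attention)
    order = sorted(range(len(tokens)), key=lambda i: scores[i])
    dropped = set(order[:n_drop])
    return [t for i, t in enumerate(tokens) if i not in dropped]


# Toy input: one layer, one head, three tokens (weights are made up).
toy_tokens = ["the", "cat", "sat"]
toy_attention = [[[[0.1, 0.6, 0.3],
                   [0.1, 0.5, 0.4],
                   [0.1, 0.7, 0.2]]]]
augmented_view = drop_least_attended(toy_tokens, toy_attention, n_drop=1)
```

In a real pipeline the attention weights would come from the pre-trained language model, and the shortened sentence would serve as a positive view of the original in contrastive training.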

MCML Authors

Bernd Bischl

Prof. Dr.

A1 | Statistical Foundations & Explainability

Statistical Learning and Data Science

Mina Rezaei

Dr.

A1 | Statistical Foundations & Explainability
→ Group Bernd Bischl

Statistical Learning and Data Science

A. Vahidi, L. Wimmer, H. A. Gündüz, B. Bischl, E. Hüllermeier and M. Rezaei.
Diversified Ensemble of Independent Sub-Networks for Robust Self-Supervised Representation Learning.
ECML-PKDD 2024 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Vilnius, Lithuania, Sep 09-13, 2024. DOI

Abstract

Ensembling a neural network is a widely recognized approach to enhance model performance, estimate uncertainty, and improve robustness in deep supervised learning. However, deep ensembles often come with high computational costs and memory demands. In addition, the efficiency of a deep ensemble is related to diversity among the ensemble members, which is challenging for large, over-parameterized deep neural networks. Moreover, ensemble learning has not yet seen such widespread adoption for unsupervised learning and it remains a challenging endeavor for self-supervised or unsupervised representation learning. Motivated by these challenges, we present a novel self-supervised training regime that leverages an ensemble of independent sub-networks, complemented by a new loss function designed to encourage diversity. Our method efficiently builds a sub-model ensemble with high diversity, leading to well-calibrated estimates of model uncertainty, all achieved with minimal computational overhead compared to traditional deep self-supervised ensembles. To evaluate the effectiveness of our approach, we conducted extensive experiments across various tasks, including in-distribution generalization, out-of-distribution detection, dataset corruption, and semi-supervised settings. The results demonstrate that our method significantly improves prediction reliability. Our approach not only achieves excellent accuracy but also enhances calibration, improving on important baseline performance across a wide range of self-supervised architectures in computer vision, natural language processing, and genomics data.

MCML Authors

Lisa Wimmer

A1 | Statistical Foundations & Explainability
→ Group Bernd Bischl

Statistical Learning and Data Science

Hüseyin Anil Gündüz

A1 | Statistical Foundations & Explainability
→ Group Bernd Bischl

* Former Member

Bernd Bischl

Prof. Dr.

A1 | Statistical Foundations & Explainability

Statistical Learning and Data Science

Eyke Hüllermeier

Prof. Dr.

A3 | Computational Models

Artificial Intelligence and Machine Learning

Mina Rezaei

Dr.

A1 | Statistical Foundations & Explainability
→ Group Bernd Bischl

Statistical Learning and Data Science

Workshops (2 papers)

M. Muschalik, F. Fumagalli, B. Hammer and E. Hüllermeier.
Explaining Change in Models and Data with Global Feature Importance and Effects.
TempXAI @ECML-PKDD 2024 - Tutorial-Workshop Explainable AI for Time Series and Data Streams at European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2024). Vilnius, Lithuania, Sep 09-13, 2024. PDF

Abstract

In dynamic machine learning environments, where data streams continuously evolve, traditional explanation methods struggle to remain faithful to the underlying model or data distribution. Therefore, this work presents a unified framework for efficiently computing incremental model-agnostic global explanations tailored for time-dependent models. By extending static model-agnostic methods such as Permutation Feature Importance, SAGE, and Partial Dependence Plots into the online learning context, the proposed framework enables the continuous updating of explanations as new data becomes available. These incremental variants ensure that global explanations remain relevant while minimizing computational overhead. The framework also addresses key challenges related to data distribution maintenance and perturbation generation in online learning, offering time and memory efficient solutions like geometric reservoir-based sampling for data replacement.
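As background for the reservoir-based data maintenance mentioned above, here is the classic uniform reservoir sampling scheme (Algorithm R). Note that this is the standard uniform variant, swapped in for illustration; it is not the geometric reservoir sampling named in the abstract, and the function name and seed handling are assumptions for this example.

```python
import random


def reservoir_sample(stream, k, seed=0):
    """Keep a uniform random sample of size k from a stream of unknown length."""
    rng = random.Random(seed)
    reservoir = []
    for seen, item in enumerate(stream, start=1):
        if len(reservoir) < k:
            # Fill phase: store the first k items directly
            reservoir.append(item)
        else:
            # Replacement phase: item number `seen` is kept with probability k / seen
            slot = rng.randrange(seen)
            if slot < k:
                reservoir[slot] = item
    return reservoir


sample = reservoir_sample(range(1000), k=10)
```

The reservoir uses constant memory regardless of stream length, which is why schemes of this family suit online explanation methods that need background data for perturbations.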

MCML Authors

Maximilian Muschalik

A3 | Computational Models
→ Group Eyke Hüllermeier

Artificial Intelligence and Machine Learning

Eyke Hüllermeier

Prof. Dr.

A3 | Computational Models

Artificial Intelligence and Machine Learning

U. Schlegel, G. M. Tavares and T. Seidl.
Towards Explainable Deep Clustering for Time Series Data.
TempXAI @ECML-PKDD 2025 - Workshop Explainable AI for Time Series and Data Streams at European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2025). Porto, Portugal, Sep 15-19, 2025. To be published.

Abstract

Deep clustering uncovers hidden patterns and groups in complex time series data, yet its opaque decision-making limits use in safety-critical settings. This survey offers a structured overview of explainable deep clustering for time series, collecting current methods and their real-world applications. We thoroughly discuss and compare peer-reviewed and preprint papers through application domains across healthcare, finance, IoT, and climate science. Our analysis reveals that most work relies on autoencoder and attention architectures, with limited support for streaming, irregularly sampled, or privacy-preserved series, and interpretability is still primarily treated as an add-on. To push the field forward, we outline six research opportunities: (1) combining complex networks with built-in interpretability; (2) setting up clear, faithfulness-focused evaluation metrics for unsupervised explanations; (3) building explainers that adapt to live data streams; (4) crafting explanations tailored to specific domains; (5) adding human-in-the-loop methods that refine clusters and explanations together; and (6) improving our understanding of how time series clustering models work internally. By making interpretability a primary design goal rather than an afterthought, we propose the groundwork for the next generation of trustworthy deep clustering time series analytics.

MCML Authors