DCMatch - Identify Matching Architectures in Deep Clustering Through Meta-Learning

MCML Authors

Mamdouh Aljoud

→ Group Thomas Seidl
Database Systems, Data Mining and AI

Gabriel Marques Tavares

Dr.

→ Group Thomas Seidl
Database Systems, Data Mining and AI

Thomas Seidl

Prof. Dr.

Director

Database Systems, Data Mining and AI

Abstract

The effectiveness of deep clustering algorithms like Deep Embedded Clustering (DEC) is heavily influenced by the architecture of the neural network employed. However, selecting an optimal architecture is challenging due to the absence of labels in clustering tasks, which makes traditional Neural Architecture Search (NAS) methods unsuitable. To address this, we propose a novel dataset characterization method specifically tailored for image datasets, combining deep-learning-based and statistical feature extraction techniques. By utilizing features extracted from a small subset of images, our method effectively captures both high-level semantic and low-level statistical properties of the data. These dataset characteristics are then employed in a meta-learning framework to recommend autoencoder architectures likely to outperform default configurations. Extensive experiments on 20 image datasets validate the robustness of our approach, achieving improved clustering performance on 16 datasets compared to the baseline configuration.
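As a rough illustration of the idea described in the abstract, the sketch below characterizes an image dataset from a small sample and looks up a recommendation from previously evaluated datasets. It is not the authors' implementation: it covers only simple statistical features (the deep-learning-based features are omitted), it swaps in a plain 1-nearest-neighbour lookup as a stand-in meta-learner, and the function names, the synthetic data, and the architecture labels (`arch_small`, `arch_default`, `arch_wide`) are all hypothetical.

```python
import numpy as np


def statistical_meta_features(images: np.ndarray) -> np.ndarray:
    """Summarise a sample of grayscale images, shape (n, h, w), as a 4-vector."""
    flat = images.reshape(len(images), -1).astype(float)
    per_image_mean = flat.mean(axis=1)
    per_image_std = flat.std(axis=1)
    return np.array([
        per_image_mean.mean(),    # overall brightness
        per_image_mean.std(),     # brightness spread across images
        per_image_std.mean(),     # typical within-image contrast
        flat.var(axis=0).mean(),  # average per-pixel variance across images
    ])


def recommend_architecture(query, known_features, known_best):
    """Return the label stored for the known dataset nearest to `query`."""
    mu = known_features.mean(axis=0)
    sigma = known_features.std(axis=0) + 1e-12  # avoid division by zero
    z_known = (known_features - mu) / sigma
    z_query = (query - mu) / sigma
    distances = np.linalg.norm(z_known - z_query, axis=1)
    return known_best[int(np.argmin(distances))]


# Synthetic stand-ins for previously evaluated datasets (not real benchmarks).
rng = np.random.default_rng(0)
pixel_ranges = [(0.0, 0.3), (0.3, 0.7), (0.7, 1.0)]
known_features = np.stack([
    statistical_meta_features(rng.uniform(lo, hi, size=(200, 8, 8)))
    for lo, hi in pixel_ranges
])
known_best = ["arch_small", "arch_default", "arch_wide"]  # hypothetical labels

# A new unlabeled dataset, characterised from a small subset of its images.
subset = rng.uniform(0.7, 1.0, size=(50, 8, 8))
recommended = recommend_architecture(
    statistical_meta_features(subset), known_features, known_best
)
print(recommended)
```

The lookup needs no labels for the new dataset, only its meta-features, which is the property that makes a meta-learning recommendation usable in a clustering setting. Standardizing the meta-features before measuring distance keeps any single feature from dominating because of its scale.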

inproceedings ATL+25

PAKDD 2025

29th Pacific-Asia Conference on Knowledge Discovery and Data Mining. Sydney, Australia, Jun 10-13, 2025.

Authors

M. Aljoud • G. M. Tavares • C. Leiber • T. Seidl

Links

DOI GitHub

Research Area

A3 | Computational Models

BibTeXKey: ATL+25
