The effectiveness of deep clustering algorithms such as Deep Embedded Clustering (DEC) depends heavily on the architecture of the neural network employed. Selecting an optimal architecture is challenging, however, because clustering tasks have no labels, which makes traditional Neural Architecture Search (NAS) methods unsuitable. To address this, we propose a novel dataset characterization method tailored to image datasets that combines deep-learning-based and statistical feature extraction techniques. Using features extracted from a small subset of images, our method captures both high-level semantic and low-level statistical properties of the data. These dataset characteristics are then used in a meta-learning framework to recommend autoencoder architectures that are likely to outperform default configurations. Extensive experiments on 20 image datasets validate the robustness of our approach: the recommended architectures improve clustering performance over the baseline configuration on 16 of the 20 datasets.
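As a rough illustration of the pipeline described above, the following sketch characterizes a dataset from a small random subset of images and looks up an architecture from previously evaluated datasets. It is a minimal sketch under stated assumptions, not the authors' implementation: the four statistical features, the nearest-neighbour lookup standing in for the meta-learning framework, the function names, and the layer-width tuples are all hypothetical. The deep-learning-based features are omitted; in a fuller version they would presumably be embeddings from a pretrained network concatenated onto the same vector.

```python
import numpy as np


def statistical_meta_features(images: np.ndarray) -> np.ndarray:
    """Summarize a batch of images (n, h, w[, c]) with values in [0, 1].

    The four statistics below are illustrative choices only.
    """
    flat = images.reshape(len(images), -1).astype(np.float64)
    per_image_mean = flat.mean(axis=1)  # brightness of each image
    per_image_std = flat.std(axis=1)    # contrast of each image
    return np.array([
        per_image_mean.mean(),
        per_image_mean.std(),
        per_image_std.mean(),
        per_image_std.std(),
    ])


def characterize(images: np.ndarray, n_samples: int = 64, seed: int = 0) -> np.ndarray:
    """Compute meta-features from a small random subset of the dataset."""
    rng = np.random.default_rng(seed)
    size = min(n_samples, len(images))
    idx = rng.choice(len(images), size=size, replace=False)
    return statistical_meta_features(images[idx])


def recommend_architecture(query, known_meta_features, known_best_architectures):
    """Return the architecture that worked best on the most similar known dataset.

    Assumption: plain Euclidean distance on unscaled features, which is only
    reasonable here because all four features share the [0, 1] value range.
    """
    dists = np.linalg.norm(known_meta_features - query, axis=1)
    return known_best_architectures[int(np.argmin(dists))]


# Toy meta-knowledge base: two synthetic "datasets" with hypothetical
# best-performing encoder layer widths attached to each.
rng = np.random.default_rng(1)
dark = rng.uniform(0.0, 0.2, size=(100, 8, 8))
bright = rng.uniform(0.8, 1.0, size=(100, 8, 8))
base = np.stack([characterize(dark), characterize(bright)])
archs = [(500, 500, 2000, 10), (256, 128, 10)]

# A new, mostly dark dataset should be matched to the first entry.
query = characterize(rng.uniform(0.0, 0.25, size=(50, 8, 8)))
recommended = recommend_architecture(query, base, archs)
```

Because no labels are involved at any point, the lookup can run before clustering starts; only the knowledge base needs clustering results from earlier datasets.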
inproceedings
BibTeXKey: ATL+25