Home  | Publications | BDH+23

Connecting the Dots — Density-Connectivity Distance Unifies DBSCAN, K-Center and Spectral Clustering

MCML Authors

Abstract

Despite the popularity of density-based clustering, its procedural definition makes it difficult to analyze compared to clustering methods that minimize a loss function. In this paper, we reformulate DBSCAN through a clean objective function by introducing the density-connectivity distance (dc-dist), which captures the essence of density-based clusters by endowing the minimax distance with the concept of density. This novel ultrametric allows us to show that DBSCAN, k-center, and spectral clustering are equivalent in the space given by the dc-dist, despite these algorithms being perceived as fundamentally different in their respective literatures. We also verify that finding the pairwise dc-dists gives DBSCAN clusterings across all epsilon-values, simplifying the problem of parameterizing density-based clustering. We conclude by thoroughly analyzing density-connectivity and its properties -- a task that has been elusive thus far in the literature due to the lack of formal tools.

inproceedings BDH+23


KDD 2023

29th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Long Beach, CA, USA, Aug 06-10, 2023.
Conference logo
A* Conference

Authors

A. Beer • A. Draganov • E. Hohma • P. Jahn • C. M. M. Frey • I. Assent

Links

DOI GitHub

Research Area

 A3 | Computational Models

BibTeXKey: BDH+23

Back to Top