Home  | Publications | BDH+23

Connecting the Dots — Density-Connectivity Distance Unifies DBSCAN, K-Center and Spectral Clustering

MCML Authors

Abstract

Despite the popularity of density-based clustering, its procedural definition makes it difficult to analyze compared to clustering methods that minimize a loss function. In this paper, we reformulate DBSCAN through a clean objective function by introducing the density-connectivity distance (dc-dist), which captures the essence of density-based clusters by endowing the minimax distance with the concept of density. This novel ultrametric allows us to show that DBSCAN, k-center, and spectral clustering are equivalent in the space given by the dc-dist, despite these algorithms being perceived as fundamentally different in their respective literatures. We also verify that finding the pairwise dc-dists gives DBSCAN clusterings across all epsilon-values, simplifying the problem of parameterizing density-based clustering. We conclude by thoroughly analyzing density-connectivity and its properties -- a task that has been elusive thus far in the literature due to the lack of formal tools.

inproceedings


KDD 2023

29th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Long Beach, CA, USA, Aug 06-10, 2023.
Conference logo
A* Conference

Authors

A. Beer • A. Draganov • E. Hohma • P. Jahn • C. M. M. Frey • I. Assent

Links

DOI GitHub

Research Area

 A3 | Computational Models

BibTeXKey: BDH+23

Back to Top