
DROPP: Structure-Aware PCA for Ordered Data

Abstract

Ordered data arises in many areas, e.g., in molecular dynamics and other spatio-temporal trajectories. Although data points that are close in this order are related, common dimensionality reduction techniques capture neither this relation nor the order itself, so the information is lost in the low-dimensional representation. We introduce DROPP, which incorporates order into dimensionality reduction by adapting a Gaussian kernel function across the ordered covariances between data points. We find underlying principal components that are characteristic of the process that generated the data. In extensive experiments, we show DROPP’s advantages over other dimensionality reduction techniques on synthetic as well as real-world data sets from molecular dynamics and climate research: the principal components of different data sets that were generated by the same underlying mechanism are very similar to each other. They can thus be used for dimensionality reduction with low reconstruction errors across a collection of data sets, allowing an explainable visual comparison of different data sets as well as good compression even for unseen data.
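
To make the idea of combining a Gaussian kernel with ordered covariances more concrete, here is a minimal sketch in Python. It is not the authors' implementation and the abstract does not specify the algorithm at this level of detail. The sketch assumes that the order runs over the feature axis, that the kernel depends only on index distance, and that the kernel is applied as an elementwise weight on the sample covariance matrix. The function name `order_aware_pca` and the parameters `n_components` and `sigma` are hypothetical.

```python
import numpy as np


def order_aware_pca(X, n_components=2, sigma=2.0):
    """Hypothetical sketch: PCA on a covariance matrix weighted by a
    Gaussian kernel over the distance between ordered feature indices.

    X has shape (n_samples, n_features); features are assumed ordered.
    This is an illustration of the general idea, not DROPP itself.
    """
    # Center the data and compute the sample covariance between features.
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (X.shape[0] - 1)

    # Gaussian kernel on index distance |i - j|: features that are close
    # in the order keep their covariance, distant ones are damped.
    idx = np.arange(X.shape[1])
    dist = np.abs(idx[:, None] - idx[None, :])
    kernel = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))

    # Elementwise product of two symmetric positive semidefinite matrices
    # is again positive semidefinite (Schur product theorem).
    weighted = cov * kernel

    # Eigendecomposition of the symmetric matrix; keep the largest modes.
    eigvals, eigvecs = np.linalg.eigh(weighted)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order], eigvals[order]


# Usage on random data: project onto the components and reconstruct.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
components, variances = order_aware_pca(X, n_components=3, sigma=1.5)
mean = X.mean(axis=0)
Z = (X - mean) @ components            # low-dimensional representation
X_hat = Z @ components.T + mean        # reconstruction in original space
```

Because the components are computed from a covariance structure rather than from individual samples, the same `components` matrix could in principle be applied to another data set with the same feature order, which is the kind of reuse the abstract describes for unseen data.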

inproceedings


ICDE 2024

40th IEEE International Conference on Data Engineering. Utrecht, Netherlands, May 13-17, 2024.
A* Conference

Authors

A. Beer • O. Palotás • A. Maldonado • A. Draganov • I. Assent

Research Area

 A3 | Computational Models

BibTeXKey: BPM+24
