Event data often exhibit complex and diverse patterns: traces generated by the same process can vary significantly because of the large number of process execution variants. Trace clustering techniques help analyze event logs by partitioning event data into smaller groups based on similarity. However, existing clustering approaches face significant challenges in representational quality, that is, in how traces are encoded. Most techniques transform traces into a vector space, which typically discards essential sequencing information because the order in which activities occur is ignored. Common approaches count activity occurrences and therefore treat traces as unordered collections of events. To address these issues, we propose k-traceoids, a structure-preserving trace clustering framework inspired by the k-means clustering method that operates directly on traces rather than on vector-based representations. Our results demonstrate the effectiveness of k-traceoids in identifying meaningful clusters and show that k-traceoids groups together traces that vectorial approaches would not recognize as similar because they cannot capture activity order.
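The loss of ordering under count-based encodings can be illustrated with a minimal sketch. It is not the k-traceoids method itself: the activity names, the helper functions `activity_counts` and `edit_distance`, and the choice of Levenshtein distance as a stand-in for an order-aware comparison are all assumptions made for illustration.

```python
from collections import Counter


def activity_counts(trace):
    """Bag-of-activities encoding: map each activity to its frequency."""
    return Counter(trace)


def edit_distance(a, b):
    """Levenshtein distance between two activity sequences (illustrative choice)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # substitute x with y (free if equal)
            ))
        prev = curr
    return prev[-1]


# Hypothetical traces: same activities, opposite order.
trace_a = ["register", "check", "approve", "pay"]
trace_b = ["pay", "approve", "check", "register"]

# The count-based encoding cannot tell the two traces apart.
same_vector = activity_counts(trace_a) == activity_counts(trace_b)

# A sequence-aware comparison reports a non-zero difference.
order_gap = edit_distance(trace_a, trace_b)

print(same_vector, order_gap)
```

Under the count-based encoding the two traces collapse to the same vector, whereas a comparison that works on the sequences themselves keeps them distinguishable. This is the kind of information a structure-preserving approach can exploit.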
inproceedings
BibTeXKey: KTO+25