
Cross-Modal Common Representation Learning With Triplet Loss Functions

MCML Authors


David Rügamer

Prof. Dr.

Principal Investigator


Bernd Bischl

Prof. Dr.

Director

Abstract

Common representation learning (CRL) learns a shared embedding between two or more modalities to improve performance on a given task over using only one modality. CRL from different data types such as images and time-series data (e.g., audio or text data) requires a deep metric learning loss that minimizes the distance between the modality embeddings. In this paper, we propose to use the triplet loss, which uses positive and negative identities to create sample pairs with different labels, for CRL between image and time-series modalities. By adapting the triplet loss for CRL, higher accuracy in the main (time-series classification) task can be achieved by exploiting additional information from the auxiliary (image classification) task. Our experiments on synthetic data and handwriting recognition data from sensor-enhanced pens show improved classification accuracy, faster convergence, and better generalizability.
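To illustrate the core idea, here is a minimal sketch of the standard triplet loss as it could be applied across modalities. This is not the authors' implementation; the embeddings, labels, and margin value below are illustrative assumptions.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: pull the positive embedding toward the
    anchor and push the negative embedding at least `margin` farther
    away, i.e. max(d(a, p) - d(a, n) + margin, 0)."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(d_pos - d_neg + margin, 0.0)

# Cross-modal pairing (illustrative): the anchor comes from one modality
# (e.g. a time-series embedding), while the positive and negative come
# from the other modality (e.g. image embeddings) with matching and
# non-matching labels, respectively.
anchor   = np.array([1.0, 0.0])
positive = np.array([1.1, 0.1])   # same label, other modality
negative = np.array([-1.0, 0.5])  # different label, other modality

loss = triplet_loss(anchor, positive, negative)
```

Minimizing this loss over such cross-modal triplets draws embeddings of the same identity together across modalities while separating different identities, which is the mechanism the paper exploits for the shared representation.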



Preprint

Mar. 2022

Authors

F. Ott • D. Rügamer • L. Heublein • B. Bischl • C. Mutschler

Links

DOI

Research Area

 A1 | Statistical Foundations & Explainability

BibTeX Key: ORH+22
