
Enrolment-Based Personalisation for Improving Individual-Level Fairness in Speech Emotion Recognition

MCML Authors

Prof. Dr. Björn Schuller, Principal Investigator

Abstract

The expression of emotion is highly individualistic. However, contemporary speech emotion recognition (SER) systems typically rely on population-level models that adopt a ‘one-size-fits-all’ approach to predicting emotion. Moreover, standard evaluation practices also measure performance at the population level, thus failing to characterise how models work across different speakers. In this contribution, we present a new method for capitalising on individual differences to adapt an SER model to each new speaker using a minimal set of enrolment utterances. In addition, we present novel evaluation schemes for measuring fairness across different speakers. Our findings show that aggregated evaluation metrics may obfuscate fairness issues at the individual level, which are uncovered by our evaluation, and that our proposed method can improve performance in both aggregated and disaggregated terms.
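To make the distinction between aggregated and disaggregated evaluation concrete, the following is a minimal sketch in Python. It is not the paper's evaluation scheme: the metric (plain accuracy), the function names, and the toy labels are all assumptions chosen purely for illustration. It only shows how a single pooled score can hide a speaker for whom the model fails entirely.

```python
from collections import defaultdict


def aggregate_accuracy(y_true, y_pred):
    """Accuracy pooled over all utterances, ignoring who spoke them."""
    correct = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_speaker_accuracy(speakers, y_true, y_pred):
    """Accuracy computed separately for each speaker (hypothetical helper)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for spk, t, p in zip(speakers, y_true, y_pred):
        totals[spk] += 1
        hits[spk] += int(t == p)
    return {spk: hits[spk] / totals[spk] for spk in totals}


# Toy data, invented for illustration: speaker s1 contributes 8 utterances,
# speaker s2 only 2.
speakers = ["s1"] * 8 + ["s2"] * 2
y_true = ["happy", "sad"] * 5
# All of s1's utterances are predicted correctly, both of s2's are wrong.
y_pred = ["happy", "sad"] * 4 + ["sad", "happy"]

overall = aggregate_accuracy(y_true, y_pred)
by_speaker = per_speaker_accuracy(speakers, y_true, y_pred)
worst = min(by_speaker.values())

print(f"aggregate accuracy: {overall:.2f}")
for spk, acc in sorted(by_speaker.items()):
    print(f"{spk}: {acc:.2f}")
print(f"worst-speaker accuracy: {worst:.2f}")
```

In this toy case the pooled accuracy is 0.80, while the per-speaker view shows 1.00 for s1 and 0.00 for s2. Reporting the per-speaker distribution, or its minimum, is one simple way to surface such gaps; the paper's own fairness measures may differ.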

inproceedings


INTERSPEECH 2024

25th Annual Conference of the International Speech Communication Association. Kos Island, Greece, Sep 01-05, 2024.
A Conference

Authors

A. Triantafyllopoulos, B. W. Schuller

Research Area

 B3 | Multimodal Perception

BibTeXKey: TS24a
