A Non-Intrusive Speech Quality Evaluation Framework for Hearing Aids Based on Speech Label Assistance and Multi-Task Learning Strategy

MCML Authors

Björn Schuller

Prof. Dr.

Principal Investigator

Abstract

Accurate evaluation of hearing aid speech quality is crucial for optimizing the auditory experience of hearing-impaired people. Existing methods rely on clean reference signals and do not account for differences in the Prescription Formula (PF). To address these shortcomings, this paper proposes a non-intrusive speech quality evaluation framework based on speech label assistance and a multi-task learning strategy, termed MTSE-LA. The framework mitigates evaluation bias caused by PF variations and improves the prediction accuracy of speech quality metrics. MTSE-LA consists of three core modules: a feature extraction module, a label classification module, and a score prediction module. The feature extraction module extracts deep frame-level features from speech using a joint Convolutional Neural Network and Bidirectional Long Short-Term Memory network (CNN-BiLSTM) model. The label classification module, acting as a pre-trained network, identifies PF labels and embeds them into the extracted frame-level features, which are then fed into the speech quality prediction branch of the multi-task score prediction module. To ensure synergy in the multi-task learning process, the output vectors of the modulation filter bank are fed into the speech intelligibility prediction branch, which enables effective prediction of speech intelligibility. Moreover, each prediction branch uses a multi-head self-attention mechanism to capture contextual information and model the importance of speech frames. Experimental results demonstrate that MTSE-LA considerably improves the prediction accuracy of the Hearing Aid Speech Quality Index (HASQI) under multiple PF configurations and different degrees of hearing loss. Compared with existing state-of-the-art methods, the proposed framework exhibits higher correlation and fitting accuracy, establishing its reliability and superiority in the field of non-intrusive speech quality evaluation for hearing aids.
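
The abstract reports results in terms of correlation and fitting accuracy but does not name the specific measures. As a minimal sketch, assuming Pearson correlation and root-mean-square error (RMSE) between predicted and reference HASQI scores, such a comparison could be computed as follows. The function names and the score values are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root-mean-square error between two equal-length sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical scores for illustration only (assumption, not data from the paper).
reference_hasqi = [0.20, 0.45, 0.60, 0.85]
predicted_hasqi = [0.25, 0.40, 0.65, 0.80]

print("correlation:", pearson(reference_hasqi, predicted_hasqi))
print("RMSE:", rmse(reference_hasqi, predicted_hasqi))
```

A higher correlation together with a lower RMSE would indicate that predictions track the reference scores more closely, which is one common way to read "correlation and fitting accuracy".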

article


IEEE Transactions on Audio, Speech and Language Processing

Early Access. Jul. 2025.
Top Journal

Authors

Y. Yang • R. Liang • Y. Ni • Y. Xie • C. Zou • B. W. Schuller

Links

DOI

Research Area

 B3 | Multimodal Perception

BibTeXKey: YLN+25