
A Unified Data Augmentation Framework for Low-Resource Multi-Domain Dialogue Generation

Abstract

Current state-of-the-art dialogue systems rely heavily on extensive training datasets. However, challenges arise in domains where domain-specific training data are insufficient or entirely absent. To address this, we propose a novel data Augmentation framework for Multi-Domain Dialogue Generation, referred to as AMDG. The AMDG framework consists of a data augmentation process and a two-stage training approach: domain-agnostic training followed by domain adaptation training. We posit that domain corpora are a blend of domain-agnostic and domain-specific features, with certain representation patterns shared among diverse domains. Domain-agnostic training aims to enable models to learn these common expressive patterns. To construct domain-agnostic dialogue corpora, we employ a de-domaining data processing technique that removes domain-specific features. By mitigating the effects of domain-specific features, a model trained on the de-domained corpora can effectively learn expression patterns common to different domains. Subsequently, we adapt the learned domain-agnostic features to the target domain through domain adaptation training. We conduct experiments on Chinese dialogue datasets from five different domains and show that AMDG achieves superior performance compared to both direct training on the target domain corpus and collective training on all five domain corpora. Our work underscores AMDG as a viable alternative solution for low-resource multi-domain dialogue generation.

inproceedings


ECML-PKDD 2024

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Vilnius, Lithuania, Sep 09-13, 2024.
A Conference

Authors

Y. Liu • E. Nie • S. Feng • Z. Hua • Z. Ding • D. Wang • Y. Zhang • H. Schütze

Research Areas

 A3 | Computational Models

 B2 | Natural Language Processing

BibTeXKey: LNF+24
