05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Permanent URI for this collectionhttps://elib.uni-stuttgart.de/handle/11682/6

Browse

Search Results

Now showing 1 - 1 of 1
  • Thumbnail Image
    ItemOpen Access
    Improving speech emotion recognition via generative adversarial networks
    (2019) Bao, Fang
    Speech emotion recognition (SER) is a significant research topic in human-computer interaction. One of the major problems in SER is data scarcity. This master’s thesis aims to investigate a novel data augmentation method based on cycle consistent adversarial networks (CycleGANs). It transfers feature vectors extracted from a unlabeled speech corpus into the domains of target emotions. Furthermore, the CycleGAN framework is extended with a classification loss which improves the discriminability between the generated data. The quality of the synthetic data is evaluated on both within-corpus and cross-corpus experiments of SER. Both show an improvement of classification performance with augmented data. Additionally, two meaningful problems met in our training process are discussed and analyzed.