05 Fakultät Informatik, Elektrotechnik und Informationstechnik
Permanent URI for this collectionhttps://elib.uni-stuttgart.de/handle/11682/6
Browse
1 results
Search Results
Item Open Access Improving speech emotion recognition via generative adversarial networks(2019) Bao, FangSpeech emotion recognition (SER) is a significant research topic in human-computer interaction. One of the major problems in SER is data scarcity. This master’s thesis aims to investigate a novel data augmentation method based on cycle consistent adversarial networks (CycleGANs). It transfers feature vectors extracted from a unlabeled speech corpus into the domains of target emotions. Furthermore, the CycleGAN framework is extended with a classification loss which improves the discriminability between the generated data. The quality of the synthetic data is evaluated on both within-corpus and cross-corpus experiments of SER. Both show an improvement of classification performance with augmented data. Additionally, two meaningful problems met in our training process are discussed and analyzed.