Please use this identifier to cite or link to this resource:
http://dx.doi.org/10.18419/opus-14425
Author(s): | Rassem, Malak |
Title: | Neural machine translation of dialectal-dialectal Arabic |
Date of publication: | 2023 |
Document type: | Thesis (Master) |
Pages: | 92 |
URI: | http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-144449 http://elib.uni-stuttgart.de/handle/11682/14444 http://dx.doi.org/10.18419/opus-14425 |
Abstract: | This thesis addresses the challenging task of neural machine translation (NMT) between various Arabic dialects, an area that has received limited attention in the field of natural language processing. The primary aim is to explore and compare different approaches to dialect-to-dialect translation, including models trained from scratch, fine-tuned pre-trained monolingual models, and fine-tuned pre-trained multilingual models. A comprehensive analysis was conducted to evaluate the effectiveness of an "Everything-to-Everything" model compared to models specifically trained for each translation direction. Additionally, the impact of systematically introducing additional data during the training phase, such as various dialects and Modern Standard Arabic (MSA), was examined. The performance of these models was evaluated using a range of automated metrics (such as BLEU and chrF++) and human evaluation of translation quality for a single target dialect. The study also investigates the correlation between machine translation performance and the mutual intelligibility among Arabic dialects based on a range of linguistic distance measures. The research reveals that fine-tuning a pre-trained monolingual model, AraT5, yields superior performance compared to the other approaches, challenging common beliefs about multilingual models in low-resource scenarios. Furthermore, single-direction models were found to outperform both the everything-to-everything model and the models that incorporated additional data. Moreover, lexical overlap at the type level correlated more strongly with the translation quality scores than the other distance measures. Through human evaluation, the study validates the effectiveness of the developed models. The findings contribute significant insights into the intricacies of NMT between Arabic dialects, providing a foundation for future research in this field. |
Appears in collections: | 05 Fakultät Informatik, Elektrotechnik und Informationstechnik |
Files in this resource:
File | Description | Size | Format | |
---|---|---|---|---|
Thesis_Malak_Rassem.pdf | | 2.39 MB | Adobe PDF | View/Open |
All resources in this repository are protected by copyright.