Please use this identifier to cite or link to this item: http://dx.doi.org/10.18419/opus-13825
Full metadata record
DC Element | Value | Language
dc.contributor.author | Tessadri, Wolfgang | -
dc.date.accessioned | 2023-12-14T09:34:28Z | -
dc.date.available | 2023-12-14T09:34:28Z | -
dc.date.issued | 2023 | de
dc.identifier.other | 1876971134 | -
dc.identifier.uri | http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-138442 | de
dc.identifier.uri | http://elib.uni-stuttgart.de/handle/11682/13844 | -
dc.identifier.uri | http://dx.doi.org/10.18419/opus-13825 | -
dc.description.abstract | With the advent of modern chat applications, an increasing number of German dialect speakers use their dialects for written communication. The DiDi Facebook corpus (Frey et al. 2016) captures this phenomenon for South Tyrolean dialects. While the authors included a dialect/standard variety tag on the posting level, a third of these tags were undefined. By training DeBERTa and XLM-RoBERTa for dialect/standard classification, we reduce these undefined instances by over 75%. We also use XLM-RoBERTa to add explicit variety labels to individual tokens. By performing a linear regression analysis of socio-linguistic variables and a label-derived dialectality metric, we show that the generated labels are highly meaningful. Finally, we describe how the implemented Transformer models can be applied to gather geo-referenced dialect samples on Twitter, and we discuss how this data can enrich future dialectometric research. | en
dc.language.iso | en | de
dc.rights | info:eu-repo/semantics/openAccess | de
dc.subject.ddc | 004 | de
dc.subject.ddc | 400 | de
dc.title | Enhancing a German dialect corpus with neural methods | en
dc.type | masterThesis | de
ubs.fakultaet | Informatik, Elektrotechnik und Informationstechnik | de
ubs.institut | Institut für Maschinelle Sprachverarbeitung | de
ubs.publikation.seiten | 128 | de
ubs.publikation.typ | Abschlussarbeit (Master) | de
Appears in Collections: 05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Files in This Item:
File | Description | Size | Format
MA_thesis_Tessadri.pdf | | 1.96 MB | Adobe PDF


All items in this repository are protected by copyright.