05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6

Search Results

Now showing 1 - 10 of 70
  • Item (Open Access)
    Plug-and-play domain adaptation for neural machine translation
    (2023) Kadiķis, Emīls
    Neural machine translation has emerged as a powerful tool, yet its performance relies heavily on training data. In a fast-changing world, dealing with out-of-domain data remains a challenge, prompting the need for adaptable translation systems. While fine-tuning is a proven and effective adaptation method, it is not always feasible due to data availability, memory, and computational constraints. This thesis introduces a dynamic plug-and-play method inspired by controllable text generation to enhance machine translation across various domains without fine-tuning. This method, called Plug-and-Play Neural Machine Translation (PPNMT), uses a monolingual domain-specific bag-of-words to push the hidden state of the decoder through backpropagation, making the output more in-domain. The method is tested on two types of domains: formality and gender (where the source language does not make a distinction between these aspects, but the target language does), and fine-grained technical domains (which are based more on the topic inherent in the text on both the source and target sides). The method performs reasonably well for adapting the translation to different formality levels and, to a lesser extent, grammatical genders, even with an incredibly simple bag-of-words. However, it struggles with adapting the model to technical domains, and a fine-tuning baseline outperforms the proposed method in all but very low-resource few-shot settings in all domains tried. Despite that, the method shows some interesting behaviour, adapting to the formality on a level that goes beyond just using formal pronouns.
  • Item (Open Access)
    Supervised semantic proximity noise and disagreement detection
    (2024) Choppa, Tejaswi
    The quality and reliability of annotated data are crucial for the development of Machine Learning models. In this work, we particularly focus on word sense annotation in context (a.k.a. Word-in-Context, WiC). WiC datasets in real-world contexts often exhibit significant disagreement. As a result, information is lost when instances are discarded during the creation of the gold label by adjudicating the annotations through majority or median judgment. Recent advancements have sought to address this issue by incorporating disagreement data through novel label aggregation methods (Uma et al., 2022). Modeling this disagreement is important because, in a real-world scenario, we often do not have clean data. We need to predict on samples where high disagreement is expected and which are inherently difficult to categorize. Predicting disagreement can help detect or filter highly complex samples. Through this thesis, we aim to build machine learning models that predict human disagreement in annotated text instances. Moreover, we focus on data with noise instances where annotators cannot confidently assign a label or the data does not fit predefined categories. We aim to measure both disagreement and noise, as they both stem from a common source: ambiguity. By modeling these aspects, we aim to design modeling approaches that predict not only the semantic proximity label but also the annotator disagreement, as well as data noisiness.
  • Item (Open Access)
    Cross-lingual citations in English papers : a large-scale analysis of prevalence, usage, and impact
    (2021) Saier, Tarek; Färber, Michael; Tsereteli, Tornike
    Citation information in scholarly data is an important source of insight into the reception of publications and the scholarly discourse. Outcomes of citation analyses and the applicability of citation-based machine learning approaches heavily depend on the completeness of such data. One particular shortcoming of scholarly data nowadays is that non-English publications are often not included in data sets, or that language metadata is not available. Because of this, citations between publications of differing languages (cross-lingual citations) have only been studied to a very limited degree. In this paper, we present an analysis of cross-lingual citations based on over one million English papers, spanning three scientific disciplines and a time span of three decades. Our investigation covers differences between cited languages and disciplines, trends over time, and the usage characteristics as well as impact of cross-lingual citations. Among our findings are an increasing rate of citations to publications written in Chinese, citations being primarily to local non-English languages, and consistency in citation intent between cross- and monolingual citations. To facilitate further research, we make our collected data and source code publicly available.
  • Item (Open Access)
    More reliable retrieval augmented generation for domain-specific question-answering through domain-infused soft prompts
    (2025) Nassar, Zeina
    Transformer-based large language models (LLMs) have revolutionized the field of artificial intelligence, enabling advancements in various applications due to their exceptional reasoning capabilities. However, these models often face challenges such as hallucination, outdated knowledge, and non-transparent reasoning, which limit their reliability in critical tasks like question answering (QA). QA systems, especially in domain-specific contexts like car manuals, require high accuracy and reliability to ensure user safety. Achieving this is complicated by the need for precise retrieval and interpretation of domain-specific information, which can be hindered by unfamiliar keywords or ambiguous questions. Open-domain QA relies on external knowledge repositories and typically follows a retriever-reader framework to locate and process evidence. In contrast, domain-specific QA often lacks sufficient gold-standard datasets, making it essential to explore techniques like retrieval-augmented generation (RAG), which combines retrieved context with the question as input to LLMs. While RAG improves grounding, hallucinations still occur when models rely on pre-trained knowledge over retrieved evidence. This thesis investigates the role of prompting techniques in improving faithfulness and reducing hallucinations in domain-specific QA. By testing domain-specific and domain-agnostic discrete prompts as well as soft prompting methods, this work aims to identify strategies for generating more accurate and grounded responses. The study addresses key research questions on the effectiveness of domain-specific information in prompts, the best way to incorporate such information, and whether dynamic soft-prompts outperform static ones in domain-specific QA scenarios. The findings aim to contribute to building more reliable and factual QA systems.
  • Item (Open Access)
    Exploring the effects of enriched English language input on language model efficiency
    (2024) Zeller, Tom
    Recent years have seen the advent of large-scale language modeling as exemplified by transformer-based models like GPT or variants of the BERT architecture. These models, which are trained on massive datasets and using compute unattainable by actors that are not of the scale of the biggest tech companies, have shown impressive feats of syntactic and semantic understanding. Naturally, interest has risen in making these models more efficient, in terms of compute as well as data requirements. Research in this area can be seen as primarily motivated by two factors: reducing the barrier for smaller actors like research institutes or end consumers to train and execute state-of-the-art models, as well as reducing the carbon footprint of these models. To achieve this goal, model compression techniques like quantization, pruning or distillation are utilized. This work aims to explore a different, less model-centric and more data-centric approach: Modifying the training and inference data, by enriching it with syntactic and semantic information. To this end, a lexical resource is created which maps English words to a form where individual characters represent values of a range of semantic and syntactic features, providing lexical information that is accessible to all model types that operate on tokens at the sub-word or character-level. Different features and methods of representation are discussed, and their effect on model performance is evaluated by pretraining a small GPT-family model and fine-tuning on downstream tasks of the SuperGLUE benchmark. Given a fixed amount of data and compute, the experiments show a performance advantage for a character-level model trained using the enriched data.
  • Item (Open Access)
    Comparison of distributional and visual nearest neighbors
    (2025) Naber, Sven
    This thesis investigates how semantic concepts are represented across textual and visual embedding spaces, focusing on the abstract-concrete continuum. Using 5,448 English nouns and their embeddings from both distributional language models (e.g., Word2Vec, GloVe) and vision models (e.g., ViT, DINOv2, CLIP), it compares neighborhood structure via a normalized alignment score (NAS). Results show that alignment is primarily driven by input modality rather than model architecture, with strong local overlap for concrete concepts and more diffuse agreement for abstract ones. Mean aggregation of image embeddings improves visual consistency but cannot fully bridge modality-specific limitations. The findings provide a starting point for further exploration of semantic spaces.
  • Item (Open Access)
    Modeling the evaluative nature of German personal name compounds
    (2023) Deeg, Tana
    German personal name compounds such as Villen-Spahn (‘villa-Spahn’), Gold-Rosi (‘gold-Rosi’) and Folter-Bush (‘torture-Bush’) are a rather infrequent phenomenon in the German language. They have the structure of determinative compounds and serve as a nickname for a usually well-known person. According to Belosevic (2022), personal name compounds are mostly evaluative, i.e. they evaluate the person behind the name in a positive or negative way. Further research on an evaluation across different groups of compounds (politics, show business, sports) is proposed. This work will investigate the evaluative nature of 413 German personal name compounds that mostly have the structure of noun as modifier and last name as head. The 131 corresponding full names will be considered as well, e.g. Jens Spahn would correspond to Villen-Spahn. The context data of compounds and names was collected from Twitter and the Leipzig Corpora Collection. The valence value of these context words, based on a valence database of Köper and Schulte im Walde (2016), will be used to investigate the evaluative nature of compounds in comparison to their names. Furthermore, the relation to and function of the modifier will be examined. The valence values will then be used to verify whether there are noticeable differences between the groups of compounds. Afterwards, a linear regression will be implemented to predict a ‘delta’ value: the difference between name valence and compound valence. Several predictor variables such as name valence, compound valence, modifier valence, age, gender, political party and nationality will be used. The results reveal that compounds are both positively and negatively evaluative in comparison to their full name while highlighting the reason why they were created. Compound valence and modifier valence are only partially correlated due to modifiers being involved rather accidentally or interpreted ironically. Lastly, noticeable differences between the groups can be observed, with politicians being the most negative group regarding their valence values. Conducting the linear regression with different combinations of predictor variables shows that compound valence is a highly significant predictor. Also, other variables such as modifier valence, age or political party are able to compose models that predict the delta value very well.
  • Item (Open Access)
    Cross-lingual frame comparability : computational and linguistic perspectives
    (2023) Sikos, Jennifer; Padó, Sebastian (Prof. Dr.)
    Frames are descriptions of commonplace scenarios or events. Because they describe everyday scenes, such as buying or eating, it seems reasonable to assume that many frames in one language would carry over directly to other languages. However, the way a scene is realized can be highly specific to a culture; how well (and how many) frames actually apply across languages is still an open research question. This thesis concerns cross-lingual frame comparability: the degree to which a frame can be transferred from one language to another. It addresses several aspects of frame comparability: what frame comparability is; how a computational system can measure cross-lingual frame comparability; and how frame comparability affects cross-lingual models of frames.
  • Item (Open Access)
    An attribution method for classification tasks in Siamese models
    (2024) Liu, Mindong
    Explaining the contribution of individual tokens to the result of a two-sentence classification task is a challenging problem in natural language processing (NLP). This thesis studies the use of Integrated Jacobians (IJ) in interpreting multi-class classification models with Siamese models, particularly its application in Natural Language Inference (NLI). The NLI task requires models to understand the logical relationships between two sentences, posing challenges for model interpretability. To address the fact that the original Siamese model was primarily designed for regression tasks, the thesis first extends Siamese models to classification tasks with bilinear similarity while ensuring that the IJ methods can be utilized. It then adapts two forms of the IJ methods, exact IJ and approximate IJ, to work with the newly extended Siamese models. To validate the effectiveness of the extended Siamese models using the IJ methods, the thesis conducts experiments on the AllNLI dataset under the Sentence-BERT framework. The thesis employs four different model configurations and applies both IJ methods to these models. The experimental results demonstrate that the IJ methods effectively provide explanations. Finally, the thesis examines the consistency between the explanations provided by the IJ methods and semantic relationships at the lexical and span levels using the WordNet and SpanEX datasets. In the analysis, the IJ methods show that the models capture semantic relationships between words and spans, and that there is a correlation between these relationships and the model’s predictions. This finding supports the use of the IJ methods to explain the decisions of NLP models.