05 Fakultät Informatik, Elektrotechnik und Informationstechnik
Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6
Search Results
Item Open Access
KGGLDM : Knowledge Graph Guided Diffusion Models for advanced learning (2024) Gupta, Akshat
This thesis explores a novel approach that bridges diffusion models and knowledge graphs, unveiling a potentially groundbreaking direction that serves as the central theme of this work. We propose incorporating knowledge graph guidance into latent diffusion models (LDMs) to enable more precise control over sample generation using domain-specific conceptual knowledge.

Item Open Access
Exploring the effects of enriched English language input on language model efficiency (2024) Zeller, Tom
Recent years have seen the advent of large-scale language modeling, exemplified by transformer-based models such as GPT or variants of the BERT architecture. These models, trained on massive datasets using compute unattainable for actors below the scale of the biggest tech companies, have shown impressive feats of syntactic and semantic understanding. Naturally, interest has risen in making these models more efficient, in terms of both compute and data requirements. Research in this area is primarily motivated by two factors: lowering the barrier for smaller actors such as research institutes or end consumers to train and run state-of-the-art models, and reducing the carbon footprint of these models. To this end, model compression techniques such as quantization, pruning, or distillation are commonly used. This work explores a different, less model-centric and more data-centric approach: modifying the training and inference data by enriching it with syntactic and semantic information. A lexical resource is created that maps English words to a form in which individual characters represent values of a range of semantic and syntactic features, providing lexical information accessible to any model type that operates on tokens at the sub-word or character level. Different features and methods of representation are discussed, and their effect on model performance is evaluated by pretraining a small GPT-family model and fine-tuning it on downstream tasks of the SuperGLUE benchmark. Given a fixed amount of data and compute, the experiments show a performance advantage for a character-level model trained on the enriched data.
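As an illustration of this data-centric enrichment idea, the following is a minimal sketch. The lexicon entries and feature inventory here are hypothetical placeholders, not the resource built in the thesis:

```python
# Minimal sketch of character-level lexical enrichment (hypothetical
# lexicon and feature inventory; not the resource built in the thesis).

# Each entry maps a word to single-character feature values,
# e.g. part of speech (N/V) and animacy (+/-).
LEXICON = {
    "dog":   {"pos": "N", "animate": "+"},
    "stone": {"pos": "N", "animate": "-"},
    "runs":  {"pos": "V", "animate": "+"},
}

def enrich(text: str) -> str:
    """Append a compact feature code to every word found in the lexicon."""
    out = []
    for word in text.split():
        feats = LEXICON.get(word.lower())
        if feats is None:
            out.append(word)  # unknown words pass through unchanged
        else:
            out.append(f"{word}|{feats['pos']}{feats['animate']}")
    return " ".join(out)

print(enrich("The dog runs past a stone"))
# -> "The dog|N+ runs|V+ past a stone|N-"
```

A character-level model sees these feature characters as ordinary input symbols, so the enrichment requires no architectural change.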
Item Open Access
RAGAR, your falsehood RADAR : RAG-augmented reasoning for political fact-checking using multimodal large language models (2024) Abdul Khaliq, Mohammed
The escalating challenge of misinformation, particularly in the context of political discourse, necessitates advanced solutions for fact-checking. This thesis introduces approaches that enhance the reliability and efficiency of multimodal fact-checking by integrating large language models (LLMs) with advanced reasoning techniques based on retrieval-augmented generation (RAG). In the digital era, where misinformation spreads rapidly across various media, including text and images, there is a critical need for robust mechanisms capable of evaluating the veracity of political claims. This work proposes two novel methodologies, Chain of RAG (CoRAG) and Tree of RAG (ToRAG), along with hybrid implementations incorporating Chain of Thought and Chain of Verification. These approaches combine RAG techniques and multimodal LLMs with structured reasoning, and are designed to process and assess political claims using both textual and visual information, providing a comprehensive approach to fact-checking. This thesis explores the implementation of these approaches within a multimodal fact-checking pipeline, highlighting their effectiveness in improving the accuracy of veracity predictions and the generation of explanations. By employing multimodal LLMs adept at analyzing text and images, this research advances the capability of automated systems to identify and counter misinformation. The experimental evaluation demonstrates that the proposed RAG-augmented Reasoning (RAGAR) techniques outperform existing methods that rely on sub-question generation, offering a promising solution to the challenges of political fact-checking. This thesis contributes to the fields of computational linguistics and political science by providing an effective approach to combating fake news, thereby enhancing the integrity of political discourse in the digital age.
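To make the chain-style retrieve-and-reason idea concrete, below is a minimal sketch of a loop in the spirit of CoRAG. The `llm` and `search` functions and the stopping criterion are hypothetical stand-ins, not the pipeline implemented in the thesis:

```python
# Sketch of a CoRAG-style loop: ask a follow-up question, retrieve
# evidence, and repeat until the model can issue a verdict.
# `llm` and `search` are hypothetical stand-ins for real components.

def llm(prompt: str) -> str:
    raise NotImplementedError  # e.g. a call to a multimodal LLM

def search(query: str) -> str:
    raise NotImplementedError  # e.g. a web-evidence retriever

def corag_fact_check(claim: str, max_steps: int = 5) -> str:
    evidence = []
    for _ in range(max_steps):
        question = llm(
            f"Claim: {claim}\nEvidence so far: {evidence}\n"
            "Ask the single follow-up question most useful for verification."
        )
        evidence.append((question, search(question)))
        verdict = llm(
            f"Claim: {claim}\nEvidence: {evidence}\n"
            "If the evidence is sufficient, answer SUPPORTED or REFUTED; "
            "otherwise answer CONTINUE."
        )
        if verdict != "CONTINUE":
            return verdict
    return "NOT ENOUGH EVIDENCE"
```

A tree-style variant (in the spirit of ToRAG) would branch on several candidate follow-up questions per step and keep the most informative answer.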
Item Open Access
Semantic agreement and the agreement hierarchy in large language models of Russian (2024) Kuryanov, Ilya
This thesis investigates the phenomenon of mixed agreement in Russian, where certain nouns denoting professions can trigger both syntactic and semantic agreement. We construct challenge sets testing different aspects of this phenomenon for pre-trained masked language models of Russian, and find that all models considered are able to model the syntactic restrictions on mixed agreement and, to varying degrees, the preferences for semantic agreement observed in natural language use. We also find evidence that the models' behavior on these challenge sets is influenced by gender bias associated with the nouns in question, and that the two kinds of agreement are represented differently in the internal structure of the model.
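A single challenge-set item of this kind can be probed with an off-the-shelf masked language model. The sketch below is a simplified illustration, not the thesis's evaluation code; the model choice and example sentence are assumptions, and a real study would use many controlled items:

```python
# Sketch: compare masked-LM scores for syntactic (masculine) vs.
# semantic (feminine) agreement with "врач" ("doctor", grammatically
# masculine, here referring to a woman). Model choice is illustrative.
from transformers import pipeline

fill = pipeline("fill-mask", model="DeepPavlov/rubert-base-cased")

sentence = "Наша врач [MASK] домой."  # "Our (fem.) doctor came home."
# Restrict predictions to the two competing verb forms.
for pred in fill(sentence, targets=["пришел", "пришла"]):
    print(pred["token_str"], round(pred["score"], 4))
```

Comparing the two scores across many such minimal pairs gives a measure of the model's preference for semantic over syntactic agreement.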
Item Open Access
Exploring retrieval-augmented language modeling for material prediction of vehicle components (2024) Wagner, Frederik
Recent advances in natural language processing (NLP), particularly in large language models (LLMs) such as ChatGPT, demonstrate their potential for a wide variety of tasks in specialized domains. In the automotive industry, for example, they could be used to support vehicle repair. This thesis addresses the problem of predicting suitable materials for vehicle components, such as brake discs. The aim is to determine whether LLMs can draw on both general and domain-specific knowledge to make accurate predictions about component materials without extensive fine-tuning. This is achieved through retrieval-augmented generation (RAG), in which relevant information is retrieved from external sources and used to enrich the LLM's input. Three approaches are compared: a standard LLM, a simple RAG approach, and an iterative RAG method called Chain-of-Verification (CoVe). Since no gold-standard dataset exists, a custom annotation tool is also developed to facilitate a human evaluation study. The results show that LLMs perform well at material prediction, and although neither RAG approach significantly improves prediction quality, neither degrades it. This work concludes that LLMs, with or without retrieval augmentation, offer a promising solution for material prediction of vehicle components, even though challenges remain in evaluation, hyperparameter optimization, and data retrieval.

Item Open Access
Controllable text-to-speech system : speaking style control using hierarchical variational autoencoder (2024) Yang, Yung-Ching
This research proposes an utterance embedding model that provides disentangled and scalable control over latent attributes in human speech. Our model is formulated as a hierarchical generative model based on the Variational Autoencoder (VAE) framework, integrated with the FastSpeech2 Text-to-Speech (TTS) system. The work demonstrates that generative networks developed for hierarchical pattern learning in the image domain can be adapted to model complex distributions in speaking styles and prosody. This work merges advancements in VAE research, particularly those addressing critical statistical challenges such as posterior collapse and unbounded KL divergence, with recent studies focusing on structural enhancements of VAE architectures. We introduce a hierarchical structure in latent variable modeling and augment the learning objective with hierarchical information to ensure the latent variables at each level are hierarchically factorized. This approach learns a smooth latent prosody space and deepens our understanding of the relationship between the hierarchical nature of prosody and neural network architecture. Through our customized control mechanism, integrated into various levels of the latent spaces, the model is capable of manipulating prosodic elements, allowing for both independent and scalable adjustments. By incorporating these techniques, our model captures a wide range of prosodic variations, offering a refined level of control and expressiveness in speech synthesis in unsupervised learning contexts.

Item Open Access
Gender identity in language models : an inclusive approach to data creation and probing (2024) Knupleš, Urban
Gender identity encompasses a broad spectrum that goes beyond traditional cisnormative views. In applications of pre-trained language models (PLMs), such as identity verification systems, cisnormative practices can harm individuals, for instance by misinterpreting non-binary identities as non-human (Dev et al., 2021). Considering the black-box nature of PLMs, such harmful classification raises questions about the information encoded in the models' representations. While cisgender identity information is known to be encoded in these representations (Lauscher et al., 2022), the (potentially biased) encoding for transgender and non-binary individuals remains unknown. In this work, we examine the encoding of gender identity information in the representations of PLMs for transgender and non-binary individuals. We first propose a corpus creation pipeline that results in the TRANsCRIPT corpus, containing text from transgender, cisgender, and non-binary individuals. We continue with a sociolinguistic analysis investigating differences in language use across the gender identity groups in TRANsCRIPT. Furthermore, we use TRANsCRIPT to explore the encoding of gender identity information in the representations of PLMs by applying probing techniques to their (1) frozen and (2) topic-controlled frozen representations. Finally, we fine-tune the PLMs on an explicit signal. Our findings reveal that gender identity information is encoded in the representations of PLMs for transgender, cisgender, and non-binary individuals. We find that the encodings are intrinsically gender-biased, and that fine-tuning further amplifies this bias into gender-biased predictions. These findings highlight the harmful effects that biased representations in downstream tasks can have on transgender and non-binary individuals. Ultimately, this work underscores the importance of considering transgender and non-binary individuals when developing and assessing language technologies.

Item Open Access
Active learning strategies for deep learning based question answering models (2024) Lin, Kuan-Yu
Question answering (QA) systems enable machines to understand human language, requiring robust training on related datasets. However, large, high-quality datasets are not always available due to cost constraints. Active learning (AL) addresses this challenge by selecting small subsets of data with high information value for model training, conserving computational resources while preserving performance. There are many ways to estimate the information value of data, which leads to a variety of AL strategies. In this study, we investigate how the performance of a QA system changes under various AL strategies. In addition, we compare the BatchBALD strategy with its predecessor, BALD, to inspect the advantages of batch querying in data selection. Finally, we propose Unique Context Selection (UC) and Unique Embedding Selection (UE) to enhance sampling effectiveness by ensuring maximal diversity of contexts and embeddings, respectively, within the queried samples. The experimental results show that each dataset has an AL strategy that brings out its best results, and there is no universally optimal AL strategy for QA tasks. BatchBALD matches the modeling results of BALD in the regular setting while significantly reducing computation time, though this advantage does not carry over to the low-resource setting. UC could not enhance the effectiveness of AL, since half of the datasets used in this study consist of more than 65% unique contexts. The effect of UE enhancement varies across datasets and AL strategies, but most of the AL strategies with the best UE enhancement gain more than 0.5% F1. Compared with context, which is a dataset feature specific to natural language processing tasks, embeddings are more general and show a good enhancement effect, making them worth studying in depth.
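As a simple illustration of selecting data by information value, here is a sketch of entropy-based uncertainty sampling, one of the simplest AL acquisition functions. It is illustrative only: BALD and BatchBALD score examples by mutual information over stochastic forward passes rather than plain predictive entropy, and the UC/UE methods above add diversity constraints:

```python
# Sketch of an AL acquisition step: score unlabeled pool examples by
# predictive entropy and query the most uncertain ones for labeling.
import numpy as np

def predictive_entropy(probs: np.ndarray) -> np.ndarray:
    """probs: (n_examples, n_classes) predicted class probabilities."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

def select_batch(probs: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k highest-entropy pool examples."""
    return np.argsort(-predictive_entropy(probs))[:k]

# Example: pick 2 of 4 pool examples for annotation.
pool_probs = np.array([[0.9, 0.1], [0.5, 0.5], [0.6, 0.4], [0.99, 0.01]])
print(select_batch(pool_probs, k=2))  # -> [1 2]
```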
Item Open Access
Just-in-time pruning of large prompted language models (2024) Bareiß, Patrick
Prompting of transformer-based autoregressive large language models (LLMs) is a powerful and ergonomic approach to solving language processing tasks (sentiment analysis, question answering, ...) using little data. However, for each individual task/prompt it is also wasteful: only a fraction of the generality in the underlying model is ever used. This raises the question: is this also reflected in the model, in the form of irrelevant parts (for instance, layers) that do not affect task-specific accuracy when pruned away? Previous work does not address this question at a level that preserves the natural affordances of prompting: (1) low data requirements and (2) the ability to reuse the same deployed model for multiple tasks. We propose a new approach that identifies and removes irrelevant model parts if they exist, requires no additional data (we generate it instead), and removes irrelevant parts only just before a prompt is passed as input to the model, i.e., just-in-time. After the prompt has been processed, we re-add the removed parts, so the model can be reused and (potentially different) irrelevant parts can be removed for another prompt. We identify a class of model parts for which pruning and re-adding is efficient and therefore allows for efficient just-in-time pruning. In our experiments we find that irrelevant just-in-time prunable model parts do exist for many prompts (for Mistral-7B and GPT-2-XL) and can be removed to a substantial degree, reducing FLOPs by up to 46% while preserving the accuracy of the original model.

Item Open Access
Multilingual prompt engineering via large language models : an approach to sentiment analysis (2024) Huszár, Pascal
Exploring the efficacy of multilingual prompt engineering for sentiment analysis reveals a promising avenue for extending the adaptability of large language models (LLMs) beyond the confines of the predominant English. The core ambition is to devise strategies for transferring effective English instructions into the target language. These strategies exploit the remarkable capability of LLMs to extract information and learn new tasks from the context of a few demonstrations, known as in-context learning. In this research, the strategies leverage both monolingual and cross-lingual prompt templates, augmented with demonstrations. Furthermore, the instruction generation process is supported by an iterative rephrasing approach that refines instructions into their optimal counterparts. The investigation proceeds through a careful analysis of how multilingual instruction generation benefits from incorporating demonstrations, either in English or in the target language, within the prompt template. The results substantiate that iteratively rephrasing instructions further improves the effectiveness of instruction generation, underscoring the proficiency of LLMs in following such requests. Through this exploration, it emerges that automatic prompt engineering methods exhibit potential in multilingual contexts. The findings advocate for broader use of demonstration learning and iterative refinement techniques in multilingual prompt engineering, aiming to universalize the application of large language models across diverse communities and languages. This study not only fills the gap identified in previous research regarding the effectiveness of automatic prompt engineering methods for non-English languages but also facilitates broader access to generative AI for linguistic communities.
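The iterative rephrasing idea can be sketched as a simple search loop: repeatedly ask an LLM to paraphrase the current best instruction and keep the variant that scores highest on a small development set. This is a hedged illustration; the `llm` and `evaluate` functions, the prompts, and the loop parameters are hypothetical placeholders, not the thesis's implementation:

```python
# Sketch of iterative instruction rephrasing: keep the paraphrase
# that performs best on a small dev set. `llm` and `evaluate` are
# hypothetical stand-ins, not the components used in the thesis.

def llm(prompt: str) -> str:
    raise NotImplementedError  # call to a large language model

def evaluate(instruction: str) -> float:
    raise NotImplementedError  # e.g. sentiment-analysis accuracy on a dev set

def refine_instruction(instruction: str, rounds: int = 3, n_variants: int = 4) -> str:
    best, best_score = instruction, evaluate(instruction)
    for _ in range(rounds):
        for _ in range(n_variants):
            variant = llm(f"Rephrase this task instruction, keeping its meaning:\n{best}")
            score = evaluate(variant)
            if score > best_score:
                best, best_score = variant, score
    return best
```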