05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6

Search Results

Now showing 1 - 10 of 12
  • Item (Open Access)
    Multilingual Analysis of Food Words
    (2019) Chen, Xu
    In this thesis, we explore food cultures in British (henceforth English) and Chinese and find a mapping from word embedding space to cultural space. According to situational and embodied theories of cognition, the mental representation of words is heavily influenced by the events and situations in which the corresponding entities are experienced. More specifically, the experience of food depends strongly on our daily life. The environment varies across regions and countries; this variation is what we call culture. Thus, an evaluation of food can tell us not only about the food itself but also about food-related events. Based on this assumption, we proposed an online survey of 200 different food words in English and Chinese. The collected norms capture various aspects of food that are heavily culture-specific. On the other hand, we use word embeddings to predict cultural properties that are more linguistic in nature. The experimental results help us answer the question of whether cultural behavior can be predicted through language.
  • Item (Open Access)
    A visual analytics approach for explainability of deep neural networks
    (2018) Kuznecov, Paul
    Deep Learning has advanced the state of the art in many fields, including machine translation, where Neural Machine Translation (NMT) has become the dominant approach in recent years. However, NMT still faces many challenges, such as domain adaptation, over- and under-translation, and handling long sentences, making the need for human translators apparent. Additionally, NMT systems pose the problems of explainability, interpretability, and interaction with the user, creating a need for better analytics systems. This thesis introduces NMTVis, an integrated Visual Analytics system for NMT aimed at translators. The system supports users in multiple tasks during translation: finding, filtering, and selecting machine-generated translations that possibly contain translation errors, interactive post-editing of machine translations, and domain adaptation from user corrections to improve the NMT model. Multiple metrics are proposed as a proxy for translation quality to allow users to quickly find sentences for correction using a parallel coordinates plot. Interactive, dynamic graph visualizations are used to enable exploration and post-editing of translation hypotheses by visualizing beam search and attention weights generated by the NMT model. A web-based user study showed that a majority of participants rated the system positively regarding functional effectiveness, ease of interaction, and intuitiveness of visualizations. The user study also revealed a preference for NMTVis over traditional text-based translation systems, especially for large documents. Additionally, automated experiments were conducted which showed that using the system can reduce post-editing effort and improve translation quality for domain-specific documents.
  • Item (Open Access)
    Neural-based methods for user simulation in dialog systems
    (2018) Schmidt, Maximilian
    Spoken Dialog Systems allow users to interact with a Dialog Manager (DM) using natural language, thereby following a goal to fulfill their task. State-of-the-art solutions cast the problem as a Markov Decision Process, leveraging Reinforcement Learning (RL) algorithms to find an optimal dialog strategy for the DM. For this purpose, several thousand dialogs need to be seen by the RL agent. A user simulator comes in handy to generate responses on demand; however, the current state-of-the-art agenda-based user simulators lack the ability to model real human subjects. In this thesis, this problem is addressed by implementing a user simulator using a Recurrent Neural Network, which approximates the agenda-based model in a first step. Going further, it is shown to learn noise and variance, which are treated as varying user behavior. This is used to train the simulator on real data, thus modeling real users.
  • Item (Open Access)
    Exploring simplified subtitles to support spoken language understanding
    (2018) Angerbauer, Katrin
    Understanding spoken language is a crucial skill we need throughout our lives. Yet, it can be difficult for various reasons, especially for those who are hard-of-hearing or just learning to speak a language. Captions or subtitles are a common means to make spoken information accessible. Verbatim transcriptions of talks or lectures are often cumbersome to read, as we generally speak faster than we read. Thus, subtitles are often edited to improve their readability, either manually or automatically. This thesis explores the automatic summarization of sentences and employs the method of sentence compression by deletion with recurrent neural networks. We tackle the task of sentence compression from different directions. On the one hand, we look at a technical solution for the problem. On the other hand, we look at the human-centered perspective by investigating the effect of compressed subtitles on comprehension and cognitive load in a user study. Thus, the contribution is twofold: we present a neural network model for sentence compression and the results of a user study evaluating the concept of simplified subtitles. Regarding the technical aspect, 60 different configurations of the model were tested. The best-scoring models achieved results comparable to state-of-the-art approaches. We use a Sequence to Sequence architecture together with a compression ratio parameter to control the resulting compression ratio. A compression ratio accuracy of 42.1% was achieved for the best-scoring model configuration, which can be used as a baseline for future experiments in that direction. Results from the 30 participants of the user study show that shortened subtitles could be enough to foster comprehension, but result in higher cognitive load. Based on that feedback, we gathered design suggestions to improve the usability of future implementations.
Overall, this thesis provides insights on the technological side as well as from the end-user perspective to contribute to easier access to spoken language.
  • Item (Open Access)
    Information extraction from social media for route planning
    (2012) Megally, Mirna
    Micro-blogging is an emerging form of communication that has become very popular in recent years. Micro-blogging services allow users to publish updates as short text messages that are broadcast to the followers of users in real time. Twitter is currently the most popular micro-blogging service. It is a rich, real-time information source and a good way to discover interesting content or to follow recent developments. Additionally, the updates published on Twitter's public timeline can be retrieved through its API. A significant amount of traffic information exists on the Twitter platform. Twitter users tweet when they are in traffic about accidents, road closures, or road construction. With this in mind, this paper presents a system that extracts traffic information from Twitter to be used in route planning. Route planning is of increasing importance as societies try to reduce their energy consumption. Furthermore, route planning is concerned with two types of constraints: stable ones, such as the distance between two points, and temporary ones, such as weather conditions, traffic jams, or road construction. Our system attempts to extract these temporary constraints from Twitter. We train Naive Bayes, MaxEnt, and SVM classifiers to filter out tweets that are not relevant to traffic. We then apply NER to traffic tweets to extract locations, highways, and directions. These extracted locations are then geocoded and used in route planning to avoid routes with traffic jams.
  • Item (Open Access)
    Deep reinforcement learning in dialog systems
    (2018) Väth, Dirk
    This thesis explores advanced deep reinforcement learning methods for learning dialog policies. While many recent contributions in the area of reinforcement learning focus on learning how to play Atari games, this thesis applies them in a real-world scenario. When talking to a dialog system, the dialog policy is the component which chooses the response based on the history of the interaction between user and system. Nowadays, dialog policies may be learned automatically by training a reinforcement learning agent with a user simulator. In this thesis, a baseline method for dialog policy learning is implemented and extended by various state-of-the-art deep reinforcement learning methods. An ablation study discusses the significance of each extension, highlighting beneficial and harmful additions. Each extended agent is shown to perform better than the baseline method, with all agents outperforming policies from an existing benchmark. Two agents even prove to be on par with handcrafted dialog policies. Along with the quantitative evaluation, qualitative results are provided in the form of chats between a user and a trained agent.
  • Item (Open Access)
    Conception and implementation of a vocal assistant for the use in vehicle diagnostics
    (2019) Yacoub, Mousa
    One of the most important fields that has been actively researched and developed lately is that of virtual/vocal assistants. Developers and companies are extending their abilities at a rapid rate. Such assistants can nowadays perform everyday tasks that users commonly do, such as booking a flight, ordering products, or managing appointments. Since this technology is continually being extended and developed further, this work investigates how far it can be adopted in the field of vehicle diagnosis. Concretely, we want to develop an Alexa skill that will be used by technicians at workshops to run the vehicle diagnosis process. The main goal is to avoid any physical interaction with the diagnosis system and at the same time to increase usability. This is the biggest advantage of vocal assistants: we can operate the diagnosis system vocally through a conversation with the developed Alexa skill, without the need to interact with any keyboard or touch-screen enabled systems. The thesis is organized as follows: First, we start with an introduction to the topic, where we discuss the motivation and the problem we want to solve. In the second section, we review some fundamentals for a better understanding. In section 3 we review related research, and in section 4 we introduce the technology used in this thesis, particularly how to build an Alexa skill and which components it consists of. Section 5 reviews my solution approach and the steps I took to reach the goal of the thesis. The implementation and technical details are covered in section 6. To evaluate the thesis and the resulting product, a study was conducted. The concept and results of this study are discussed in section 7. The last section includes the conclusion and promising approaches that could be developed on top of the results of this thesis to address further limitations.
  • Item (Open Access)
    Characteristics of Neighbourhood Vector Spaces for Abstract and Concrete Words
    (2019) Bräuninger, Maximilian
    The concreteness and abstractness of words is an important concept in psycholinguistics and computational linguistics. Concrete words can be directly experienced using at least one of the five human senses. Abstract words cannot be experienced directly but have to be described using other words. According to the Context Availability Theory, in order to evoke the meaning of a word it is necessary to create an appropriate context. The theory suggests differences between the concepts of concrete and abstract words. Computational studies have provided evidence for the Context Availability Theory, as concrete words seem to appear in a specific, small set of contexts, whereas abstract words appear in a broader, more general context. Further pursuing this hypothesis, several different neighbourhood similarity measures are used on five different vector space models representing the neighbourhood of concrete and abstract target words. This work presents an in-depth discussion and analysis of characteristics of both concrete and abstract target words regarding their context dimensions as well as other neighbours. Furthermore, two different dimensionality reduction techniques are used on the original vector space in order to produce low-dimensional representations of the concrete and abstract neighbourhoods.
  • Item (Open Access)
    Improving speech emotion recognition via generative adversarial networks
    (2019) Bao, Fang
    Speech emotion recognition (SER) is a significant research topic in human-computer interaction. One of the major problems in SER is data scarcity. This master’s thesis aims to investigate a novel data augmentation method based on cycle-consistent adversarial networks (CycleGANs). It transfers feature vectors extracted from an unlabeled speech corpus into the domains of target emotions. Furthermore, the CycleGAN framework is extended with a classification loss which improves the discriminability between the generated data. The quality of the synthetic data is evaluated in both within-corpus and cross-corpus SER experiments. Both show an improvement in classification performance with augmented data. Additionally, two meaningful problems encountered in our training process are discussed and analyzed.
  • Item (Open Access)
    How word-embedding methods improve information extraction and can be used for multilingual approaches
    (2018) Zendler, Ulrich
    Expanding entity sets and extracting relations are key tasks in natural language processing (NLP), which are accomplished by various approaches. Recent successful attempts all use word-embeddings like the ones presented by Mikolov et al. While most work concentrates on how to improve these tasks in general without considering a specific domain, it is of interest how to achieve even higher precision when focusing on a specific domain and optimizing the methods towards a single purpose. Therefore, this thesis suggests methods and adjustments to optimize the proposals for entity set expansion for the domain of drugs. While this is the main purpose of this thesis, it also presents a novel idea for improving the precision of relation extraction by using word-embeddings, which could be combined with existing successful relation extraction methods. Finally, another key concern of many international companies is addressed by presenting a solution for a multilingual information extraction system (IES), which is capable of preprocessing text in multiple languages, expanding entity sets independent of the language used, and extracting relations from the texts.