05 Fakultät Informatik, Elektrotechnik und Informationstechnik
Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6
56 results
Item Open Access: Task-oriented specialization techniques for entity retrieval (2020). Glaser, Andrea; Kuhn, Jonas (Prof. Dr.)
Finding information on the internet has become very important nowadays, and online encyclopedias and websites specialized in certain topics offer users a great amount of information. Search engines support users in finding information. However, the vast amount of information makes it difficult to separate relevant from irrelevant facts for a specific information need. In this thesis we explore two areas of natural language processing in the context of retrieving information about entities: named entity disambiguation and sentiment analysis. The goal of this thesis is to use methods from these areas to develop task-oriented specialization techniques for entity retrieval. Named entity disambiguation is concerned with linking referring expressions (e.g., proper names) in text to their corresponding real-world or fictional entity. Identifying the correct entity is an important factor in finding information on the internet, as many proper names are ambiguous and need to be disambiguated to find relevant information. To that end, we introduce the notion of r-context, a new type of structurally informed context. An r-context consists only of sentences that are relevant to the entity, so as to capture all important context clues while avoiding noise. We then show the usefulness of this r-context in a systematic study on a pseudo-ambiguity dataset. Identifying lesser-known named entities is a challenge in named entity disambiguation because there is usually not much data available from which a machine learning algorithm can learn. We propose an approach that aggregates textual data about other entities which share certain properties with the target entity, learns from this aggregate using topic modelling, and uses the result to disambiguate the lesser-known target entity.
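The disambiguation idea described above, aggregating texts about related entities and matching them against the mention context, might be sketched as follows. This is a hypothetical illustration, not the thesis's system: simple bag-of-words profiles and cosine similarity stand in for the topic models, and all entity names and texts are invented.

```python
# Minimal sketch of entity disambiguation via aggregated textual profiles.
# Bag-of-words cosine similarity stands in for the thesis's topic models.
from collections import Counter
import math

def profile(texts):
    """Aggregate texts about related entities into one word-count profile."""
    counts = Counter()
    for t in texts:
        counts.update(t.lower().split())
    return counts

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def disambiguate(context, candidates):
    """Pick the candidate whose aggregated profile best matches the context."""
    ctx = Counter(context.lower().split())
    return max(candidates, key=lambda name: cosine(ctx, candidates[name]))

candidates = {
    "Paris (city)": profile(["capital city france seine museum tourism"]),
    "Paris (mythology)": profile(["trojan prince helen troy myth homer"]),
}
print(disambiguate("the prince of troy abducted helen", candidates))
```

In the actual approach, the profiles would be derived from Wikipedia-linked entities sharing properties with the target rather than from hand-written snippets.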
We use a dataset that is created automatically by exploiting the link structure of Wikipedia, and show that our approach is helpful for disambiguating entities without training material and with little surrounding context. Retrieving the relevant entities and information can produce many search results. Thus, it is important to present the information to a user effectively. We regard this step as going beyond entity retrieval and employ sentiment analysis, which is used to analyze opinions expressed in text, in the context of effectively displaying information about product reviews to a user. We present a system that extracts a supporting sentence: a single sentence that captures both the sentiment of the author and a supporting fact. This supporting sentence can be used to provide users with an easy way to assess information in order to make informed choices quickly. We evaluate our approach using the crowdsourcing service Amazon Mechanical Turk.

Item Open Access: Elastic parallel systems for high performance cloud computing (2020). Kehrer, Stefan; Blochinger, Wolfgang (Prof. Dr.)
High Performance Computing (HPC) enables significant progress in both science and industry. Whereas parallel applications have traditionally been developed to address the grand challenges in science, today they are also heavily used to speed up the time-to-result in product design, production planning, financial risk management, medical diagnosis, and research and development. However, purchasing and operating HPC clusters to run these applications requires huge capital expenditures as well as operational knowledge and is thus reserved for large organizations that benefit from economies of scale. More recently, the cloud has evolved into an alternative execution environment for parallel applications, which comes with novel characteristics such as on-demand access to compute resources, pay-per-use, and elasticity.
Whereas the cloud has mainly been used to operate interactive multi-tier applications, HPC users are also interested in the benefits it offers. These include full control of the resource configuration based on virtualization, fast setup times by using on-demand accessible compute resources, and the elimination of upfront capital expenditures due to the pay-per-use billing model. Additionally, elasticity allows compute resources to be provisioned and decommissioned at runtime, which enables fine-grained control of an application's performance in terms of its execution time and efficiency as well as the related monetary costs of the computation. Whereas HPC-optimized cloud environments have been introduced by cloud providers such as Amazon Web Services (AWS) and Microsoft Azure, existing parallel architectures are not designed to make use of elasticity. This thesis addresses several challenges in the emerging field of High Performance Cloud Computing. In particular, the presented contributions focus on the novel opportunities and challenges related to elasticity. First, the principles of elastic parallel systems as well as related design considerations are discussed in detail. On this basis, two exemplary elastic parallel system architectures are presented, each of which includes (1) an elasticity controller that controls the number of processing units based on user-defined goals, (2) a cloud-aware parallel execution model that handles coordination and synchronization requirements in an automated manner, and (3) a programming abstraction to ease the implementation of elastic parallel applications. To automate application delivery and deployment, novel approaches are presented that generate the required deployment artifacts from developer-provided source code in an automated manner while considering application-specific non-functional requirements.
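A reactive elasticity controller of the kind named as component (1) can be sketched minimally: scale out while measured parallel efficiency exceeds a user-defined goal, and scale in when it falls well below. This is an illustrative sketch with invented thresholds and a simple doubling/halving policy, not the controllers developed in the thesis.

```python
# Hypothetical reactive elasticity controller: adjusts the number of processing
# units once per control interval based on measured parallel efficiency.
# Thresholds and the doubling/halving policy are invented for illustration.
def control_step(units, efficiency, target=0.75, min_units=1, max_units=64):
    """Return the number of processing units for the next control interval."""
    if efficiency > target and units < max_units:
        return min(units * 2, max_units)   # efficiency above goal: scale out
    if efficiency < target * 0.8 and units > min_units:
        return max(units // 2, min_units)  # efficiency far below goal: scale in
    return units                           # within the goal corridor: hold

print(control_step(8, 0.9))   # scales out to 16
print(control_step(16, 0.4))  # scales in to 8
```

A proactive controller would instead act on a forecast of the workload rather than on the last measured efficiency; both variants are discussed in the thesis.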
Throughout this thesis, a broad spectrum of design decisions related to the construction of elastic parallel system architectures is discussed, including proactive and reactive elasticity control mechanisms as well as cloud-based parallel processing with virtual machines (Infrastructure as a Service) and functions (Function as a Service). To evaluate these contributions, extensive experimental evaluations are presented.

Item Open Access: Data-efficient and safe learning with Gaussian processes (2020). Schreiter, Jens; Toussaint, Marc (Prof. Dr. rer. nat.)
Data-based modeling techniques enjoy increasing popularity in many areas of science and technology where traditional approaches are limited with regard to accuracy and efficiency. When employing machine learning methods to generate models of dynamic systems, two important issues must be considered. Firstly, the data-sampling process should induce an informative and representative set of points to enable high generalization accuracy of the learned models. Secondly, the algorithmic part for efficient model building is essential for applicability, usability, and the quality of the learned predictive model. This thesis deals with both of these aspects for supervised learning problems, exploiting the interaction between them to realize accurate and powerful modeling. After introducing the non-parametric Bayesian modeling approach with Gaussian processes and the basics of transient modeling tasks, we turn to extensions of this probabilistic technique that address relevant practical requirements. The corresponding chapter provides an overview of existing sparse Gaussian process approximations and proposes novel methods to increase efficiency and improve model selection on particularly large training data sets. For example, our sparse modeling approach enables real-time capable prediction performance and efficient learning with low memory requirements.
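As a rough illustration of the sparse Gaussian process idea (not the specific approximations proposed in the thesis), the following sketch computes the predictive mean of a subset-of-regressors style approximation with a handful of inducing points; the kernel, data, and hyperparameters are invented.

```python
# Sketch of sparse GP regression with m inducing points (subset-of-regressors
# style approximation); all data, kernel, and hyperparameters are invented.
import numpy as np

def rbf(A, B, ls=1.0):
    """Squared-exponential kernel matrix between row-stacked inputs A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def sparse_gp_mean(X, y, Z, Xs, noise=1e-2):
    """Predictive mean at test inputs Xs using m inducing points Z (m << n)."""
    Kmm = rbf(Z, Z) + 1e-8 * np.eye(len(Z))  # jitter for numerical stability
    Kmn = rbf(Z, X)
    Ksm = rbf(Xs, Z)
    # mean = Ksm (noise * Kmm + Kmn Knm)^-1 Kmn y   (subset of regressors)
    A = noise * Kmm + Kmn @ Kmn.T
    return Ksm @ np.linalg.solve(A, Kmn @ y)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (200, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(200)
Z = np.linspace(-3, 3, 10)[:, None]  # only 10 inducing points instead of 200
pred = sparse_gp_mean(X, y, Z, np.array([[0.0], [1.5]]))
print(pred)  # close to sin(0) = 0 and sin(1.5) ~ 1.0
```

The point of the approximation is that all linear algebra involves only the m x m and m x n blocks, so memory and runtime scale with the number of inducing points rather than with the full training set.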
A comprehensive comparison on various real-world problems confirms the proposed contributions and shows a variety of modeling tasks where approximate Gaussian processes can be successfully applied. Further experiments provide more insight into the whole learning process and thus a profound understanding of the presented work. In the fourth chapter, we focus on active learning schemes for the safe and information-optimal generation of meaningful data sets. In addition to the exploration behavior of the active learner, our work considers the safety issue, since interacting with real systems should not damage or even destroy them. We propose a new model-based active learning framework to solve both tasks simultaneously. As the basis for the data-sampling process we employ the presented Gaussian process techniques. Furthermore, we distinguish between static and transient experimental design strategies, which are considered separately in this chapter. Nevertheless, the requirements for each active learning problem are the same. This subdivision into a static and a transient setting allows a more problem-specific perspective on the two cases and thus enables the creation of specially adapted active learning algorithms. Our novel approaches are then investigated for different applications, where a favorable trade-off between safety and exploration is always realized. Theoretical results support these evaluations and provide solid knowledge about the derived model-based active learning schemes. For example, an upper bound for the probability of failure of the presented active learning methods is derived under reasonable assumptions. Finally, the thesis concludes with a summary of the investigated machine learning problems and motivates some future research directions.

Item Open Access: Models for data-efficient reinforcement learning on real-world applications (2021). Dörr, Andreas; Toussaint, Marc (Prof. Dr.)
Large-scale deep Reinforcement Learning is strongly contributing to many recently published success stories of Artificial Intelligence. These techniques have enabled computer systems to autonomously learn and master challenging problems, such as playing the game of Go or complex strategy games such as StarCraft at or above human level. Naturally, the question arises as to which problems could be addressed with these Reinforcement Learning technologies in industrial applications. So far, machine learning technologies based on (semi-)supervised learning create the most visible impact in industrial applications. For example, image, video, and text understanding are primarily dominated by models trained and derived autonomously from large-scale data sets with modern (deep) machine learning methods. Reinforcement Learning, in contrast, deals with temporal decision-making problems and is much less commonly found in the industrial context. In these problems, current decisions and actions inevitably influence the outcome and success of a process much further down the road. This work strives to address some of the core problems that prevent the effective use of Reinforcement Learning in industrial settings. Autonomous learning of new skills is always guided by existing priors that allow for generalization from previous experience. In some scenarios, non-existing or uninformative prior knowledge can be mitigated by vast amounts of experience for a particular task at hand. Typical industrial processes are, however, operated in very restricted, tightly calibrated operating points. Naively exploring the space of possible actions or changes to the process in search of improved performance tends to be costly or even prohibitively dangerous. Therefore, one recurring subject throughout this work is the emergence of priors and model structures that allow for efficient use of all available experience data.
A promising direction is Model-Based Reinforcement Learning, which is explored in the first part of this work. This part derives an automatic tuning method for one of the most common industrial control architectures, the PID controller. By leveraging all available data about the system's behavior in learning a system dynamics model, the derived method can efficiently tune these controllers from scratch. Although we can easily incorporate all data into dynamics models, real systems expose additional problems to the dynamics modeling and learning task. Characteristics such as non-Gaussian noise, latent states, feedback control, or non-i.i.d. data regularly prevent the use of off-the-shelf modeling tools. Therefore, the second part of this work is concerned with the derivation of modeling solutions that are particularly suited for the reinforcement learning problem. Despite the predominant focus on model-based reinforcement learning as a promising, data-efficient learning tool, this work's final part revisits model assumptions in a separate branch of reinforcement learning algorithms. Again, generalization, and therefore efficient learning, in model-based methods is primarily driven by the incorporated model assumptions (e.g., smooth dynamics), which real, discontinuous processes might heavily violate. To this end, a model-free reinforcement learning method is presented that carefully reintroduces prior model structure to facilitate efficient learning without the need for strong dynamics model priors. The methods and solutions proposed in this work are grounded in the challenges experienced when operating real-world hardware systems. With applications on a humanoid upper-body robot and an autonomous model race car, the proposed methods are demonstrated to successfully model and master their complex behavior.

Item Open Access: Analyzing code corpora to improve the correctness and reliability of programs (2021). Patra, Jibesh; Pradel, Michael (Prof. Dr.)
Bugs in software are commonplace, challenging, and expensive to deal with. One widely used direction is to apply program analyses and reason about software to detect bugs. In recent years, the growth of areas like web application development and data analysis has produced large amounts of publicly available source code, primarily written in dynamically typed languages such as Python and JavaScript. It is challenging to reason about programs written in such languages because of the presence of dynamic features and the lack of statically declared types. This dissertation argues that, to build software developer tools for detecting and understanding bugs, it is worthwhile to analyze code corpora, which can uncover code idioms, runtime information, and natural language constructs such as comments. The dissertation is divided into three corpus-based approaches that support our argument. In the first part, we present static analyses over code corpora to generate new programs, to perform mutations on existing programs, and to generate data for effective training of neural models. We provide empirical evidence that the static analyses can scale to thousands of files and that the trained models are useful in finding bugs in code. The second part of this dissertation presents dynamic analyses over code corpora. Our evaluations show that the analyses are effective in uncovering unexpected behaviors when multiple JavaScript libraries are included together and in generating data for training bug-finding neural models. Finally, we show that a corpus-based analysis can be useful for input reduction, which helps developers find a smaller subset of an input that still triggers the required behavior. We envision that this dissertation motivates future endeavors in corpus-based analysis to alleviate some of the challenges faced while ensuring the reliability and correctness of software.
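The mutation idea mentioned above, creating artificial bugs from existing programs to obtain training data for bug-finding models, can be illustrated with a toy AST transformer. This is a generic sketch, not the dissertation's tooling: it flips a single '+' to '-' in a Python snippet, yielding a (buggy, correct) pair.

```python
# Toy corpus-based mutation: flip one binary operator to create an artificial
# bug, as one might do to generate training pairs for a bug-finding model.
import ast

class SwapBinOp(ast.NodeTransformer):
    """Replace the first '+' with '-' to introduce an artificial bug."""
    def __init__(self):
        self.done = False

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if not self.done and isinstance(node.op, ast.Add):
            node.op, self.done = ast.Sub(), True
        return node

def mutate(source):
    """Return a mutated copy of `source` with one '+' flipped to '-'."""
    tree = SwapBinOp().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))

print(mutate("def total(w, h):\n    return w + h"))  # the '+' becomes '-'
```

Run over a large corpus, each original/mutant pair supplies one correct and one buggy example for supervised training.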
One direction is to combine data obtained by static and dynamic analyses over code corpora for training. Another direction is to use meta-learning approaches, where a model is trained using data extracted from the code corpora of one language and used for another language.

Item Open Access: Uncertainty-aware visualization techniques (2021). Schulz, Christoph; Weiskopf, Daniel (Prof. Dr.)
Nearly all information is uncertainty-afflicted. Whether and how we present this uncertainty can have a major impact on how our audience perceives such information. Still, uncertainty is rarely visualized and communicated. One difficulty is that we tend to interpret illustrations as truthful. For example, it is difficult to understand that a drawn point's presence, absence, and location may not convey its full information. Similarly, it may be challenging to classify a point within a probability distribution. One must learn how to interpret uncertainty-afflicted information. Accordingly, this thesis addresses three research questions: How can we identify and reason about uncertainty? What are approaches to modeling the flow of uncertainty through the visualization pipeline? Which methods are suitable for harnessing uncertainty? The first chapter is concerned with sources of uncertainty. Then, approaches to modeling uncertainty using descriptive statistics and unsupervised learning are discussed, and a model for the validation and evaluation of visualization methods is proposed. Further, methods for visualizing uncertainty-afflicted networks, trees, point data, sequences, and time series are presented. The focus lies on the modeling, propagation, and visualization of uncertainty. As encodings of uncertainty, we propose wave-like splines and sampling-based transparency. As an overarching approach to adapting existing visualization methods for uncertain information, we identify the layout process (the placement of objects).
The main difficulty is that these objects are not simple points but distribution functions or convex hulls. We also develop two stippling-based rendering methods that utilize the ability of the human visual system to cope with uncertainty. Finally, I provide insight into possible directions for future research.

Item Open Access: B-splines on sparse grids for uncertainty quantification (2021). Rehme, Michael F.; Pflüger, Dirk (Prof. Dr.)

Item Open Access: Automatic term extraction for conventional and extended term definitions across domains (2020). Hätty, Anna; Schulte im Walde, Sabine (apl. Prof. Dr.)
A terminology is the entirety of concepts which constitute the vocabulary of a domain or subject field. Automatically identifying the various linguistic forms of terms in domain-specific corpora is an important basis for further natural language processing tasks, such as ontology creation or, in general, domain knowledge acquisition. As a brief illustration of terms and domains, expressions like 'hammer', 'jigsaw', 'cordless screwdriver' or 'to drill' can be considered terms in the domain of DIY ('do-it-yourself'); 'beaten egg whites' or 'electric blender' are terms in the domain of cooking. These examples cover different linguistic forms: simple terms like 'hammer' and complex terms like 'beaten egg whites', which consist of several simple words. However, although these words might seem to be obvious examples of terms, in many cases the decision to distinguish a term from a 'non-term' is not straightforward. There is no common, established way to define terms, but there are multiple terminology theories and diverse approaches to conducting human annotation studies. In addition, terms can be perceived to be more or less terminological, and the hard distinction between term and 'non-term' can be unsatisfying.
Beyond term definition, the automatic extraction of terms raises further challenges, considering that complex terms as well as simple terms need to be identified by an extraction system. The extraction of complex terms can profit from exploiting information about their constituents, because complex terms might be infrequent as a whole. Simple terms might be more frequent, but they are especially prone to ambiguity. If a system considers an assumed term occurrence in text that actually carries a different meaning, this can lead to wrong term extraction results. Thus, term complexity and ambiguity are major challenges for automatic term extraction. The present work describes novel theoretical and computational models for these aspects. It can be grouped into three broad categories: term definition studies, conventional automatic term extraction models, and extended automatic term extraction models that are based on fine-grained term frameworks. Term complexity and ambiguity are special foci here. In this thesis, we report insights and improvements on these theoretical and computational models for terminology: We find that terms are concepts that can be intuitively understood by lay people. We test more fine-grained term characterization frameworks that go beyond the conventional term/'non-term' distinction. We are the first to describe and model term ambiguity as gradual meaning variation between general and domain-specific language, and we use the resulting representations to prevent errors typically made by term extraction systems as a result of ambiguity. We develop computational models that exploit the influence of term constituents on the prediction of complex terms. We especially tackle German closed compound terms, which are a frequent complex term type in German.
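The constituent-based idea, letting complex term candidates inherit evidence from their parts, might be sketched as follows. The frequency lists and the scoring formula are invented for illustration and are far simpler than the models developed in the thesis.

```python
# Illustrative term scoring: a candidate's termhood is estimated from the
# domain specificity of its constituents, so a rare complex term still gets
# evidence from its (more frequent) parts. All counts here are invented.
from collections import Counter

domain = Counter("hammer drill cordless screwdriver drill hammer jigsaw".split())
general = Counter("the a house drill time hammer people".split())

def specificity(word):
    """Ratio of domain to general frequency, with add-one smoothing."""
    return (domain[word] + 1) / (general[word] + 1)

def term_score(candidate):
    """Average constituent specificity of a (possibly multi-word) candidate."""
    parts = candidate.split()
    return sum(specificity(p) for p in parts) / len(parts)

print(term_score("cordless screwdriver"))  # high: both parts are domain words
print(term_score("the house"))             # low: general-language words
```

For German closed compounds, the split would come from a compound splitter rather than whitespace, but the inheritance of evidence from constituents works the same way.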
Finally, we find that we can use similar strategies for computationally modeling term complexity and ambiguity for conventional and extended term extraction.

Item Open Access: Parallel-Analog/Digital-Umsetzer für Gigabaud-Applikationen [parallel analog-to-digital converters for gigabaud applications] (2021). Du, Xuan-Quang; Berroth, Manfred (Prof. Dr.-Ing.)
Communication systems with digital signal processors (DSPs) rely on data converters as interface blocks between the analog and the digital domain. The channel data rates in these systems can be increased by choosing a higher symbol rate and/or a more complex modulation format. Both approaches motivate the design of data converters with high sample rates and/or high effective bit resolution. As improving converter linearity in a power-efficient manner is difficult, especially at high operating frequencies, current research on ultrahigh data-rate mm-wave communication systems (e.g., 100 Gbit/s wireless communication) focuses on increasing the symbol rate while keeping the modulation format simple (e.g., quadrature phase shift keying). These systems require data converters with nominal bit resolutions of around 4-8 bit and sample rates of more than 25 GS/s. In order to satisfy future needs for high-speed data converters, new circuit topologies need to be investigated. This work presents the design of a 35.84-GS/s 4-bit analog-to-digital converter (ADC) from its initial idea to its first silicon implementation. The ADC is based on a single-core flash architecture that makes use of a special traveling-wave signal distribution. Contrary to classical approaches with a power-hungry and area-consuming front-end track-and-hold (T/H), no analog preprocessing is needed. Instead, the analog input and the clock signal are directly distributed over a pair of delay-matched transmission lines from one comparator to the next adjacent one.
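For reference, the 'dB (bit)' pairs conventionally quoted for ADC resolution relate SNDR to the effective number of bits (ENOB) via the standard full-scale sine relation ENOB = (SNDR − 1.76 dB) / 6.02 dB:

```python
# Standard relation between SNDR (full-scale sine input) and effective bits:
# ENOB = (SNDR_dB - 1.76) / 6.02
def enob(sndr_db):
    """Effective number of bits for a given SNDR in dB."""
    return (sndr_db - 1.76) / 6.02

print(round(enob(24.6), 1))  # 3.8 effective bits
print(round(enob(19.8), 1))  # 3.0 effective bits
```

The two sample values correspond to the peak SNDR and the 20-GHz SNDR figures reported for this ADC.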
Due to the spatial location of these components, the two signals do not arrive at every comparator at the same time; but since they travel synchronously along the transmission lines, each comparator always sees the same input value at each sampling event. This work gives detailed insight into critical design aspects of this approach and presents new mathematical models to predict the impact of data-to-clock time skews on the converter linearity. Furthermore, essential building components (e.g., linear amplifiers, encoder, etc.) and a real-time digital communication interface for multi-gigabit/s data transmission to external devices are presented. The ADC is implemented in a 130-nm SiGe BiCMOS technology from IHP (SG13G2) and occupies a die area of 1.3 mm^2. For experimental tests, the ADC is wire-bonded onto a specially designed radio frequency (RF) printed circuit board. At a sampling rate of 35.84 GS/s, the peak spurious-free dynamic range (SFDR) is 35.4 dBc and the peak signal-to-noise-and-distortion ratio (SNDR) is 24.6 dB (3.8 bit). The effective resolution bandwidth (ERBW) is 14.52 GHz and covers almost the complete first Nyquist frequency band. Up to input frequencies of 20 GHz, an SFDR of more than 26.7 dBc and an SNDR of more than 19.8 dB (3 bit) are achieved. Even at a sample rate of 40.32 GS/s, full Nyquist performance can be demonstrated (SNDR = 18.4 dB at 20 GHz). The presented ADC improves the sample rate of current state-of-the-art single-core ADCs by 61% from 25 GS/s to 40 GS/s, making it not only the smallest but also the fastest single-core implementation reported to date.

Item Open Access: Automated generation of tailored load tests for continuous software engineering (2021). Schulz, Henning; Hoorn, André van (Dr.-Ing.)
Continuous software engineering (CSE) aims to produce high-quality software through frequent and automated releases of concurrently developed services.
By replaying workloads that are representative of the production environment, load testing can identify quality degradation under realistic conditions. The literature proposes several approaches that extract representative workload models from recorded data. However, these approaches conflict with CSE's high pace and automation in three respects: they require manual parameterization, generate resource-intensive system-level load tests, and lack the means to select appropriate periods from the temporally varying production workload to justify time-consuming testing. This dissertation addresses the automated generation of tailored load tests to reduce the time and resources required for CSE-integrated testing. The tailoring needs to consider the services of interest and select the most relevant workload periods based on their context, such as the presence of a special sale when testing a webshop. Also, we intend to support experts and non-experts with a high degree of automation and abstraction. We develop and evaluate description languages, algorithms, and an automated load test generation approach that integrates workload model extraction, clustering, and forecasting. The evaluation comprises laboratory experiments, industrial case studies, an expert survey, and formal proofs. Our results show that representative context-tailored load tests can be generated by learning a workload model incrementally, enriching it with contextual information, and predicting the expected workload using time series forecasting. For further tailoring the load tests to services, we propose extracting call hierarchies from recorded invocation traces. Dedicated models of evolving manual parameterizations automate the generation process and restore the representativeness of the load tests. Furthermore, the integration of our approach with an automated execution framework enables load testing for non-experts. Following open-science practices, we provide supplementary material online.
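The forecasting step of such a pipeline can be sketched with a minimal stand-in: simple exponential smoothing of a recorded request-rate series to predict the expected workload one step ahead. The numbers are invented, and the thesis builds on richer time-series forecasting methods than this.

```python
# Minimal stand-in for the workload forecasting step: simple exponential
# smoothing of a request-rate series; real pipelines use richer models.
def ses_forecast(series, alpha=0.5):
    """One-step-ahead forecast after smoothing the whole series."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

requests_per_min = [100, 120, 110, 130, 125]  # invented workload sample
print(ses_forecast(requests_per_min))  # 122.5
```

Exactly this class of smoother illustrates the limitation noted in the conclusion: a smoothed level reacts slowly to sharp fluctuations such as load spikes.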
The proposed approach is a suitable solution to the described problem. Future work should refine specific building blocks that the approach leverages, namely the clustering and forecasting techniques adopted from existing work, which we found to be limited in predicting sharply fluctuating workloads such as load spikes.