05 Fakultät Informatik, Elektrotechnik und Informationstechnik
Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6
Search Results
Item Open Access
Modeling the position and inflection of verbs in English to German machine translation (2018)
Ramm, Anita; Fraser, Alexander (Prof. Dr.)

Item Open Access
Using morpho-syntactic and semantic information to improve statistical machine translation (2018)
Di Marco, Marion; Schulte im Walde, Sabine (PD Dr.)
Statistical machine translation systems are derived from word-aligned parallel corpora and typically use no explicit linguistic information. This can lead to generalization problems, particularly when translating morphologically complex languages. This thesis investigates the integration of linguistic information into a translation system that translates into a morphologically complex language: building on a translation system that models the morphology of the target language, syntactic and semantic information is integrated into the system with the aim of improving the modeling of subcategorization and prepositions.

Item Open Access
Computational approaches for German particle verbs: compositionality, sense discrimination and non-literal language (2018)
Köper, Maximilian; Schulte im Walde, Sabine (PD Dr.)
Anfangen (to start) is a German particle verb. Consisting of two parts, a base verb ("fangen") and a particle ("an"), with potentially many or no intervening words in a sentence, particle verbs are highly frequent constructions with special properties. It has been shown that this type of verb represents a serious problem for language technology, due to particle verbs' ambiguity, their ability to occur separated and their seemingly unpredictable behaviour in terms of meaning. This dissertation addresses the meaning of German particle verbs via large-scale computational approaches. The three central parts of the thesis are concerned with computational models for the following components: i) compositionality, ii) senses and iii) non-literal language.
In the first part of this thesis, we shed light on the phenomena by providing information on the properties of particle verbs, as well as the related and prior literature. In addition, we present the first corpus-driven statistical analysis. We use two different approaches for addressing the modelling of compositionality. For both approaches, we rely on large amounts of textual data with an algebraic model for representation to approximate meaning. We put forward the existing methodology and show that the prediction of compositionality can be improved by considering visual information. We model the particle verb senses based only on huge amounts of texts, without access to other resources. Furthermore, we compare and introduce the methods to find and represent different verb senses. Our findings indicate the usefulness of such sense-specific models. We successfully present the first model for detecting the non-literal language of particle verbs in a running text. Our approach reaches high performance by combining the established techniques from metaphor detection with particle verb-specific information. In the last part of the thesis, we approach the regularities and the meaning shift patterns. Here, we introduce a novel data collection approach for accessing the meaning components, as well as a computational model of particle verb analogy. The experiments reveal typical patterns in domain changes. Our data collection indicates that coherent verbs with the same meaning shift represent rather scarce phenomena. In summary, we provide novel computational models to previously unaddressed problems, and we report incremental improvements in the existing approaches. Across the models, we observe that semantically similar or synonymous base verbs behave similarly when combined with a particle. In addition, our models demonstrate the difficulty of particle verbs. 
Finally, our experiments suggest the usefulness of external normative emotion and affect ratings.

Item Open Access
The Taming of the Shrew - non-standard text processing in the Digital Humanities (2018)
Schulz, Sarah; Kuhn, Jonas (Prof. Dr.)
Natural language processing (NLP) has focused on the automatic processing of newspaper texts for many years. With the growing importance of text analysis in various areas such as spoken language understanding, social media processing and the interpretation of text material from the humanities, techniques and methodologies have to be reviewed and redefined, since so-called non-standard texts pose challenges at the lexical and syntactic level, especially for machine-learning-based approaches. Automatic processing tools developed on the basis of newspaper texts show decreased performance on texts with divergent characteristics. Digital Humanities (DH), a field that has risen to prominence in recent decades, holds a variety of examples of this kind of text. For instance, the computational analysis of the relationships of Shakespeare’s dramatic characters requires the adjustment of processing tools to 16th-century English texts in dramatic form. Likewise, the investigation of narrative perspective in Goethe’s ballads calls for methods that can handle German verse from the 18th century. In this dissertation, we put forward a methodology for NLP in a DH environment. We investigate how an interdisciplinary context in combination with specific goals within projects influences the general NLP approach. We suggest thoughtful collaboration and increased attention to the easy applicability of resulting tools as a solution for differences in the store of knowledge between project partners. Projects in DH are not only constituted by the automatic processing of texts but are usually framed by the investigation of a research question from the humanities.
As a consequence, time limitations complicate the successful implementation of analysis techniques especially since the diversity of texts impairs the transferability and reusability of tools beyond a specific project. We answer to this with modular and thus easily adjustable project workflows and system architectures. Several instances serve as examples for our methodology on different levels. We discuss modular architectures that balance time-saving solutions and problem-specific implementations on the example of automatic postcorrection of the output text from an optical character recognition system. We address the problem of data diversity and low resource situations by investigating different approaches towards non-standard text processing. We examine two main techniques: text normalization and tool adjustment. Text normalization aims at the transformation of non-standard text in order to assimilate it to the standard whereas tool adjustment concentrates on the contrary direction of enabling tools to successfully handle a specific kind of text. We focus on the task of part-of-speech tagging to illustrate various approaches toward the processing of historical texts as an instance for non-standard texts. We discuss how the level of deviation from a standard form influences the performance of different methods. Our approaches shed light on the importance of data quality and quantity and emphasize the indispensability of annotations for effective machine learning. In addition, we highlight the advantages of problem-driven approaches where the purpose of a tool is clearly formulated through the research question. Another significant finding to emerge from this work is a summary of the experiences and increased knowledge through collaborative projects between computer scientists and humanists. 
We reflect on various aspects of the elaboration and formalization of research questions in DH and assess the limitations and possibilities of the computational modeling of humanistic research questions. An emphasis is placed on the interplay between expert knowledge about a subject of investigation and the implementation of tools for that purpose, and on the resulting advantages, such as the targeted improvement of digital methods through purposeful manual correction and error analysis. We show obstacles and opportunities and give prospects and directions for future development in this realm of interdisciplinary research.

Item Open Access
A massively parallel combination technique for the solution of high-dimensional PDEs (2018)
Heene, Mario; Pflüger, Dirk (Jun.-Prof. Dr.)
The solution of high-dimensional problems, especially high-dimensional partial differential equations (PDEs) that require the joint discretization of more than the usual three spatial dimensions and time, is one of the grand challenges in high performance computing (HPC). Due to the exponential growth of the number of unknowns (the so-called curse of dimensionality), it is in many cases not feasible to resolve the simulation domain as finely as required by the physical problem. Although the upcoming generation of exascale HPC systems theoretically provides the computational power to handle simulations that are out of reach today, it is expected that this is only achievable with new numerical algorithms that are able to efficiently exploit the massive parallelism of these systems. The sparse grid combination technique is a numerical scheme where the problem (e.g., a high-dimensional PDE) is solved on different coarse and anisotropic computational grids (so-called component grids), which are then combined to approximate the solution with a much higher target resolution than any of the individual component grids.
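As an illustration of the scheme just described, the following minimal sketch enumerates the component grids of the classical combination technique and compares the number of unknowns with a regular full grid. It is not taken from the thesis: the function names, the convention that every level index is at least 1, and the assumption of 2^l + 1 grid points per dimension at level l are all choices made here for the example.

```python
from itertools import product
from math import comb, prod


def combination_scheme(dim, level):
    """Map each component-grid level vector to its combination coefficient.

    Classical scheme (assumed convention, level indices >= 1):
    sum over q = 0..dim-1 of (-1)^q * C(dim-1, q) times all grids
    whose level vector sums to level + (dim - 1) - q.
    """
    scheme = {}
    for q in range(dim):
        coeff = (-1) ** q * comb(dim - 1, q)
        target_sum = level + (dim - 1) - q
        for lv in product(range(1, level + 1), repeat=dim):
            if sum(lv) == target_sum:
                scheme[lv] = coeff
    return scheme


def grid_points(level_vector):
    """Number of points of an anisotropic grid, assuming 2^l + 1 points per dimension."""
    return prod(2 ** l + 1 for l in level_vector)


# Hypothetical setting: a 4-dimensional problem with target level 8.
dim, level = 4, 8
scheme = combination_scheme(dim, level)

# Every component grid has to be computed, whatever the sign of its coefficient.
combined_unknowns = sum(grid_points(lv) for lv in scheme)
full_unknowns = grid_points((level,) * dim)

print(len(scheme), "component grids")
print("unknowns (combination technique):", combined_unknowns)
print("unknowns (regular full grid):   ", full_unknowns)
```

The coefficients always sum to 1, and each component grid can be processed independently of the others. For very small levels or dimensions the component grids can together hold more points than the full grid; the saving only appears as the dimension and target level grow.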
With this scheme, the total number of unknowns being computed is drastically reduced compared to the case when the problem is directly solved on a regular grid with the target resolution. Thus, the curse of dimensionality is mitigated. The combination technique is a promising approach to solve high-dimensional problems on future exascale systems. It offers two levels of parallelism: the component grids can be computed in parallel, independently and asynchronously of each other; and the computation of each component grid can be parallelized as well. This reduces the demand for global communication and synchronization, which is expected to be one of the limiting factors for classical discretization techniques to achieve scalability on exascale systems. Furthermore, the combination technique enables novel approaches to deal with the increasing fault rates expected from these systems. With the fault-tolerant combination technique it is possible to recover from failures without time-consuming checkpoint-restart mechanisms. In this work, new algorithms and data structures are presented that enable a massively parallel and fault-tolerant combination technique for time-dependent PDEs on large-scale HPC systems. The scalability of these algorithms is demonstrated on up to 180225 processor cores on the supercomputer Hazel Hen. Furthermore, the parallel combination technique is applied to gyrokinetic simulations in GENE, a software package for the simulation of plasma microturbulence in fusion devices.

Item Open Access
Vision-based methods for evaluating visualizations (2018)
Netzel, Rudolf; Weiskopf, Daniel (Prof. Dr.)

Item Open Access
Interactive web-based visualization (2018)
Mwalongo, Finian
The visualization of large amounts of data, which cannot be easily copied for processing on a user’s local machine, is not yet a fully solved problem. Remote visualization represents one possible solution approach to the problem, and has long been an important research topic.
Depending on the device used, modern hardware, such as high-performance GPUs, is sometimes not available. This is another reason for the use of remote visualization. Additionally, due to the growing global networking and collaboration among research groups, collaborative remote visualization solutions are becoming more important. The attractiveness of web-based remote visualization is greatly increased by the wide availability of web browsers, which today run on almost all systems, from desktop computers to smartphones. Network bandwidth and latency are the biggest challenges that web-based visualization algorithms have to overcome in order to ensure interactivity. Although available bandwidth is steadily improving, it still grows significantly more slowly than, for example, processor performance, so the impact of this bottleneck keeps increasing. For example, visualization of large dynamic data in low-bandwidth environments can be challenging because it requires continuous data transfer. Moreover, higher bandwidth alone cannot reduce latency, because latency is also affected by factors such as the distance between server and client and network utilization. To overcome these challenges, a combination of techniques is needed to customize the individual processing steps of the visualization pipeline, from efficient data representation to hardware-accelerated rendering on the client side. This thesis first deals with related work in the field of remote visualization with a particular focus on interactive web-based visualization and then presents techniques for interactive visualization in the browser using modern web standards such as WebGL and HTML5.
These techniques enable the visualization of dynamic molecular data sets with more than one million atoms at interactive frame rates using GPU-based ray casting. Due to the limitations which exist in a browser-based environment, the concrete implementation of the GPU-based ray casting had to be customized. Evaluation of the resulting performance shows that GPU-based techniques enable the interactive rendering of large data sets and achieve higher image quality compared to polygon-based techniques. In order to reduce data transfer times and network latency, and improve rendering speed, efficient approaches for data representation and transmission are used. Furthermore, this thesis introduces a GPU-based volume ray marching technique based on WebGL 2.0, which uses progressive brick-wise data transfer as well as multiple levels of detail in order to achieve interactive volume rendering of datasets stored on a server. The concepts and results presented in this thesis contribute to the further spread of interactive web-based visualization. The algorithmic and technological advances that have been achieved form a basis for further development of interactive browser-based visualization applications. At the same time, this approach has the potential to enable future collaborative visualization in the cloud.

Item Open Access
Interacting with large high-resolution display workplaces (2018)
Lischke, Lars; Schmidt, Albrecht (Prof.)
Large visual spaces provide a unique opportunity to communicate large and complex pieces of information; hence, they have been used for hundreds of years for varied content including maps, public notifications and artwork. Understanding and evaluating complex information will become a fundamental part of any office work.
Large high-resolution displays (LHRDs) have the potential to further enhance the traditional advantages of large visual spaces and combine them with modern computing technology, thus becoming an essential tool for understanding and communicating data in future office environments. For successful deployment of LHRDs in office environments, well-suited interaction concepts are required. In this thesis, we build an understanding of how concepts for interaction with LHRDs in office environments could be designed. From the human-computer interaction (HCI) perspective three aspects are fundamental: (1) The way humans perceive and react to large visual spaces is essential for interaction with content displayed on LHRDs. (2) LHRDs require adequate input techniques. (3) The actual content requires well-designed graphical user interfaces (GUIs) and suitable input techniques. Perceptions influence how users can perform input on LHRD setups, which sets boundaries for the design of GUIs for LHRDs. Furthermore, the input technique has to be reflected in the design of the GUI. To understand how humans perceive and react to large visual information on LHRDs, we have focused on the influence of visual resolution and physical space. We show that increased visual resolution has an effect on the perceived media quality and the perceived effort and that humans can overview large visual spaces without being overwhelmed. When the display is wider than 2 m users perceive higher physical effort. When multiple users share an LHRD, they change their movement behavior depending whether a task is collaborative or competitive. For building LHRDs consideration must be given to the increased complexity of higher resolutions and physically large displays. Lower screen resolutions provide enough display quality to work efficiently, while larger physical spaces enable users to overview more content without being overwhelmed. 
To enhance user input on LHRDs in order to interact with large information pieces, we built working prototypes and analyzed their performance in controlled lab studies. We showed that eye-tracking based manual and gaze input cascaded (MAGIC) pointing can enhance target pointing to distant targets. MAGIC pointing is particularly beneficial when the interaction involves visual searches between pointing to targets. We contributed two gesture sets for mid-air interaction with window managers on LHRDs and found that gesture elicitation for an LHRD was not affected by legacy bias. We compared shared user input on an LHRD with personal tablets, which also functioned as a private working space, to collaborative data exploration using one input device together for interacting with an LHRD. The results showed that input with personal tablets lowered the perceived workload. Finally, we showed that variable movement resistance feedback enhanced one-dimensional data input when no visual input feedback was provided. We concluded that context-aware input techniques enhance the interaction with content displayed on an LHRD so it is essential to provide focus for the visual content and guidance for the user while performing input. To understand user expectations of working with LHRDs we prototyped with potential users how an LHRD work environment could be designed focusing on the physical screen alignment and the placement of content on the display. Based on previous work, we implemented novel alignment techniques for window management on LHRDs and compared them in a user study. The results show that users prefer techniques, that enhance the interaction without breaking well-known desktop GUI concepts. Finally, we provided the example of how an application for browsing scientific publications can benefit from extended display space. 
Overall, we show that GUIs for LHRDs should support the user more strongly than GUIs for smaller displays to arrange content meaningfully or to manage and understand large data sets, without breaking well-known GUI metaphors. In conclusion, this thesis adopts a holistic approach to interaction with LHRDs in office environments. Based on enhanced knowledge about user perception of large visual spaces, we discuss novel input techniques for advanced user input on LHRDs. Furthermore, we present guidelines for designing future GUIs for LHRDs. Our work creates the design space of LHRD workplaces and identifies challenges and opportunities for the development of future office environments.

Item Open Access
Efficient modeling and computation methods for robust AMS system design (2018)
Gil, Leandro; Radetzki, Martin (Prof. Dr.-Ing.)
This dissertation addresses the challenge of developing model-based design tools that better support the design of the mixed analog and digital parts of embedded systems. It focuses on the conception of efficient modeling and simulation methods that adequately support emerging system level design methodologies. Starting with a deep analysis of the design activities, many weak points of today’s system level design tools were identified. After considering the modeling and simulation of power electronic circuits for designing low energy embedded systems, a novel signal model that efficiently captures the dynamic behavior of analog and digital circuits is proposed and utilized for the development of computation methods that enable the fast and accurate system level simulation of AMS systems. In order to support a stepwise system design refinement which is based on the essential system properties, behavior computation methods for linear and nonlinear analog circuits based on the novel signal model are presented and compared regarding performance, accuracy and stability with existing numerical and analytical methods for circuit simulation.
The novel signal model in combination with the method proposed to efficiently cope with the interaction of analog and digital circuits as well as the new method for digital circuit simulation are the key contributions of this dissertation because they allow the concurrent state and event based simulation of analog and digital circuits. Using a synchronous data flow model of computation for scheduling the execution of the analog and digital model parts, very fast AMS system simulations are carried out. As the best behavior abstraction for analog and digital circuits may be selected without the need of changing component interfaces, the implementation, validation and verification of AMS systems take advantage of the novel mixed signal representation. Changes on the modeling abstraction level do not affect the experiment setup. The second part of this work deals with the robust design of AMS systems and its verification. After defining a mixed sensitivity based robustness evaluation index for AMS control systems, a general robust design method leading to optimal controller tuning is presented. To avoid over-conservative AMS system designs, the proposed robust design optimization method considers parametric uncertainty and nonlinear model characteristics. The system properties in the frequency domain needed to evaluate the system robustness during parameter optimization are obtained from the proposed signal model. Further advantages of the presented signal model for the computation of control system performance evaluation indexes in the time domain are also investigated in combination with range arithmetic. A novel approach for capturing parameter correlations in range arithmetic based circuit behavior computation is proposed as a step towards a holistic modeling method for the robust design of AMS systems. 
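To illustrate why parameter correlations matter in range arithmetic, the following minimal sketch shows the well-known dependency problem of plain interval arithmetic on a voltage divider ratio. It is not the approach proposed in the dissertation: the `Interval` class, the resistor values and the tolerances are hypothetical, and rounding of the interval bounds is ignored for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi]; outward rounding is deliberately ignored here."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __truediv__(self, other):
        # Restricted to strictly positive intervals, which is enough for resistances.
        if self.lo <= 0 or other.lo <= 0:
            raise ValueError("only positive intervals are supported in this sketch")
        return Interval(self.lo / other.hi, self.hi / other.lo)

    def width(self):
        return self.hi - self.lo


ONE = Interval(1.0, 1.0)

# Hypothetical 100 ohm resistors with 5 % tolerance.
r1 = Interval(95.0, 105.0)
r2 = Interval(95.0, 105.0)

# Divider ratio r2 / (r1 + r2): r2 appears twice, but plain interval
# arithmetic treats both occurrences as independent quantities.
naive = r2 / (r1 + r2)

# Algebraically equivalent form 1 / (1 + r1 / r2) in which every parameter
# appears only once, so no correlation information is lost.
tight = ONE / (ONE + r1 / r2)

print("naive bounds:", naive)
print("tight bounds:", tight)
```

Both expressions describe the same circuit quantity, yet the first one yields wider bounds because the link between the two occurrences of `r2` is lost. Rewriting the expression by hand only works for simple formulas, which is why tracking correlations inside the arithmetic itself is of interest.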
The modeling and computation methods proposed to improve the support of design methodologies and tools for AMS systems are validated and evaluated in the course of this dissertation, considering many aspects of the modeling, simulation, design and verification of a low power embedded system implementing Adaptive Voltage and Frequency Scaling (AVFS) for energy saving.

Item Open Access
Efficient fault tolerance for selected scientific computing algorithms on heterogeneous and approximate computer architectures (2018)
Schöll, Alexander; Wunderlich, Hans-Joachim (Prof. Dr.)
Scientific computing and simulation technology play an essential role in solving central challenges in science and engineering. The high computational power of heterogeneous computer architectures makes it possible to accelerate applications in these domains, which are often dominated by compute-intensive mathematical tasks. Scientific, economic and political decision processes increasingly rely on such applications and therefore induce a strong demand to compute correct and trustworthy results. However, the continued semiconductor technology scaling increasingly imposes serious threats to the reliability and efficiency of upcoming devices. Different reliability threats can cause crashes or erroneous results without indication. Software-based fault tolerance techniques can protect algorithmic tasks by adding appropriate operations to detect and correct errors at runtime. Major challenges are induced by the runtime overhead of such operations and by rounding errors in floating-point arithmetic that can cause false positives. The end of Dennard scaling induces central challenges to further increase the compute efficiency between semiconductor technology generations. Approximate computing exploits the inherent error resilience of different applications to achieve efficiency gains with respect to, for instance, power, energy, and execution times.
However, scientific applications often induce strict accuracy requirements which require careful utilization of approximation techniques. This thesis provides fault tolerance and approximate computing methods that enable the reliable and efficient execution of linear algebra operations and Conjugate Gradient solvers using heterogeneous and approximate computer architectures. The presented fault tolerance techniques detect and correct errors at runtime with low runtime overhead and high error coverage. At the same time, these fault tolerance techniques are exploited to enable the execution of the Conjugate Gradient solvers on approximate hardware by monitoring the underlying error resilience while adjusting the approximation error accordingly. Besides, parameter evaluation and estimation methods are presented that determine the computational efficiency of application executions on approximate hardware. An extensive experimental evaluation shows the efficiency and efficacy of the presented methods with respect to the runtime overhead to detect and correct errors, the error coverage as well as the achieved energy reduction in executing the Conjugate Gradient solvers on approximate hardware.
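
As a small illustration of software-based error detection for linear algebra operations, the following sketch protects a matrix-vector product with a classic column checksum and a rounding-aware tolerance. It is a generic textbook-style example rather than the method developed in the thesis: the function names, the tolerance heuristic and the test data are assumptions made here, and the sketch only detects errors without correcting them.

```python
def matvec(a, x):
    """Plain matrix-vector product y = A x on nested lists."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def column_checksum(a):
    """Column sums of A, i.e. the row vector 1^T A."""
    return [sum(col) for col in zip(*a)]


def check_matvec(a, x, y, rel_tol=1e-9):
    """Return True if y is consistent with A x according to the checksum.

    In exact arithmetic sum(y) equals (1^T A) x. In floating point the two
    values differ slightly, so the comparison uses a tolerance scaled by the
    magnitude of the data; a bound that is too tight would raise false alarms.
    """
    expected = sum(c * xj for c, xj in zip(column_checksum(a), x))
    actual = sum(y)
    scale = sum(abs(aij) * abs(xj) for row in a for aij, xj in zip(row, x))
    return abs(actual - expected) <= rel_tol * scale


# Hypothetical data for the example.
a = [[0.1, 0.2, 0.3],
     [0.4, 0.5, 0.6],
     [0.7, 0.8, 0.9]]
x = [1.0, 2.0, 3.0]

y = matvec(a, x)
print("fault-free result accepted:", check_matvec(a, x, y))

# Inject an artificial error into one output element.
y_faulty = list(y)
y_faulty[1] += 1e-3
print("corrupted result accepted: ", check_matvec(a, x, y_faulty))
```

The choice of `rel_tol` reflects the trade-off mentioned in the abstract: a very small tolerance flags harmless rounding differences as errors, while a large one lets small corruptions pass undetected.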