05 Fakultät Informatik, Elektrotechnik und Informationstechnik
Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6
Item Open Access Causal models for decision making via integrative inference (2017) Geiger, Philipp; Toussaint, Marc (Prof. Dr.)
Understanding causes and effects is important in many parts of life, especially when decisions have to be made. The systematic inference of causal models remains a challenge though. In this thesis, we study (1) "approximative" and "integrative" inference of causal models and (2) causal models as a basis for decision making in complex systems. By "integrative" here we mean including and combining settings and knowledge beyond the outcome of perfect randomization or pure observation for causal inference, while "approximative" means that the causal model is only constrained but not uniquely identified. As a basis for the study of topics (1) and (2), which are closely related, we first introduce causal models, discuss the meaning of causation and embed the notion of causation into a broader context of other fundamental concepts. Then we begin our main investigation with a focus on topic (1): we consider the problem of causal inference from a non-experimental multivariate time series X, that is, we integrate temporal knowledge. We take the following approach: We assume that X together with some potential hidden common cause - "confounder" - Z forms a first-order vector autoregressive (VAR) process with structural transition matrix A. Then we examine under which conditions the most important parts of A are identifiable or approximately identifiable from only X, in spite of the effects of Z. Essentially, sufficient conditions are (a) non-Gaussian, independent noise or (b) no influence from X to Z. We present two estimation algorithms that are tailored towards conditions (a) and (b), respectively, and evaluate them on synthetic and real-world data. We discuss how to check the model using X.
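The VAR-with-hidden-confounder setting described above can be made concrete with a toy simulation. The sketch below is illustrative only: the dimensions, matrix entries, and the naive least-squares baseline are made up and are not the thesis's estimation algorithms.

```python
# Illustrative sketch: simulate a first-order VAR process over observed X and
# a hidden confounder Z, then show that a naive least-squares fit of X_t on
# X_{t-1} recovers the X-block of the transition matrix A only approximately.
import numpy as np

rng = np.random.default_rng(0)
dx, dz, T = 2, 1, 20000

# Full transition matrix over (X, Z); the last column lets Z drive X
# (confounding), while the last row lets Z evolve autonomously (condition (b)).
A = np.array([[0.5, 0.1, 0.4],
              [0.0, 0.4, 0.4],
              [0.0, 0.0, 0.6]])

s = np.zeros((T, dx + dz))
for t in range(1, T):
    s[t] = A @ s[t - 1] + rng.laplace(size=dx + dz)  # non-Gaussian noise (condition (a))

X = s[:, :dx]                                        # only X is observed
# Naive OLS baseline: regress X_t on X_{t-1}, ignoring Z entirely.
A_hat, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)
print("true X-block of A:\n", A[:dx, :dx])
print("naive estimate:\n", A_hat.T)
```

Comparing the two printed matrices shows the bias the hidden confounder induces in the naive estimate, which is exactly what the thesis's tailored estimators are designed to overcome.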
Still focusing on topic (1) but already including elements of topic (2), we consider the problem of approximate inference of the causal effect of a variable X on a variable Y in i.i.d. settings "between" randomized experiments and observational studies. Our approach is to first derive approximations (upper/lower bounds) on the causal effect, in dependence on bounds on (hidden) confounding. Then we discuss several scenarios where knowledge or beliefs can be integrated that in fact imply bounds on confounding. One example is about decision making in advertisement, where knowledge on partial compliance with guidelines can be integrated. Then, concentrating on topic (2), we study decision making problems that arise in cloud computing, a computing paradigm and business model that involves complex technical and economical systems and interactions. More specifically, we consider the following two problems: debugging and control of computing systems with the help of sandbox experiments, and prediction of the cost of "spot" resources for decision making of cloud clients. We first establish two theoretical results on approximate counterfactuals and approximate integration of causal knowledge, which we then apply to the two problems in toy scenarios.
Item Open Access Änderungstolerante Serialisierung großer Datensätze für mehrsprachige Programmanalysen ("Change-tolerant serialization of large data sets for multi-language program analyses", 2017) Felden, Timm; Plödereder, Erhard (Prof. Dr.)
Item Open Access Structurally informed methods for improved sentiment analysis (2017) Kessler, Stefanie Wiltrud; Kuhn, Jonas (Prof. Dr.)
Sentiment analysis deals with methods to automatically analyze opinions in natural language texts, e.g., product reviews. Such reviews contain a large number of fine-grained opinions, but to automatically extract detailed information it is necessary to handle a wide variety of verbalizations of opinions.
The goal of this thesis is to develop robust structurally informed models for sentiment analysis which address challenges that arise from structurally complex verbalizations of opinions. In this thesis, we look at two examples of such verbalizations that benefit from including structural information in the analysis: negation and comparisons. Negation directly influences the polarity of sentiment expressions, e.g., while "good" is positive, "not good" expresses a negative opinion. We propose a machine learning approach that uses information from dependency parse trees to determine whether a sentiment word is in the scope of a negation expression. Comparisons like "X is better than Y" are the main topic of this thesis. We present a machine learning system for the task of detecting the individual components of comparisons: the anchor or predicate of the comparison, the entities that are compared, which aspect they are compared in, and which entity is preferred. Again, we use structural context from a dependency parse tree to improve the performance of our system. We discuss two ways of addressing the issue of limited availability of training data for our system. First, we create a manually annotated corpus of comparisons in product reviews, the largest such resource available to date. Second, we use the semi-supervised method of structural alignment to expand a small seed set of labeled sentences with similar sentences from a large set of unlabeled sentences. Finally, we work on the task of producing a ranked list of products that complements the isolated prediction of ratings and supports the user in a process of decision making. We demonstrate how we can use the information from comparisons to rank products and evaluate the result against two conceptually different external gold standard rankings.
Item Open Access Visual analytics of human mobility behavior (2017) Krüger, Robert; Ertl, Thomas (Prof. Dr.)
Human mobility plays an important role in many domains of today’s society, such as security, logistics, transportation, urban planning, and geo-marketing. Both government and industry thus have a great interest in understanding mobility patterns and their driving social, economical, and environmental causes and effects. While stakeholders had to rely on manual traffic surveys for a long time, improvements in tracking technology made analyses based on large digital datasets possible. Recently, the omnipresence of mobile devices significantly increased the amounts of collected movement and context data. People are willing to reveal not only their position but also further personal details such as visited places, observations, events, news, and sentiments in exchange for personalized services and social networking. This opens up new possibilities for many domains where a semantic mobility understanding is required but also raises major challenges. To reveal a holistic picture, heterogeneous datasets of different services with different resolution and format have to be fused and analyzed. However, social sensing data is vast, has varying scale, is unevenly distributed, and constantly updated. Especially content from social media services is often inconsistent, unreliable, and incomplete, which requires special treatment. Fully automatic mapping approaches are not trustworthy as they do not take into account these uncertainties. At the same time, manual approaches become insufficient with large amounts of data. Even when data is perfectly aligned, analysts cannot purely rely on existing techniques. Answering questions about reasons for movement requires a broader perspective that takes into account environmental and social context, the driving forces for human mobility behavior. Visual analytics is an emerging research field to tackle such challenges.
It creates added value by combining the processing power and accuracy of machines with human capabilities to perceive information visually. Automatic means are used to fuse and aggregate data and to detect hidden patterns therein. Interactive visualizations allow analysts to explore and query the data and to steer the automatic processes with domain knowledge. This increases trust in data, models, and results, which is especially important when critical decisions need to be made. The strengths of visual analytics have been shown to be particularly advantageous when problems and goals are underspecified and exploratory means are needed to discover yet unknown patterns. This thesis presents novel visual analytics approaches to derive meaning and reasons behind movement, by taking into account the aforementioned characteristics. The approaches are aligned in a holistic process model covering all steps from data retrieval, enrichment, exploration, and verification to externalization of gained knowledge for various fields of application such as electric mobility, event management, and law enforcement. It is shown how data from social media can not only be used to retrieve up-to-date movement information, but also to enrich movement trajectories from other sources with structured and unstructured information about places, events, transactions, and other observations. Through highly interactive visual interfaces, analysts can bring in domain knowledge to deal with uncertainties during data fusion and to steer the subsequent semantic analysis. Exploratory and confirmatory analysis techniques are presented to create hypotheses, refine them, and find support in the data. Analysts can discover routines and abnormal behavior with assistance of automatic pattern detection methods to cope with the vast amounts of data.
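The idea of assisting analysts with automatic pattern detection can be illustrated with a deliberately simple sketch. Everything below (grid size, threshold, names) is a hypothetical toy, far simpler than the thesis's methods: a routine model built from grid-cell visit counts flags positions in rarely visited cells as candidates for abnormal behavior.

```python
# Toy routine-vs-abnormal detection on movement data (illustrative only):
# grid-cell visit counts learned from past trajectories form a routine model;
# positions falling into rarely visited cells are flagged for inspection.
from collections import Counter

def to_cell(pos, size=1.0):
    """Map a 2D position to a discrete grid cell."""
    return (int(pos[0] // size), int(pos[1] // size))

def routine_model(trajectories):
    """Count how often each cell was visited in historical trajectories."""
    return Counter(to_cell(p) for t in trajectories for p in t)

def abnormal(trajectory, model, min_visits=2):
    """Return positions lying in cells visited fewer than min_visits times."""
    return [p for p in trajectory if model[to_cell(p)] < min_visits]

past = [[(0.2, 0.1), (1.1, 0.3), (2.4, 0.2)],
        [(0.4, 0.6), (1.5, 0.2), (2.2, 0.8)]]
model = routine_model(past)
print(abnormal([(0.5, 0.5), (5.0, 5.0)], model))   # the detour through (5.0, 5.0) is flagged
```

In a visual analytics setting, such flagged positions would not be reported automatically but highlighted for the analyst to confirm or dismiss with domain knowledge.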
Spatial drill-down is supported by a set-based focus+context technique, while a more abstract visual query language allows analysts to explicitly formulate, extract, and query for movement patterns. The approaches are applied in different scenarios and are integrated in a visual analytics system. Evaluation with experts and novice users, case studies, and comparisons to ground truth data reveal the need for and effectiveness of the contributions. Overall, the thesis contributes a visual analytics process for human mobility behavior with novel semantic analysis approaches, ranging from global movements of many to local activities of a few people, for a wide range of application domains.
Item Open Access Efficient code offloading techniques for mobile applications (2017) Berg, Florian; Rothermel, Kurt (Prof. Dr. rer. nat. Dr. h. c.)
Since the release of the first smart phone from Apple in the year 2007, smart phones have experienced rapidly rising popularity. A smart phone typically possesses, among others, a touchscreen display as user interface, mobile communication for accessing the Internet, and a System-on-a-Chip as an integrated circuit of required components like a central processing unit. This pervasive computing platform derives its power from a battery, and an end user runs different kinds of applications on it, like a calendar application or a high-end mobile game. Applications differ in their usage of the local resources of a battery-operated smart phone; a heavy utilization of local resources, like playing a resource-demanding application, drains the limited resource of energy in a few hours. Despite the constant increase of memory, communication, and processing capabilities of smart phones since their release in 2007, applications are also getting more and more sophisticated and demanding. As a result, the energy consumed on a smart phone was, still is, and will be its main limiting factor.
To prevent the limited resource of energy from quick exhaustion, researchers propose code offloading for (resource-constrained) mobile devices like smart phones. Code offloading strives to increase the energy efficiency and execution speed of applications by utilizing a server instance in the infrastructure. To this end, a code offloading approach dynamically executes resource-intensive parts of an application on powerful remote servers in the infrastructure on behalf of a (resource-constrained) mobile device. During the remote execution of a resource-intensive application part on a remote server, a mobile device only waits in idle mode until it receives the result of the application part executed remotely. Instead of executing an application part on its local resources, a (resource-constrained) mobile device thus benefits from the more powerful resources of a remote server by sending the information required for a remote execution, waiting in idle mode, and receiving the result of the remote execution. The process of offloading code from a (resource-constrained) mobile device to a powerful remote server in the infrastructure, however, faces different problems. For instance, code offloading introduces some overhead for additional computation and communication on a mobile device. Moreover, spontaneous disconnections during a remote execution can cause a higher energy consumption and execution time than a local execution on a mobile device without code offloading. To this end, this dissertation addresses the whole process of offloading code from a mobile device not only to one but also to multiple remote resources, comprising the following steps: 1) First, code offloading has to identify feasible parts of an application for a remote execution, where the distributed execution of the identified application part is more beneficial than its local execution.
A feasible part for a remote execution typically has the following properties: a small amount of data to transmit before the remote execution, a resource-intensive computation that does not access local sensors, and a small amount of data to transmit after the remote execution. In the area of identification of application parts for a remote execution, this dissertation presents an approach based on code annotations from application developers that automatically transforms a monolithic execution on a mobile device into a distributed execution on multiple heterogeneous resources. In contrast to related approaches in the literature, the annotation-based approach requires the least intervention from application developers and end users, keeping the overhead introduced on a mobile device low. 2) For an application part identified for a remote execution, code offloading has to determine its execution side, executing the application part either on the local resources of a mobile device or on a remote resource in the infrastructure. In the area of determining the execution side for an application part, this dissertation presents the offloading problem, where a mobile device decides whether to execute an application part locally or remotely. Furthermore, this dissertation also presents an approach called "code bubbling" that shifts the decision making into the infrastructure. In contrast to related approaches in the literature, the decision-based approach on a mobile device and the bubbling-based approach minimize the execution time, energy consumption, and monetary cost for an application. 3) To determine the execution side for an application part identified for a remote execution, code offloading has to obtain different parameters from the application, participating resources, and utilized links.
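The local-versus-remote decision of step 2 is often illustrated with a simple cost model. The sketch below is such a toy model: all constants and the energy formulas are illustrative assumptions, not measurements or algorithms from the dissertation.

```python
# Toy offload decision: offload an application part iff the estimated remote
# cost (transmit input, wait idle during remote execution, receive output)
# beats the estimated local cost. All parameter values are made up.
def local_cost(cycles, cpu_speed_hz, cpu_power_w):
    """Energy (J) and time (s) of executing `cycles` locally."""
    t = cycles / cpu_speed_hz
    return cpu_power_w * t, t

def remote_cost(cycles, in_bytes, out_bytes, bandwidth_bps,
                tx_power_w, idle_power_w, server_speed_hz):
    """Energy (J) and time (s) of offloading the same computation."""
    t_comm = 8 * (in_bytes + out_bytes) / bandwidth_bps   # send + receive
    t_exec = cycles / server_speed_hz                     # device idles meanwhile
    energy = tx_power_w * t_comm + idle_power_w * t_exec
    return energy, t_comm + t_exec

def should_offload(cycles, in_bytes, out_bytes, **env):
    e_loc, _ = local_cost(cycles, env["cpu_speed_hz"], env["cpu_power_w"])
    e_rem, _ = remote_cost(cycles, in_bytes, out_bytes, env["bandwidth_bps"],
                           env["tx_power_w"], env["idle_power_w"],
                           env["server_speed_hz"])
    return e_rem < e_loc

env = dict(cpu_speed_hz=1e9, cpu_power_w=0.9, bandwidth_bps=10e6,
           tx_power_w=1.3, idle_power_w=0.3, server_speed_hz=10e9)
# Heavy computation with little data to transfer: offloading pays off.
print(should_offload(cycles=5e10, in_bytes=1e4, out_bytes=1e4, **env))
```

The same model also shows the opposite case: a cheap computation over a large input is better kept local, since transmission alone would dominate the cost.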
In the area of obtaining the information required from an application, this dissertation presents a bit-flipping approach that dynamically flips a bit upon modification of application-related information. Furthermore, this dissertation also presents an offload-aware Application Programming Interface (API) that encapsulates the application-related information required for code offloading. In contrast to related approaches in the literature, the bit-flipping approach and the offload-aware API provide an efficient gathering of information at run-time, keeping the overhead introduced on a mobile device low. 4) Besides the information from an application, code offloading has to obtain further information from participating resources and utilized links. In the area of obtaining the information required from participating resources and utilized links, this dissertation presents the approach of code bubbling, already mentioned above. In contrast to related approaches in the literature, the bubbling-based approach makes the offload decision at the place where the related information occurs, keeping the overhead introduced on a mobile device, participating resources, and utilized links low. 5) In case of a remote execution of an application part, code offloading has to send the information required for a remote execution to the remote resource that subsequently executes the application part on behalf of the mobile device. In the area of sending the required information and executing an application part remotely, this dissertation presents code offloading with a cache on the remote side. The cache on the remote side serves as a collective storage of results for already executed application parts, avoiding a repeated execution of previously run application parts. In contrast to related approaches in the literature, the caching-aware approach increases the efficiency of code offloading, keeping the energy consumption, execution time, and monetary cost low.
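The remote-side cache of step 5 can be sketched as a memoization table keyed by the application part and its inputs. This is an illustration only; the class and function names below are hypothetical and the dissertation's cache design is more involved.

```python
# Minimal remote-side result cache: results of pure, already executed
# application parts are keyed by function name plus an argument fingerprint,
# so a repeated offload of the same part is answered from the cache.
import hashlib
import pickle

class OffloadCache:
    def __init__(self):
        self._store = {}
        self.hits = 0

    def _key(self, fn_name, args, kwargs):
        blob = pickle.dumps((fn_name, args, sorted(kwargs.items())))
        return hashlib.sha256(blob).hexdigest()

    def execute(self, fn, *args, **kwargs):
        key = self._key(fn.__name__, args, kwargs)
        if key in self._store:
            self.hits += 1                     # repeated execution avoided
        else:
            self._store[key] = fn(*args, **kwargs)
        return self._store[key]

cache = OffloadCache()
cache.execute(pow, 2, 100)   # computed on first request
cache.execute(pow, 2, 100)   # identical request served from the cache
print(cache.hits)            # 1
```

Note the implicit assumption that cached parts are deterministic and side-effect free; caching a part that reads local sensors would return stale results.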
6) While a remote resource executes an application part, code offloading has to handle the occurrence of failures like a failure of the remote resource or a disconnection. In the area of handling the occurrence of failures, this dissertation presents a preemptable offloading of code with safe-points. The preemptable offloading of code with safe-points enables an interruption of an offloading process and a corresponding continuation of a remote execution on a mobile device, without abandoning the complete result calculated remotely so far. Based on a preemptable offloading of code with safe-points, this dissertation further presents a predictive offloading of code with safe-points that minimizes the overhead introduced by safe-pointing and maximizes the efficiency of a deadline-aware offloading. In contrast to related approaches in the literature, the preemptable approach with safe-pointing increases the robustness of code offloading in case of failures. Furthermore, the predictive approach for safe-pointing ensures minimal response times and maximal efficiency of applications despite failures. 7) At the end of a remote execution of an application part, code offloading has to gather on the remote resource the required information after the execution and send this information to the mobile device. In the area of gathering the required information, a remote resource utilizes the same approaches as a mobile device, already mentioned above (cf. the bit-flipping approach and the offload-aware API). 8) Last, code offloading has to receive on the mobile device the information from a remote resource, install the information on the mobile device, and continue the execution of the application on the mobile device. In the area of installing the information and continuing the execution locally, a mobile device utilizes the approaches already mentioned above (cf.
the bit-flipping approach and the offload-aware API).
Item Open Access Workload mix definition for benchmarking BPMN 2.0 Workflow Management Systems (2017) Skouradaki, Marigianna; Leymann, Frank (Prof. Dr. Dr. h. c.)
Nowadays, enterprises broadly use Workflow Management Systems (WfMSs) to design, deploy, execute, monitor and analyse their automated business processes. Through the years, WfMSs evolved into platforms that deliver complex service oriented applications. In this regard, they need to satisfy enterprise-grade performance requirements, such as dependability and scalability. With the ever-growing number of WfMSs that are currently available in the market, companies are called to choose which product is optimal for their requirements and business models. Benchmarking is an established practice used to compare alternative products and leverages the continuous improvement of technology by setting a clear target in measuring and assessing performance. In particular, for service oriented WfMSs there is not yet a widely accepted standard benchmark available, even if workflow modelling languages such as Web Services Business Process Execution Language (WS-BPEL) and Business Process Model and Notation 2.0 (BPMN 2.0) have been adopted as the de-facto standards. A possible explanation for this deficiency is the inherent architectural complexity of WfMSs and the very large number of parameters affecting their performance. However, the need for a standard benchmark for WfMSs is frequently affirmed in the literature. The goal of the BenchFlow approach is to propose a framework towards the first standard benchmark for assessing and comparing the performance of BPMN 2.0 WfMSs. To this end, the approach addresses a set of challenges spanning from logistic challenges, that are related to the collection of a representative set of usage scenarios, to technical challenges, that concern the specific characteristics of a WfMS.
This work focuses on a subset of these challenges dealing with the definition of a representative set of process models and corresponding data that will be given as an input to the benchmark. This set of representative process models and corresponding data is referred to as the workload mix of the benchmark. More particularly, we first prepare the theoretical background for defining a representative workload mix. This is accomplished through the identification of the basic components of a workload model for WfMS benchmarks, as well as the investigation of the impact of the BPMN 2.0 language constructs on the WfMS's performance, by means of introducing the first BPMN 2.0 micro-benchmark. We proceed by collecting real-world process models for the identification of a representative workload mix. The collection is then analysed with respect to its statistical characteristics and also with a novel algorithm that detects and extracts the reoccurring structural patterns of the collection. The extracted reoccurring structures are then used for generating synthetic process models that reflect the essence of the original collection. The introduced methods are brought together in a tool chain that supports the workload mix generation. As a final step, we applied the proposed methods in a real-world case study that is based on a collection of thousands of real-world process models and generates a representative workload mix to be used in a benchmark. The results show that the generated workload mix is successful in its application for stressing the WfMSs under test.
Item Open Access Improvement of hardware reliability with aging monitors (2017) Liu, Chang; Wunderlich, Hans-Joachim (Prof. Dr.)
Item Open Access Issues on distributed caching of spatial data (2017) Lübbe, Carlos; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
The amount of digital information about places has grown rapidly to date.
With the spread of mobile, Internet-enabled devices, this information can now be accessed at any time and from anywhere. In the course of this development, numerous location-based applications and services have become popular; digital shopping assistants, tourist information services, and geo-social applications rank among the most popular representatives. Rising user numbers as well as rapidly growing data volumes pose serious challenges for providers of location-based information. The data provisioning process must be designed efficiently to enable cost-efficient operation. In addition, it should be possible to assign resources flexibly enough to compensate for load imbalances between system components. Moreover, data providers must be able to scale their processing capacity with rising and falling request load. In this work, we present a distributed cache for location-based data. In the distributed cache, replicas of the most frequently used data are held in the volatile memory of several independent servers. With our approach, the challenges for providers of location-based information can be addressed as follows: First, a caching strategy designed specifically for the access patterns of location-based applications increases overall efficiency, since a considerable share of the cached results of previous queries can be reused. Furthermore, our load balancing techniques, developed specifically for the geographic context, compensate for dynamic load imbalances. Finally, our distributed protocols for adding and removing servers enable providers of location-based information to adapt their processing capacity to rising or falling request load.
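Protocols for adding and removing cache servers are commonly built on consistent hashing, which keeps most key-to-server assignments stable when the server set changes. The sketch below is a generic illustration under that assumption; the thesis's protocols are not necessarily hash-based, and all names are hypothetical.

```python
# Toy consistent-hash ring: spatial keys map to cache servers via hashed
# virtual nodes, so adding or removing a server only remaps the keys that
# belonged to that server's ring segments.
import bisect
import hashlib

def h(s):
    """Deterministic integer hash of a string."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, servers, vnodes=64):
        # Each server owns `vnodes` points on the ring for smoother balance.
        self._ring = sorted((h(f"{s}#{i}"), s)
                            for s in servers for i in range(vnodes))

    def server_for(self, key):
        # Walk clockwise to the next virtual node (linear rebuild of the key
        # list per lookup keeps the sketch short).
        hashes = [x for x, _ in self._ring]
        i = bisect.bisect(hashes, h(key)) % len(self._ring)
        return self._ring[i][1]

ring = Ring(["cache-a", "cache-b", "cache-c"])
print(ring.server_for("tile:48.77,9.18"))
```

The useful property for elastic capacity is that removing one server leaves every key that was served by the remaining servers on its old server, so only the departed server's share of the cache has to be rebuilt.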
In this document, we first examine the requirements of data provisioning in the context of location-based applications. We then discuss possible design patterns and derive an architecture for a distributed cache. In the course of this work, several concrete implementation variants emerged, which we present and compare in this document. Our evaluation shows not only the basic feasibility but also the effectiveness of our caching approach for achieving scalability and availability in the provisioning of location-based data.
Item Open Access Visualization of two-phase flow dynamics: techniques for droplet interactions, interfaces, and material transport (2017) Karch, Grzegorz Karol; Ertl, Thomas (Prof. Dr.)
Computational visualization allows scientists and engineers to better understand simulation data and gain insights into the studied natural processes. Particularly in the field of computational fluid dynamics, interactive visual presentation is essential in the investigation of physical phenomena related to gases and liquids. To ensure effective analysis, flow visualization techniques must adapt to the advancements in the field of fluid dynamics, which benefits substantially from the growing computational power of both commodity desktops and supercomputers on the one hand, and steadily expanding knowledge about fluid physics on the other. A prominent example of these advances can be found in the research of two-phase flow with liquid droplets and jets, where high performance computation and sophisticated algorithms for phase tracking enable well resolved and physically accurate simulations of liquid dynamics. Yet, the field of two-phase flow has remained largely unexplored in visualization research so far, leaving scientists and engineers with a number of challenges when analyzing the data.
These include the difficulty in tracking and investigating topological events in large droplet groups, the high complexity of droplet dynamics due to the involved interfaces, and a limited choice of high quality interactive methods for the analysis of related transport phenomena. It is therefore the aim of this thesis to address these challenges by providing a multi-scale approach for the visual investigation of two-phase flow, with the focus on the analysis of droplet interaction, fluid interfaces, and material transport. To address the problem of analyzing highly complex two-phase flow simulations with droplet groups and jets, a linked-view approach with three-dimensional and abstract space-time graph representations of droplet dynamics is proposed. The interactive brushing and linking allows for general exploration of topological events as well as detailed inspection of dynamics in terms of oscillations and rotations of droplets. Another approach further examines the separation of liquid phases by segmenting liquid volumes according to their topological changes in future time. For visualization, boundary surfaces of these volume segments are extracted that reveal intricate details of droplet topology dynamics. Additionally, within this framework, visualization of advected particles corresponding to an arbitrarily selected segment provides useful insights into the spatio-temporal evolution of the segment. The analysis of interfaces is necessary to understand the interplay of interface dynamics and the dynamics of droplet interactions. A commonly used technique for interface tracking in volume-of-fluid-based simulations is the piecewise linear approximation which, although accurate, can affect the quality of the simulation results.
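The piecewise linear approximation mentioned above represents the interface in each cell as a plane whose normal can be estimated from the gradient of the volume fraction field (a Youngs-style estimate). The sketch below is a generic illustration of that idea, not the reconstruction code analyzed in the thesis.

```python
# Estimate per-cell interface normals from a 2D volume-of-fluid field:
# the unit normal points along -grad(alpha), i.e. from liquid into gas.
import numpy as np

def interface_normals(alpha, h=1.0):
    """Unit normals -grad(alpha)/|grad(alpha)| via central differences."""
    gy, gx = np.gradient(alpha, h)          # axis 0 = y, axis 1 = x
    g = np.stack([gx, gy], axis=-1)
    norm = np.linalg.norm(g, axis=-1, keepdims=True)
    norm = np.where(norm < 1e-12, 1.0, norm)  # avoid division by zero in bulk cells
    return -g / norm

# Volume fraction of a half-space on a 5x5 grid: liquid fills the left side.
x = np.arange(5)
alpha = np.clip(2.5 - x, 0, 1)[None, :].repeat(5, axis=0)
n = interface_normals(alpha)
print(n[2, 2])   # normal in the interface cell points in +x, toward the gas
```

In a PLIC reconstruction this normal would then be combined with the cell's volume fraction to position the interface plane, which is exactly the step whose quality the thesis's visualization method helps assess.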
To study the influence of the interface reconstruction on the phase tracking procedure, a visualization method is presented that extracts the interfaces by means of the first-order Taylor approximation, and provides several derived quantities that help assess the simulation results in relation to the interface reconstruction quality. The liquid interface is further investigated from the physical standpoint with an approach based on quantities derived from velocity and surface tension gradients. The developed method supports examination of surface tension forces and their impact on the interface instability, as well as detailed analysis of interface deformation characteristics. A line of research important for engineering applications is the analysis of electric fields on droplet interfaces. It is, however, complicated by higher-order elements used in the simulations to preserve field discontinuities. A visualization method has been developed that correctly visualizes these discontinuities at material boundaries. Additionally, the employed space-time representation of the droplet-insulator contact line reveals characteristics of electric field dynamics. The dynamics of droplets are often examined assuming single-phase flow, for instance when the internal material transport is of interest. From the visualization perspective, this allows for the adaptation of traditional vector field visualization techniques to the investigation of the studied phenomena. As one such concept, dye-based visualization is proposed that extends the transport analysis to advection-diffusion problems, therefore revealing true transport behavior. The employed high quality advection preserves fine details of the dye, while the implementation on graphics processing units ensures interactive visualization. Several streamline-based concepts are applied in the space-time representation of 2D unsteady flow.
By interpreting time as the third spatial dimension, many 3D streamline-based visualization techniques can be applied to investigate 2D unsteady flow. The introduced vortex core ribbons support the examination of vortical flow behavior by revealing rotation near the core lines. For the study of topological structures, a method has been developed that extracts separatrices implicitly as boundaries of regions with different flow behavior, and therefore avoids the potentially complicated explicit extraction of various topological structures. All proposed techniques constitute a novel multi-scale approach for visual analysis of two-phase flow. The analysis of droplet interactions is addressed with visualization of the phenomena leading to breakups and with detailed visual inspection of these breakups. On the interface level, techniques for interface analysis give insights into the simulation quality, the mechanisms behind topology changes, as well as the behavior of electrically charged droplets. Further down the scale, the dye-based visualization, streamline-based concepts for space-time analysis, and the implicit extraction of flow topology allow for the investigation of droplet internal transport as well as general single-phase flow scenarios. The applicability of the proposed methods extends, to a varying degree, beyond the use in two-phase flow. Their usability is demonstrated on data from simulations based on the Navier-Stokes equations that exemplify practical problems in the research of fluid dynamics.
Item Open Access PDE-based vs. variational methods for perspective shape from shading (2017) Ju, Yong Chul; Bruhn, Andrés (Prof. Dr.)