05 Fakultät Informatik, Elektrotechnik und Informationstechnik
Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6
Item Open Access Eine Methode zum Verteilen, Adaptieren und Deployment partnerübergreifender Anwendungen (2022) Wild, Karoline; Leymann, Frank (Prof. Dr. Dr. h. c.)
A key aspect of effective collaboration within organizations, and above all across organizations, is the integration and automation of processes. This includes the provisioning of application systems whose components are provided and managed by different partners, that is, departments or companies. The resulting distributed, decentrally managed environment requires new provisioning concepts. The autonomy of the partners and the distribution of the components lead to new challenges: cross-partner communication relations must be realized, and automated decentralized deployment must be enabled. In recent years, a multitude of technologies has been developed that cover all steps from modeling through provisioning to runtime management of an application. However, these technologies are based on centralized coordination of the deployment, which restricts the autonomy of the partners. Concepts are also lacking for identifying problems that result from the distribution of application components and impair the functionality of the application; this particularly concerns cross-partner communication relations. To address these challenges, this thesis presents the DivA method for distributing, adapting, and deploying cross-partner applications. The method unifies the global and local partner activities required to provision cross-partner applications.
The method builds on the declarative Essential Deployment Meta Model (EDMM) and thereby enables deployment-technology-independent modeling concepts for distributing application components as well as for model analysis and adaptation. The split-and-match method is presented for distributing application components based on specified target environments and for selecting compatible cloud services. To execute the deployment, EDMM models can be transformed into different technologies. To carry out the provisioning in a completely decentralized fashion, declarative and imperative technologies are combined: based on the declarative EDMM models, workflows are generated that orchestrate the provisioning activities and the data exchange with other partners needed to realize cross-partner communication relations. These workflows implicitly form a deployment choreography. As the core of this thesis, a two-step pattern-based method for problem detection and model adaptation is introduced for model analysis and adaptation. For this purpose, the problem and context definitions are extracted from the textual pattern descriptions and formalized to enable the automated identification of problems in EDMM models. Particular focus is placed on problems that arise from the distribution of components and prevent the realization of communication relations. The same method is also applied to select suitable concrete solution implementations for resolving the detected problems. In addition, an approach for selecting communication drivers depending on the integration middleware in use is presented, which improves the portability of application components. The concepts presented in this thesis are automated by the DivA tool.
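The partner-aware declarative modeling described above can be illustrated with a minimal sketch in Python; the component names, property keys, and the two helper functions are hypothetical illustrations, not part of EDMM or the DivA tool:

```python
# Hypothetical EDMM-style deployment model: components with a managing
# partner, hosting stacks, and one cross-partner communication relation.
MODEL = {
    "components": {
        "order_app":  {"type": "web_app",    "partner": "partnerA",
                       "hosted_on": "tomcat_a"},
        "tomcat_a":   {"type": "web_server", "partner": "partnerA",
                       "hosted_on": None},
        "billing_db": {"type": "database",   "partner": "partnerB",
                       "hosted_on": None},
    },
    # Relations are (source, type, target) triples.
    "relations": [("order_app", "connects_to", "billing_db")],
}

def split_by_partner(model):
    """Group components by the partner that provides and manages them."""
    groups = {}
    for name, comp in model["components"].items():
        groups.setdefault(comp["partner"], []).append(name)
    return groups

def cross_partner_relations(model):
    """Relations whose endpoints are managed by different partners -
    exactly the relations the generated workflows must realize."""
    comps = model["components"]
    return [rel for rel in model["relations"]
            if comps[rel[0]]["partner"] != comps[rel[2]]["partner"]]
```

Splitting such a model per partner and flagging the cross-partner relations is the kind of analysis that must precede decentralized deployment, since each partner only provisions its own group of components.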
For validation, the tool is implemented as a prototype and integrated into existing systems for modeling and executing the deployment of application systems.

Item Open Access Elastic parallel systems for high performance cloud computing (2020) Kehrer, Stefan; Blochinger, Wolfgang (Prof. Dr.)
High Performance Computing (HPC) enables significant progress in both science and industry. Whereas traditionally parallel applications have been developed to address the grand challenges in science, today they are also heavily used to speed up the time-to-result in the context of product design, production planning, financial risk management, medical diagnosis, as well as research and development efforts. However, purchasing and operating HPC clusters to run these applications requires huge capital expenditures as well as operational knowledge and is thus reserved for large organizations that benefit from economies of scale. More recently, the cloud has evolved into an alternative execution environment for parallel applications, which comes with novel characteristics such as on-demand access to compute resources, pay-per-use, and elasticity. Whereas the cloud has mainly been used to operate interactive multi-tier applications, HPC users are also interested in the benefits it offers. These include full control of the resource configuration based on virtualization, fast setup times by using on-demand accessible compute resources, and the elimination of upfront capital expenditures due to the pay-per-use billing model. Additionally, elasticity allows compute resources to be provisioned and decommissioned at runtime, which enables fine-grained control of an application's performance in terms of its execution time and efficiency as well as the related monetary costs of the computation.
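Elasticity control of this kind can be sketched as a simple reactive scaling rule; the thresholds, bounds, and function below are illustrative assumptions, not the controllers developed in the thesis:

```python
def scale_decision(current_units, observed_runtime, target_runtime,
                   min_units=1, max_units=64):
    """Reactive elasticity rule: add processing units while the run is
    slower than the user-defined goal, release them when it is
    comfortably faster, and hold steady otherwise."""
    if observed_runtime > 1.2 * target_runtime:
        return min(current_units * 2, max_units)   # scale out
    if observed_runtime < 0.5 * target_runtime:
        return max(current_units // 2, min_units)  # scale in
    return current_units                           # goal met
```

A real controller would also weigh the monetary cost of additional units against the runtime gain, and a proactive variant would predict load instead of reacting to it.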
Whereas HPC-optimized cloud environments have been introduced by cloud providers such as Amazon Web Services (AWS) and Microsoft Azure, existing parallel architectures are not designed to make use of elasticity. This thesis addresses several challenges in the emerging field of High Performance Cloud Computing. In particular, the presented contributions focus on the novel opportunities and challenges related to elasticity. First, the principles of elastic parallel systems as well as related design considerations are discussed in detail. On this basis, two exemplary elastic parallel system architectures are presented, each of which includes (1) an elasticity controller that controls the number of processing units based on user-defined goals, (2) a cloud-aware parallel execution model that handles coordination and synchronization requirements in an automated manner, and (3) a programming abstraction to ease the implementation of elastic parallel applications. To automate application delivery and deployment, novel approaches are presented that generate the required deployment artifacts from developer-provided source code in an automated manner while considering application-specific non-functional requirements. Throughout this thesis, a broad spectrum of design decisions related to the construction of elastic parallel system architectures is discussed, including proactive and reactive elasticity control mechanisms as well as cloud-based parallel processing with virtual machines (Infrastructure as a Service) and functions (Function as a Service). To evaluate these contributions, extensive experimental evaluations are presented.

Item Open Access Models for data-efficient reinforcement learning on real-world applications (2021) Dörr, Andreas; Toussaint, Marc (Prof. Dr.)
Large-scale deep Reinforcement Learning is strongly contributing to many recently published success stories of Artificial Intelligence.
These techniques enabled computer systems to autonomously learn and master challenging problems, such as playing the game of Go or complex strategy games such as StarCraft at human level or above. Naturally, the question arises as to which problems could be addressed with these Reinforcement Learning technologies in industrial applications. So far, machine learning technologies based on (semi-)supervised learning create the most visible impact in industrial applications. For example, image, video, and text understanding are primarily dominated by models trained and derived autonomously from large-scale data sets with modern (deep) machine learning methods. Reinforcement Learning, in contrast, deals with temporal decision-making problems and is much less commonly found in the industrial context. In these problems, current decisions and actions inevitably influence the outcome and success of a process much further down the road. This work strives to address some of the core problems that prevent the effective use of Reinforcement Learning in industrial settings. Autonomous learning of new skills is always guided by existing priors that allow for generalization from previous experience. In some scenarios, non-existing or uninformative prior knowledge can be mitigated by vast amounts of experience for the particular task at hand. Typical industrial processes are, however, operated at very restricted, tightly calibrated operating points. Naively exploring the space of possible actions or changes to the process in search of improved performance tends to be costly or even prohibitively dangerous. Therefore, one recurring subject throughout this work is the use of priors and model structures that allow for efficient use of all available experience data. A promising direction is Model-Based Reinforcement Learning, which is explored in the first part of this work.
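The model-based idea, learning a dynamics model from experience and then choosing actions against that model, can be sketched on a toy one-dimensional system; the linear dynamics, the gain-fitting step, and the one-step planner are illustrative assumptions, far simpler than the probabilistic models used in the thesis:

```python
def fit_action_gain(transitions):
    """Estimate the gain k in the toy dynamics s' = s + k*a from
    observed (state, action, next_state) tuples via least squares -
    an illustrative stand-in for learning a dynamics model."""
    num = sum(a * (s_next - s) for s, a, s_next in transitions)
    den = sum(a * a for _, a, _ in transitions)
    return num / den

def plan_action(state, goal, k):
    """One-step model-based 'planning': invert the learned model to
    pick the action that should reach the goal state."""
    return (goal - state) / k

# Experience collected from the true system s' = s + 2*a.
data = [(s, a, s + 2 * a) for s in range(5) for a in (-1, 1, 2)]
k = fit_action_gain(data)
```

Because the model is learned from all available transitions, every past interaction improves the planner, which is exactly the data-efficiency argument for model-based methods.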
This part derives an automatic tuning method for one of the most common industrial control architectures, the PID controller. By leveraging all available data about the system's behavior to learn a system dynamics model, the derived method can efficiently tune these controllers from scratch. Although we can easily incorporate all data into dynamics models, real systems expose additional problems to the dynamics modeling and learning task. Characteristics such as non-Gaussian noise, latent states, feedback control, or non-i.i.d. data regularly prevent the use of off-the-shelf modeling tools. Therefore, the second part of this work is concerned with the derivation of modeling solutions that are particularly suited for the reinforcement learning problem. Despite the predominant focus on model-based reinforcement learning as a promising, data-efficient learning tool, this work's final part revisits model assumptions in a separate branch of reinforcement learning algorithms. Again, generalization and, therefore, efficient learning in model-based methods is primarily driven by the incorporated model assumptions (e.g., smooth dynamics), which real, discontinuous processes might heavily violate. To this end, a model-free reinforcement learning method is presented that carefully reintroduces prior model structure to facilitate efficient learning without the need for strong dynamics model priors. The methods and solutions proposed in this work are grounded in the challenges experienced when operating real-world hardware systems. With applications on a humanoid upper-body robot and an autonomous model race car, the proposed methods are demonstrated to successfully model and master the complex behavior of these systems.

Item Open Access Evaluation and control of the value provision of complex IoT service systems (2022) Niedermaier, Sina; Wagner, Stefan (Prof. Dr.)
The Internet of Things (IoT) represents an opportunity for companies to create additional consumer value by merging connected products with software-based services. The quality of an IoT service can determine whether it is consumed in the long term and whether it delivers the expected value for a consumer. Since IoT services are usually provided by distributed systems whose operations are becoming increasingly complex and dynamic, continuous monitoring and control of the value provision is necessary. The individual components of IoT service systems are usually developed and operated by specialized teams in a division of labor. With the increasing specialization of the teams, practitioners struggle to derive quality requirements based on consumer needs. Consequently, the teams often observe the behavior of “their” components in isolation, without relation to the value provided to a consumer. Inadequate monitoring and control of the value provision across the different components of an IoT system can result in quality deficiencies and a loss of value for the consumer. The goal of this dissertation is to support organizations with concepts and methods in the development and operations of IoT service systems to ensure the quality of the value provision to a consumer. By applying empirical methods, we first analyzed the challenges and applied practices in industry as well as the state of the art. Based on the results, we refined existing concepts and approaches. To evaluate their quality in use, we conducted action research projects in collaboration with industry partners. Based on an interview study with industry experts, we analyzed the current challenges, requirements, and applied solutions for the operations and monitoring of distributed systems in more detail. The findings of this study form the basis for further contributions of this thesis.
To support and improve communication between the specialized teams in handling quality deficiencies, we have developed a classification for system anomalies. We have applied and evaluated this classification in an action research project in industry. It allows organizations to differentiate and adapt their actions according to different classes of anomalies. Thus, quick and effective actions to ensure the value provision or to minimize the loss of value can be optimized separately from actions in the context of the long-term and sustainable correction of the IoT system. Moreover, the classification of system anomalies enables the organization to create feedback loops for quality improvement of the system, the IoT service, and the organization. To evaluate the delivered value of an IoT service, we decompose it into discrete workflows, so-called IoT transactions. By applying distributed tracing, the dynamic behavior of an IoT transaction can be reconstructed and made “observable”. Consequently, the successful completion of a transaction and its quality can be determined by applying indicators. We have developed an approach for the systematic derivation of quality indicators. By comparing actual values determined in operations with previously defined target values, the organization is able to detect anomalies in the temporal behavior of the value provision. As a result, the value provision can be controlled with appropriate actions. The quality in use of the approach is confirmed in another action research project with an industry partner. In summary, this thesis supports organizations in quantifying the delivered value of an IoT service and controlling the value provision with effective actions.
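The indicator idea, comparing actual values observed in operations against previously defined target values, can be sketched as follows; the latency metric, the tolerance, and the function name are illustrative assumptions, not the thesis's derivation approach:

```python
def evaluate_indicator(samples, target, tolerance=0.1):
    """Compare observed per-transaction latencies against a previously
    defined target value; flag an anomaly when the observed mean
    drifts beyond the tolerance band (a deliberately simple check)."""
    actual = sum(samples) / len(samples)
    return {
        "actual": actual,
        "target": target,
        "anomaly": actual > target * (1 + tolerance),
    }
```

In practice such checks would run continuously on trace data, and the anomaly class would then select which corrective action the responsible team takes.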
Furthermore, the trust of a consumer in the IoT service provided by an IoT system, and in the organization behind it, can be maintained and further increased by applying appropriate feedback loops.

Item Open Access Data-integrated methods for performance improvement of massively parallel coupled simulations (2022) Totounferoush, Amin; Schulte, Miriam (Prof. Dr.)
This thesis presents data-integrated methods to improve the computational performance of partitioned multi-physics simulations, particularly on highly parallel systems. Partitioned methods allow using available single-physics solvers and well-validated numerical methods for multi-physics simulations by decomposing the domain into smaller sub-domains. Each sub-domain is solved by a separate solver, and an external library is incorporated to couple the solvers. This significantly reduces the software development cost and enhances flexibility, while introducing new challenges that must be addressed carefully. These challenges include, but are not limited to, efficient data communication between sub-domains, data mapping between non-matching meshes, inter-solver load balancing, and equation coupling. In the current work, inter-solver communication is improved by introducing a two-level communication initialization scheme in the coupling library preCICE. The new method significantly speeds up the initialization and removes memory bottlenecks of the previous implementation. In addition, a data-driven inter-solver load balancing method is developed to efficiently distribute available computational resources between coupled single-physics solvers. This method employs both regressions and deep neural networks (DNNs) for modeling the performance of the solvers, and derives and solves an optimization problem to distribute the available CPU and GPU cores among solvers. To accelerate the equation coupling between strongly coupled solvers, a hybrid framework is developed that integrates DNNs and classical solvers.
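The hybrid idea of warm-starting a classical iterative solver with a learned estimate can be sketched on a toy fixed-point problem; the equation, the surrogate stand-in, and the iteration counting are illustrative assumptions, not the thesis's framework:

```python
import math

def fixed_point(x0, tol=1e-10, max_iter=1000):
    """Classical fixed-point iteration for x = cos(x); returns the
    solution and the number of iterations spent until convergence."""
    x = x0
    for i in range(max_iter):
        x_new = math.cos(x)
        if abs(x_new - x) < tol:
            return x_new, i + 1
        x = x_new
    return x, max_iter

def surrogate_guess():
    """Stand-in for a learned surrogate's estimate (hypothetical:
    here simply a value near the true fixed point ~0.739085)."""
    return 0.739

cold_x, cold_iters = fixed_point(0.0)                # cold start
warm_x, warm_iters = fixed_point(surrogate_guess())  # surrogate-warmed
```

Both runs converge to the same solution, but the warm start spends markedly fewer classical iterations, which is the effect the hybrid DNN/solver framework exploits at every time step.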
The DNN computes a solution estimate for each time step, which the classical solvers use as a first guess to compute the final solution. To preserve the DNN's accuracy during the simulation, a dynamic re-training strategy is introduced that updates the DNN's weights on the fly. The cheap but accurate solution estimate by the DNN surrogate solver significantly reduces the number of subsequent classical iterations necessary for solution convergence. Finally, a highly scalable simulation environment is introduced for fluid-structure interaction problems. The environment consists of highly parallel numerical solvers and an efficient and scalable coupling library. This framework is able to efficiently exploit both CPU-only and hybrid CPU-GPU machines. Numerical performance investigations using a complex test case demonstrate a very high parallel efficiency on a large number of CPUs and a significant speed-up due to the GPU acceleration.

Item Open Access Large-scale analysis of textual and multivariate data combining machine learning and visualization (2022) Knittel, Johannes; Ertl, Thomas (Prof. Dr.)

Item Open Access Deep learning based prediction and visual analytics for temporal environmental data (2022) Harbola, Shubhi; Coors, Volker (Prof. Dr.)
The objective of this thesis is to develop Machine Learning methods and their visualisation for environmental data. The presented approaches primarily focus on devising an accurate Machine Learning framework that supports the user in understanding and comparing model accuracy in relation to essential aspects such as parameter selection, trends, and time frame, together with the considered meteorological and pollution parameters. This thesis then develops approaches for the interactive visualisation of environmental data that are built around time series prediction as an application. Moreover, these approaches provide an interactive application that supports: 1.
a Visual Analytics platform to interact with the sensor data and enhance the representation of the environmental data visually by identifying patterns that mostly go unnoticed in large temporal datasets, 2. a seasonality deduction platform presenting analyses of the results that clearly demonstrate the relationship between these parameters in a combined temporal activity frame, and 3. air quality analyses that successfully discover spatio-temporal relationships among complex air quality data interactively in different time frames by harnessing the user's knowledge of factors influencing past, present, and future behaviour with the aid of Machine Learning models. Some of the above pieces of work contribute to the field of Explainable Artificial Intelligence, an area concerned with the development of methods that help understand, explain, and interpret Machine Learning algorithms. In summary, this thesis describes Machine Learning prediction algorithms together with several visualisation approaches for interactively and visually analysing the temporal relationships among complex environmental data in different time frames in a robust web platform. The developed interactive visualisation system for environmental data assimilates visual prediction, the sensors' spatial locations, measurements of the parameters, detailed pattern analyses, and changes in conditions over time. This provides a new combined approach to the existing visual analytics research. The algorithms developed in this thesis can be used to infer spatio-temporal environmental data, enabling interactive exploration processes and thus helping manage cities smartly.

Item Open Access Challenges of computational social science analysis with NLP methods (2022) Dayanik, Erenay; Padó, Sebastian (Prof. Dr.)
Computational Social Science (CSS) is an emerging research area at the intersection of social science and computer science, where problems of societal relevance can be addressed by novel computational methods. With the recent advances in machine learning and natural language processing as well as the availability of textual data, CSS has opened up new possibilities, but also new methodological challenges. In this thesis, we present a line of work on developing methods and addressing challenges in data annotation and modeling for computational political science and social media analysis, two highly popular and active research areas within CSS. In the first part of the thesis, we focus on a use case from computational political science, namely Discourse Network Analysis (DNA), a framework that aims at analyzing the structures behind complex societal discussions. We investigate how this style of analysis, which is traditionally performed manually, can be automated. We start by providing a requirements analysis outlining a roadmap to decompose the complex DNA task into several conceptually simpler sub-tasks. Then, we introduce NLP models with various configurations to automate two of the sub-tasks given by the requirements analysis, namely claim detection and classification, based on different neural network architectures ranging from unidirectional LSTMs to Transformer-based architectures. In the second part of the thesis, we shift our focus to fairness, a central concern in CSS. Our goal in this part of the thesis is to analyze and improve the performance of NLP models used in CSS in terms of fairness and robustness while maintaining their overall performance. With that in mind, we first analyze the above-mentioned claim detection and classification models and propose techniques to improve model fairness and overall performance. After that, we broaden our focus to social media analysis, another highly active subdomain of CSS.
Here, we study text classification of correlated attributes, which pose an important but often overlooked challenge to model fairness. Our last contribution is to discuss the limitations of the current statistical methods applied for bias identification; to propose a multivariate regression-based approach; and to show, through experiments conducted on social media data, that it can be used as a complementary method for bias identification and analysis tasks. Overall, our work takes a step towards increasing the understanding of the challenges of computational social science. We hope that both political scientists and NLP scholars can make use of the insights from this thesis in their research.

Item Open Access Prediction and similarity models for visual analysis of spatiotemporal data (2022) Tkachev, Gleb; Ertl, Thomas (Prof. Dr.)
Ever since the early days of computers, their use has become essential in the natural sciences. Whether through simulation, computer-aided observation, or data processing, the progress in computer technology has been mirrored by the constant growth in the size of scientific data. Unfortunately, as data sizes grow while human capabilities remain constant, it becomes increasingly difficult to analyze and understand the data. Over the last decades, visualization experts have proposed many approaches to address this challenge, but even these methods have their limitations. Luckily, recent advances in the field of Machine Learning can provide the tools needed to overcome the obstacle. Machine learning models are a particularly good fit as they can both benefit from the large amount of data present in the scientific context and allow the visualization system to adapt to the problem at hand. This thesis presents research into how machine learning techniques can be adapted and extended to enable visualization of scientific data.
It introduces a diverse set of techniques for the analysis of spatiotemporal data, including detection of irregular behavior, self-supervised similarity metrics, automatic selection of visual representations, and more. It also discusses the general challenges of applying Machine Learning to Scientific Visualization and how to address them.

Item Open Access Analyzing code corpora to improve the correctness and reliability of programs (2021) Patra, Jibesh; Pradel, Michael (Prof. Dr.)
Bugs in software are commonplace, challenging, and expensive to deal with. One widely used direction is to use program analyses to reason about software and detect bugs in it. In recent years, the growth of areas like web application development and data analysis has produced large amounts of publicly available source code corpora, primarily written in dynamically typed languages such as Python and JavaScript. It is challenging to reason about programs written in such languages because of the presence of dynamic features and the lack of statically declared types. This dissertation argues that, to build software developer tools for detecting and understanding bugs, it is worthwhile to analyze code corpora, which can uncover code idioms, runtime information, and natural language constructs such as comments. The dissertation is divided into three corpus-based approaches that support our argument. In the first part, we present static analyses over code corpora to generate new programs, to perform mutations on existing programs, and to generate data for effective training of neural models. We provide empirical evidence that the static analyses can scale to thousands of files and that the trained models are useful in finding bugs in code. The second part of this dissertation presents dynamic analyses over code corpora.
Our evaluations show that the analyses are effective in uncovering unexpected behaviors when multiple JavaScript libraries are included together and in generating data for training bug-finding neural models. Finally, we show that a corpus-based analysis can be useful for input reduction, which can help developers find a smaller subset of an input that still triggers the required behavior. We envision that this dissertation motivates future endeavors in corpus-based analysis to alleviate some of the challenges faced while ensuring the reliability and correctness of software. One direction is to combine data obtained by static and dynamic analyses over code corpora for training. Another direction is to use meta-learning approaches, where a model is trained using data extracted from the code corpora of one language and applied to another language.
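The input-reduction idea mentioned above can be sketched as a greedy reduction loop; this is a simplified, delta-debugging-style illustration, with a hypothetical function and predicate rather than the dissertation's corpus-based approach:

```python
def reduce_input(items, still_triggers):
    """Greedy input reduction: repeatedly try dropping one element and
    keep the smaller input whenever the behavior of interest (the
    predicate still_triggers) is preserved; stop when no single
    element can be removed."""
    current = list(items)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if candidate and still_triggers(candidate):
                current = candidate
                changed = True
                break  # restart the scan on the smaller input
    return current
```

The result is 1-minimal with respect to single-element removal: every remaining element is needed to keep the predicate true, which is what makes the reduced input easier for a developer to inspect.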