Universität Stuttgart
Permanent URI for this community: https://elib.uni-stuttgart.de/handle/11682/1
Item Open Access: Concepts and methods for the design, configuration and selection of machine learning solutions in manufacturing (2021) Villanueva Zacarias, Alejandro Gabriel; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)

The application of machine learning (ML) techniques and methods is common practice in manufacturing companies, which assign teams to the development of ML solutions for individual use cases. This dissertation uses the term ML solution to refer to the set of software components and learning algorithms that deliver a predictive capability based on the available use case data, their (hyper)parameters, and their technical settings. Currently, development teams face four challenges that complicate the development of ML solutions. First, they lack a formal approach to specify ML solutions that can trace the impact of individual solution components on domain-specific requirements. Second, they lack an approach to document the configurations chosen to build an ML solution, thereby ensuring the reproducibility of the performance obtained. Third, they lack an approach to recommend and select ML solutions that is intuitive for non-ML experts. Fourth, they lack a comprehensive sequence of steps that ensures both best practices and the consideration of technical and domain-specific aspects during the development process. Overall, the inability to address these challenges leads to longer development times and higher development costs, as well as to less suitable ML solutions that are more difficult to understand and to reuse. This dissertation presents three concepts to address these challenges: Axiomatic Design for Machine Learning (AD4ML), the ML solution profiling framework, and AssistML. AD4ML is a concept for the structured and agile specification of ML solutions that establishes clear relationships between domain-specific requirements and concrete software components; AD4ML specifications can thus be validated against domain expert requirements before implementation. The ML solution profiling framework employs metadata to document important characteristics of the data, the technical configurations, and the parameter values of software components, as well as multiple performance metrics. These metadata constitute the foundation for the reproducibility of ML solutions. AssistML recommends ML solutions for new use cases: it searches among documented ML solutions for those that best fulfill the performance preferences of the new use case and presents the selected solutions to decision-makers in an intuitive way. Each of these concepts was implemented and evaluated. Combined, they offer development teams a technology-agnostic approach to building ML solutions, with multiple benefits: shorter development times, more efficient development projects, and better-informed decisions about the development and selection of ML solutions.

Item Open Access: Improving usability of gaze and voice based text entry systems (2023) Sengupta, Korok; Staab, Steffen (Prof. Dr.)

Item Open Access: Data-efficient and safe learning with Gaussian processes (2020) Schreiter, Jens; Toussaint, Marc (Prof. Dr. rer. nat.)

Data-based modeling techniques enjoy increasing popularity in many areas of science and technology where traditional approaches are limited with regard to accuracy and efficiency. When employing machine learning methods to generate models of dynamic systems, two important issues must be considered. Firstly, the data-sampling process should yield an informative and representative set of points, to enable high generalization accuracy of the learned models. Secondly, the algorithmic part for efficient model building is essential for applicability, usability, and the quality of the learned predictive model. This thesis deals with both of these aspects for supervised learning problems, and exploits the interaction between them to realize accurate and powerful modeling. After introducing the non-parametric Bayesian modeling approach with Gaussian processes and the basics of transient modeling tasks in the next chapter, we dedicate ourselves in the subsequent chapter to extensions of this probabilistic technique towards relevant practical requirements. This chapter provides an overview of existing sparse Gaussian process approximations and proposes novel work to increase efficiency and improve model selection on particularly large training data sets. For example, our sparse modeling approach enables real-time capable prediction performance and efficient learning with low memory requirements. A comprehensive comparison on various real-world problems confirms the proposed contributions and shows a variety of modeling tasks where approximate Gaussian processes can be successfully applied. Further experiments provide more insight into the whole learning process, and thus a profound understanding of the presented work. In the fourth chapter, we focus on active learning schemes for the safe and information-optimal generation of meaningful data sets. In addition to the exploration behavior of the active learner, we consider the safety issue, since interacting with real systems should not damage or even destroy them. Here we propose a new model-based active learning framework to solve both tasks simultaneously. As the basis for the data-sampling process, we employ the presented Gaussian process techniques. Furthermore, we distinguish between static and transient experimental design strategies; both problems are considered separately, although the requirements for each active learning problem are the same. This subdivision into a static and a transient setting allows a more problem-specific perspective on the two cases, and thus enables the creation of specially adapted active learning algorithms. Our novel approaches are then investigated for different applications, where a favorable trade-off between safety and exploration is always realized. Theoretical results support these evaluations and provide sound knowledge about the derived model-based active learning schemes; for example, an upper bound on the probability of failure of the presented active learning methods is derived under reasonable assumptions. Finally, the thesis concludes with a summary of the investigated machine learning problems and motivates some future research directions.
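For readers unfamiliar with the modeling approach underlying this thesis, the following minimal sketch illustrates exact Gaussian process regression with an RBF kernel, combined with a variance-based acquisition rule under a plug-in safety constraint. The function names, the toy system, and the safety threshold are illustrative assumptions, not the thesis implementation.

```python
# Minimal GP regression plus a "safe" variance-based acquisition rule.
import numpy as np

def rbf(X1, X2, lengthscale=0.5, variance=1.0):
    """Squared-exponential kernel k(x, x')."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xs, noise=1e-2):
    """Exact GP posterior mean and variance at test points Xs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf(Xs, Xs)) - (v * v).sum(0)
    return mean, var

def f(x):
    """Unknown system response (illustrative stand-in)."""
    return np.sin(3 * x[:, 0])

# Toy safe active learning loop: query the most uncertain candidate
# whose pessimistic prediction still satisfies the safety threshold.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (5, 1))   # initial safe observations
y = f(X)
candidates = np.linspace(-2, 2, 200)[:, None]
for _ in range(10):
    mean, var = gp_posterior(X, y, candidates)
    safe = mean - 2.0 * np.sqrt(np.maximum(var, 0)) > -0.9
    idx = np.argmax(np.where(safe, var, -np.inf))
    X = np.vstack([X, candidates[idx]])
    y = np.append(y, f(candidates[idx : idx + 1]))
```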
Item Open Access: Models for data-efficient reinforcement learning on real-world applications (2021) Dörr, Andreas; Toussaint, Marc (Prof. Dr.)

Large-scale deep reinforcement learning has contributed strongly to many of the recently published success stories of artificial intelligence. These techniques have enabled computer systems to autonomously learn and master challenging problems, such as playing the game of Go or complex strategy games such as StarCraft, at or above human level. Naturally, the question arises which problems could be addressed with these reinforcement learning technologies in industrial applications. So far, machine learning technologies based on (semi-)supervised learning have created the most visible impact in industrial applications. For example, image, video, or text understanding is primarily dominated by models trained and derived autonomously from large-scale data sets with modern (deep) machine learning methods. Reinforcement learning, in contrast, deals with temporal decision-making problems and is much less commonly found in the industrial context. In these problems, current decisions and actions inevitably influence the outcome and success of a process much further down the road. This work strives to address some of the core problems that prevent the effective use of reinforcement learning in industrial settings. Autonomous learning of new skills is always guided by existing priors that allow generalization from previous experience. In some scenarios, non-existent or uninformative prior knowledge can be compensated by vast amounts of experience for the particular task at hand. Typical industrial processes, however, are operated in very restricted, tightly calibrated operating points; naively exploring the space of possible actions or process changes in search of improved performance tends to be costly or even prohibitively dangerous. Therefore, one recurring subject throughout this work is the emergence of priors and model structures that allow for efficient use of all available experience data. A promising direction is model-based reinforcement learning, which is explored in the first part of this work. This part derives an automatic tuning method for one of the most common industrial control architectures, the PID controller. By leveraging all available data about the system's behavior to learn a system dynamics model, the derived method can efficiently tune these controllers from scratch. Although all data can easily be incorporated into dynamics models, real systems expose additional problems to the dynamics modeling and learning task: characteristics such as non-Gaussian noise, latent states, feedback control, or non-i.i.d. data regularly prevent the use of off-the-shelf modeling tools. Therefore, the second part of this work is concerned with the derivation of modeling solutions that are particularly suited to the reinforcement learning problem. Despite the predominant focus on model-based reinforcement learning as a promising, data-efficient learning tool, the final part of this work revisits model assumptions in a separate branch of reinforcement learning algorithms. Again, generalization, and therefore efficient learning, in model-based methods is primarily driven by the incorporated model assumptions (e.g., smooth dynamics), which real, discontinuous processes might heavily violate. To this end, a model-free reinforcement learning method is presented that carefully reintroduces prior model structure to facilitate efficient learning without the need for strong dynamics model priors. The methods and solutions proposed in this work are grounded in the challenges experienced when operating real-world hardware systems. With applications on a humanoid upper-body robot and an autonomous model race car, the proposed methods are demonstrated to successfully model and master complex system behavior.
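The PID tuning idea from the first part of the preceding abstract can be illustrated with a toy sketch: fit a dynamics model from logged transitions, then score candidate PID gains in closed-loop simulation on that model rather than on the real system. The linear model, the grid search, and all constants are simplifying assumptions; the thesis employs learned probabilistic dynamics models.

```python
# Illustrative model-based PID tuning on a learned dynamics model.
import numpy as np

rng = np.random.default_rng(1)

# 1) Logged transitions (x, u) -> x_next from some real system.
true_a, true_b = 0.9, 0.1
xs = rng.normal(size=500)
us = rng.normal(size=500)
xs_next = true_a * xs + true_b * us + 0.01 * rng.normal(size=500)

# 2) Fit x_next ~ a*x + b*u by least squares (the "dynamics model").
A = np.column_stack([xs, us])
(a, b), *_ = np.linalg.lstsq(A, xs_next, rcond=None)

def rollout_cost(kp, ki, kd, steps=200, setpoint=1.0):
    """Simulate the learned model under PID control; return tracking cost."""
    x, integral, prev_err, cost = 0.0, 0.0, 0.0, 0.0
    for _ in range(steps):
        err = setpoint - x
        integral += err
        u = kp * err + ki * integral + kd * (err - prev_err)
        prev_err = err
        x = a * x + b * u
        cost += err**2 + 1e-3 * u**2
    return cost

# 3) Tune gains on the model instead of the real system.
best = min(
    ((kp, ki, kd) for kp in np.linspace(0, 5, 11)
                  for ki in np.linspace(0, 1, 6)
                  for kd in np.linspace(0, 1, 6)),
    key=lambda g: rollout_cost(*g),
)
print("tuned (kp, ki, kd):", best)
```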
Item Open Access: Data-integrated methods for performance improvement of massively parallel coupled simulations (2022) Totounferoush, Amin; Schulte, Miriam (Prof. Dr.)

This thesis presents data-integrated methods to improve the computational performance of partitioned multi-physics simulations, particularly on highly parallel systems. Partitioned methods allow the use of available single-physics solvers and well-validated numerical methods for multi-physics simulations by decomposing the domain into smaller sub-domains. Each sub-domain is solved by a separate solver, and an external library is incorporated to couple the solvers. This significantly reduces software development costs and enhances flexibility, but it introduces new challenges that must be addressed carefully. These challenges include, but are not limited to, efficient data communication between sub-domains, data mapping between non-matching meshes, inter-solver load balancing, and equation coupling. In the current work, inter-solver communication is improved by introducing a two-level communication initialization scheme into the coupling library preCICE. The new method significantly speeds up the initialization and removes memory bottlenecks of the previous implementation. In addition, a data-driven inter-solver load balancing method is developed to efficiently distribute the available computational resources between the coupled single-physics solvers. This method employs both regressions and deep neural networks (DNNs) to model the performance of the solvers, and derives and solves an optimization problem to distribute the available CPU and GPU cores among the solvers. To accelerate the equation coupling between strongly coupled solvers, a hybrid framework is developed that integrates DNNs and classical solvers. The DNN computes a solution estimate for each time step, which the classical solvers use as a first guess to compute the final solution. To preserve the DNN's efficiency during the simulation, a dynamic re-training strategy is introduced that updates the DNN's weights on the fly. The cheap but accurate solution estimate by the DNN surrogate solver significantly reduces the number of subsequent classical iterations necessary for solution convergence. Finally, a highly scalable simulation environment is introduced for fluid-structure interaction problems. The environment consists of highly parallel numerical solvers and an efficient and scalable coupling library, and it can efficiently exploit both CPU-only and hybrid CPU-GPU machines. Numerical performance investigations using a complex test case demonstrate a very high parallel efficiency on a large number of CPUs and a significant speed-up due to the GPU acceleration.
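The data-driven load balancing described above can be illustrated with a deliberately simple sketch: fit a runtime model per solver from a few timing measurements, then choose the core split that minimizes the slower partner (assuming the coupled solvers synchronize each time step, so the slower side dominates). The model form t(n) = a/n + b and all numbers are assumptions, not preCICE measurements.

```python
# Illustrative data-driven core allocation between two coupled solvers.
import numpy as np

def fit_runtime_model(cores, times):
    """Least-squares fit of the runtime model t(n) = a/n + b."""
    A = np.column_stack([1.0 / np.asarray(cores), np.ones(len(cores))])
    (a, b), *_ = np.linalg.lstsq(A, np.asarray(times), rcond=None)
    return lambda n: a / n + b

# Measured per-timestep runtimes (seconds) on a few core counts.
fluid = fit_runtime_model([2, 4, 8, 16], [8.1, 4.2, 2.3, 1.4])
solid = fit_runtime_model([2, 4, 8, 16], [2.2, 1.2, 0.8, 0.6])

total = 32  # cores available in the job allocation
best = min(range(1, total),
           key=lambda n: max(fluid(n), solid(total - n)))
print(f"fluid: {best} cores, solid: {total - best} cores")
```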
Item Open Access: Ansätze für flexible und fehlertolerante modellgetriebene IoT-Anwendungen in dynamischen Umgebungen (2024) Del Gaudio, Daniel; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)

Item Open Access: Time-sensitive converged networks : a comprehensive architecture approach (2023) Hellmanns, David; Rothermel, Kurt (Prof. Dr. rer. nat.)

Item Open Access: B-splines on sparse grids for uncertainty quantification (2021) Rehme, Michael F.; Pflüger, Dirk (Prof. Dr.)

Item Open Access: Time-sensitive traffic and time-triggered mechanisms : traffic planning and analysis (2022) Falk, Jonathan; Rothermel, Kurt (Prof. Dr. rer. nat. Dr. h. c.)

Real-time-capable communication networks increasingly form the backbone that connects the individual components of networked applications. Traditionally, such applications are found, for example, in industrial automation plants and in-vehicle networks.
As software becomes ever more embedded in infrastructure systems, for example in the context of smart grids or connected driving, the demand for real-time communication grows. As a consequence, networks are equipped with additional mechanisms that can provide guarantees, for example regarding latency or delay, through an interplay of deterministic medium access, temporal coordination, path computation, and resource reservation. The first part of this thesis therefore addresses the problem of generating a traffic plan that ensures real-time guarantees for a given network and a set of data streams. To this end, different approaches for several variants of the traffic planning problem for time-triggered streams are presented and evaluated with prototypical implementations on a large selection of synthetic scenarios. These approaches differ, among other things, in their temporal resolution, in the degrees of freedom available for path computation, and in the method used to compute the traffic plan. Two of these approaches are based on constraint programming; concretely, the various traffic planning constraints on the schedules and paths of time-triggered streams are formulated as integer linear programs with linear equations and inequalities. Both approaches are extended to the planning of so-called complemental flows, i.e., data streams that consist of a time-triggered part and a complementary event-triggered part. Furthermore, two traffic planning methods based on conflict graphs are presented. In the first, the conflict graph is built incrementally, and traffic plans for static scenarios are computed efficiently with a combination of an exact algorithm and a heuristic. In the second, the conflict graph approach is extended to dynamic scenarios; the additional challenges that arise are identified, and means to quantify and control the quality of service during traffic plan updates are presented. The second part of this thesis considers the analysis of network elements with time-triggered service interruptions in the network calculus framework. Time-triggered service interruptions occur, for example, in network bridges that enforce the schedules considered in the first part of this thesis by means of a gating mechanism, which has the effect that part of the traffic is not forwarded during certain time intervals. A similar effect arises in network elements that power down for reasons of energy efficiency. Therefore, the generalized problem of time-triggered service interruption, in which network elements interrupt service for certain data streams according to a fixed schedule, is investigated. A prerequisite for computing latency bounds and maximum buffer occupancy in these scenarios with network calculus is a formal description of such systems. Hence, two basic forms of network elements with time-triggered service interruption are identified, each of which must be treated differently in the analysis, and time-variant as well as time-invariant service curves are derived for them. A service curve describes the forwarding behavior of the network element under investigation and can be used for the analysis of composed systems with network calculus. These service curves are evaluated with respect to how far they underestimate the offered (forwarding) service, and influencing factors as well as limitations are discussed.
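The core no-overlap constraint behind the traffic planning formulations above can be illustrated with a toy search: find frame offsets for periodic time-triggered streams on a shared link such that no two transmissions collide within the hyperperiod. A brute-force enumeration stands in for the ILP/CP and conflict-graph solvers of the thesis; the stream set is made up.

```python
# Toy time-triggered traffic planning: pick per-stream offsets so that
# transmission windows never overlap on a shared link.
from itertools import product

streams = [  # (period, frame transmission duration), in microseconds
    (10, 2),
    (20, 3),
    (40, 5),
]
hyper = 40  # lcm of the periods

def occupied(offset, period, duration):
    """Time slots the stream occupies on the link within the hyperperiod."""
    slots = set()
    for start in range(offset, hyper, period):
        slots.update(t % hyper for t in range(start, start + duration))
    return slots

for offsets in product(*(range(p) for p, _ in streams)):
    used = set()
    feasible = True
    for off, (p, d) in zip(offsets, streams):
        s = occupied(off, p, d)
        if used & s:          # collision: this candidate plan is invalid
            feasible = False
            break
        used |= s
    if feasible:
        print("feasible offsets:", offsets)
        break
```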
Item Open Access: On consistency and distribution in software-defined networking (2020) Kohler, Thomas; Rothermel, Kurt (Prof. Dr. rer. nat. Dr. h. c.)

Software-defined networking (SDN) is an emerging networking paradigm that promises flexible programmability and simplified management. Over the last years, SDN has built up huge momentum in academia and has had a large practical impact through its adoption at scale by big industrial players such as Google, Facebook, and Microsoft, driving cloud computing, data center networks, and their interconnection in SDN-based wide-area networks. SDN is a key enabler for high dynamics in network reconfiguration and innovation, allowing the deployment of new network protocols and substantially expanding the networking paradigm by moving applications into the network, both at unprecedented pace and ease. The SDN paradigm is centered around the separation of the data plane from the logically centralized but typically physically distributed control plane, which programs the forwarding behaviour of the network devices in the data plane based on a global view. Especially the requirements on correctness, scalability, availability, and resiliency raised by practical adoption at scale have put a strong emphasis on consistency and distribution in the SDN paradigm. This thesis addresses various challenges regarding consistency and distribution in software-defined networking; more specifically, it focuses on and contributes to the research areas of update consistency, flexibility in control plane distribution, and the data plane implementation of a distributed application. Reconfiguring an SDN-based network inevitably requires updating the rules that determine the forwarding behaviour of the devices in its data plane. Updating these rules, which reside on the inherently distributed data plane devices, is an asynchronous process; hence, packets traversing the network may be processed according to a mixture of new and old rules during the update. The inconsistency effects that consequently arise can severely degrade network performance and can break stipulated network invariants, for instance on connectivity or security. We introduce a general architecture for network management that is aware of expectable update-induced inconsistency effects and thereby allows an appropriate selection of an update mechanism and its parameters in order to prevent those effects. We thoroughly analyze update consistency for the case of multicast networks, show crucial particularities, and present mechanisms for the prevention and mitigation of multicast-specific inconsistency effects.
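One classic mechanism of the kind such an update-aware management architecture can select is the two-phase update with per-packet version tags: new rules are first installed alongside the old ones, and only then does the ingress start stamping packets with the new version, so every packet is handled by either purely old or purely new rules. Below is a minimal simulation with a made-up three-switch topology; it illustrates the general mechanism, not the thesis's implementation.

```python
# Minimal simulation of per-packet consistent updates via version tags.
from dataclasses import dataclass, field

@dataclass
class Packet:
    dst: str
    version: int = 0

@dataclass
class Switch:
    # rules[version][dst] -> next hop
    rules: dict = field(default_factory=dict)

    def forward(self, pkt: Packet) -> str:
        return self.rules[pkt.version][pkt.dst]

# Old config (v0): s1 -> s2 -> dst.  New config (v1): s1 -> s3 -> dst.
s1 = Switch({0: {"h": "s2"}})
s2 = Switch({0: {"h": "dst"}})
s3 = Switch()

# Phase 1: install v1 rules everywhere while v0 traffic still flows.
s3.rules[1] = {"h": "dst"}
s2.rules[1] = {"h": "dst"}
s1.rules[1] = {"h": "s3"}

# Phase 2: flip the ingress to stamp packets with the new version.
ingress_version = 1

pkt = Packet(dst="h", version=ingress_version)
hop = s1.forward(pkt)   # -> "s3": the packet sees only v1 rules
print("path:", ["s1", hop, {"s2": s2, "s3": s3}[hop].forward(pkt)])
```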
Observing that, on the one hand, SDN's separation of control has been deemed rather strict, as moving any control "intelligence" from the data plane devices to remote controller entities increases control latency, while, on the other hand, the coupling between controller and data plane devices is quite tight, hindering the free distribution of control logic, we present a controller architecture that enables flexible, full-range distribution of network control. The architecture is based on decoupling through an event abstraction and a flexible dissemination scheme for those events based on the content-based publish/subscribe paradigm. This lightweight design allows control logic to be pushed back down onto data plane devices. We thus expand SDN's control paradigm and enable the full range from fully decentralized control, over local control that still profits from a global view, up to fully centralized control. This scheme allows trading off the scope of state data, consistency semantics and synchronization overhead, control latency, and the quality of control decisions. Furthermore, our implementation covers a large set of mechanisms for improving control plane consistency and scalability, such as inherent load balancing, fast autonomous control decision making, detection of policy conflicts, and a feedback mechanism for data plane updates. In a last area, we focus on the data plane implementation of a distributed application from the domain of message-oriented middleware. We implement complex event processing (CEP) on top of programmable network devices employing data plane programming, a recent big trend in SDN, or more specifically, the P4 language. We discuss the challenges entailed in distributed data plane processing and address aspects of distribution and consistency, in particular consistency in stateful data plane programming, where the internal state that determines how packets are processed is changed by this very processing, in turn changing the processing of subsequent packets. Since packet processing is executed in parallel on different execution units of the same device sharing the same state data, strong consistency semantics are required to ensure application correctness. Enabled by P4's flexible and powerful programming model, our data plane implementation of CEP yields greatly reduced latency and increased throughput. It comprises a compiler that translates patterns for the detection of complex events, specified in our rule specification language, into P4 programs consisting of a state machine and operators that process so-called windows containing historic events.
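To make the last contribution concrete: conceptually, such a compiler turns a CEP pattern like "A followed by B within a time window" into a state machine whose state lives in registers updated per packet. The Python sketch below mimics that state machine for illustration only; the actual artifact is a P4 program, and the event format and pattern here are invented.

```python
# Conceptual sketch of a compiled CEP sequence pattern as a state machine.
from dataclasses import dataclass

@dataclass
class Event:
    kind: str
    ts: int  # timestamp

class SeqAB:
    """Detect event A followed by event B within `window` time units."""
    def __init__(self, window: int):
        self.window = window
        self.a_ts = None  # state register: timestamp of the last A

    def process(self, ev: Event):
        """Consume one event; return a complex event on a match."""
        if ev.kind == "A":
            self.a_ts = ev.ts                    # idle -> saw_A
        elif ev.kind == "B" and self.a_ts is not None:
            if ev.ts - self.a_ts <= self.window:
                match = (self.a_ts, ev.ts)
                self.a_ts = None                 # saw_A -> idle
                return match
            self.a_ts = None                     # window expired
        return None

detector = SeqAB(window=10)
for ev in [Event("A", 1), Event("C", 4), Event("B", 8), Event("B", 30)]:
    hit = detector.process(ev)
    if hit:
        print("complex event: A@%d followed by B@%d" % hit)
```

On a real P4 target, the register holding `a_ts` would be shared by parallel packet-processing units, which is exactly where the strong consistency semantics discussed above become necessary.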