05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6

Search Results

Now showing 1 - 10 of 141
  • Automated composition of adaptive pervasive applications in heterogeneous environments (Open Access)
    (2012) Schuhmann, Stephan Andreas; Rothermel, Kurt (Prof. Dr. rer. nat. Dr. h. c.)
    Distributed applications for Pervasive Computing represent a research area of high interest. Before an application can be executed, a configuration process is needed to find a composition of components that provides the required functionality. As dynamic pervasive environments and device failures may render arbitrary components and devices unavailable at any time, finding and maintaining such a composition is a nontrivial task. Many degrees of decentralization, up to completely centralized approaches, are possible in the calculation of valid configurations, spanning a wide spectrum of possible solutions. As configuration processes produce latencies that the application user perceives as undesired waiting times, configurations have to be calculated as fast as possible. While completely distributed configuration is inevitable in infrastructure-less ad hoc scenarios, many realistic Pervasive Computing scenarios take place in heterogeneous environments, where the additional computation power of resource-rich devices can be utilized by centralized approaches. However, in strongly heterogeneous pervasive environments comprising several resource-rich and resource-weak devices, both centralized and decentralized approaches may lead to suboptimal configuration latencies: the resource-weak devices may become bottlenecks for decentralized configuration, while the centralized approach fails to exploit parallelism. Most projects in Pervasive Computing focus on one specific type of environment: either they concentrate on heterogeneous environments and rely on additional infrastructure devices, which makes them inapplicable in infrastructure-less settings, or they address homogeneous ad hoc environments and treat all involved devices as equal, which leads to suboptimal results when resource-rich devices are present, as their additional computation power is not exploited. Therefore, in this work we propose a comprehensive adaptive approach that particularly focuses on the efficient support of heterogeneous environments but is also applicable in infrastructure-less homogeneous scenarios. We provide multiple configuration schemes with different degrees of decentralization for distributed applications, each optimized for specific scenarios. Our solution is adaptive in that the actual scheme is chosen based on the current system environment, and it calculates application compositions in a resource-aware, efficient manner. This ensures high efficiency even in dynamically changing environments. Beyond this, many typical pervasive environments contain a fixed set of applications and devices that are frequently used. In such scenarios, identical resources are part of subsequent configuration calculations, so the involved devices undergo a quite similar configuration process whenever an application is launched. Starting the configuration from scratch every time, however, not only consumes a lot of time but also increases the communication overhead and energy consumption of the involved devices. Therefore, our solution integrates the results of previous configurations to reduce the severity of the configuration problem in dynamic scenarios. We demonstrate in prototypical real-world evaluations as well as by simulation and emulation that our comprehensive approach provides efficient automated configuration across the complete spectrum of possible application scenarios. This extensive functionality has not been achieved by related projects yet; our work thus makes a significant contribution towards seamless application configuration in Pervasive Computing.
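    As a hedged illustration of the adaptive scheme selection described above, the following Python sketch chooses a configuration scheme from a coarse view of the environment; the Device type, the threshold, and the three scheme names are assumptions made for this example, not the dissertation's actual interface.

    # Hypothetical sketch: pick a configuration scheme from the environment's
    # device resources. Thresholds and scheme names are illustrative.
    from dataclasses import dataclass

    @dataclass
    class Device:
        name: str
        cpu_score: float  # abstract measure of computation power

    def pick_scheme(devices: list[Device], strong_threshold: float = 10.0) -> str:
        """Return a configuration scheme for the given environment."""
        strong = [d for d in devices if d.cpu_score >= strong_threshold]
        if not strong:
            return "decentralized"  # homogeneous ad hoc: all devices share the load
        if len(strong) == 1:
            return "centralized"    # one resource-rich device computes the composition
        return "hybrid"             # several strong devices: parallelize among them

    env = [Device("phone", 2.0), Device("laptop", 15.0), Device("server", 40.0)]
    print(pick_scheme(env))  # -> "hybrid"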
  • Causal models for decision making via integrative inference (Open Access)
    (2017) Geiger, Philipp; Toussaint, Marc (Prof. Dr.)
    Understanding causes and effects is important in many parts of life, especially when decisions have to be made. The systematic inference of causal models remains a challenge, though. In this thesis, we study (1) "approximative" and "integrative" inference of causal models and (2) causal models as a basis for decision making in complex systems. By "integrative" we mean including and combining settings and knowledge beyond the outcome of perfect randomization or pure observation for causal inference, while "approximative" means that the causal model is only constrained but not uniquely identified. As a basis for the study of topics (1) and (2), which are closely related, we first introduce causal models, discuss the meaning of causation, and embed the notion of causation into a broader context of other fundamental concepts. Then we begin our main investigation with a focus on topic (1): we consider the problem of causal inference from a non-experimental multivariate time series X, that is, we integrate temporal knowledge. We take the following approach: we assume that X, together with some potential hidden common cause ("confounder") Z, forms a first-order vector autoregressive (VAR) process with structural transition matrix A. Then we examine under which conditions the most important parts of A are identifiable or approximately identifiable from X alone, in spite of the effects of Z. Essentially, sufficient conditions are (a) non-Gaussian, independent noise or (b) no influence from X to Z. We present two estimation algorithms tailored towards conditions (a) and (b), respectively, and evaluate them on synthetic and real-world data. We also discuss how to check the model using X. Still focusing on topic (1) but already including elements of topic (2), we consider the problem of approximate inference of the causal effect of a variable X on a variable Y in i.i.d. settings "between" randomized experiments and observational studies. Our approach is to first derive approximations (upper/lower bounds) on the causal effect, in dependence on bounds on (hidden) confounding. We then discuss several scenarios where knowledge or beliefs can be integrated that in fact imply bounds on confounding. One example concerns decision making in advertising, where knowledge about partial compliance with guidelines can be integrated. Then, concentrating on topic (2), we study decision making problems that arise in cloud computing, a computing paradigm and business model that involves complex technical and economical systems and interactions. More specifically, we consider the following two problems: debugging and control of computing systems with the help of sandbox experiments, and prediction of the cost of "spot" resources for decision making of cloud clients. We first establish two theoretical results on approximate counterfactuals and approximate integration of causal knowledge, which we then apply to the two problems in toy scenarios.
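    The VAR(1) setting above can be made concrete with a small sketch: the code below simulates X_t = A X_{t-1} + noise with non-Gaussian (Laplace) noise, condition (a), and recovers the transition matrix from the observed series by least squares. The dimensions, the matrix, and the absence of an actual confounder Z are simplifying assumptions for the demo, not the thesis's estimators.

    import numpy as np

    rng = np.random.default_rng(0)
    T, d = 500, 3
    A_true = np.array([[0.5, 0.1, 0.0],
                       [0.0, 0.4, 0.2],
                       [0.1, 0.0, 0.3]])
    X = np.zeros((T, d))
    for t in range(1, T):
        X[t] = A_true @ X[t - 1] + rng.laplace(size=d)  # non-Gaussian noise, cond. (a)

    # Least-squares estimate of the transition matrix from X alone:
    # solve past @ B = present, so that A_hat = B.T.
    past, present = X[:-1], X[1:]
    A_hat = np.linalg.lstsq(past, present, rcond=None)[0].T
    print(np.round(A_hat, 2))  # close to A_true for long series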
  • Improving usability of gaze and voice based text entry systems (Open Access)
    (2023) Sengupta, Korok; Staab, Steffen (Prof. Dr.)
  • Data-efficient and safe learning with Gaussian processes (Open Access)
    (2020) Schreiter, Jens; Toussaint, Marc (Prof. Dr. rer. nat.)
    Data-based modeling techniques enjoy increasing popularity in many areas of science and technology where traditional approaches are limited regarding accuracy and efficiency. When employing machine learning methods to generate models of dynamic systems, two important issues have to be considered. Firstly, the data-sampling process should induce an informative and representative set of points to enable high generalization accuracy of the learned models. Secondly, the algorithmic part for efficient model building is essential for the applicability, usability, and quality of the learned predictive model. This thesis deals with both of these aspects for supervised learning problems, where the interaction between them is exploited to realize accurate and powerful modeling. After introducing the non-parametric Bayesian modeling approach with Gaussian processes and the basics of transient modeling tasks, we dedicate ourselves to extensions of this probabilistic technique towards relevant practical requirements. The corresponding chapter provides an overview of existing sparse Gaussian process approximations and proposes novel contributions to increase efficiency and improve model selection on particularly large training data sets. For example, our sparse modeling approach enables real-time capable prediction performance and efficient learning with low memory requirements. A comprehensive comparison on various real-world problems confirms the proposed contributions and shows a variety of modeling tasks where approximate Gaussian processes can be successfully applied. Further experiments provide more insight into the whole learning process and thus a profound understanding of the presented work. In the fourth chapter, we focus on active learning schemes for safe and information-optimal generation of meaningful data sets. In addition to the exploration behavior of the active learner, the safety issue is considered in our work, since interacting with a real system should not damage or even completely destroy it. Here we propose a new model-based active learning framework that solves both tasks simultaneously. As the basis for the data-sampling process we employ the presented Gaussian process techniques. Furthermore, we distinguish between static and transient experimental design strategies, which are considered separately in this chapter. Nevertheless, the requirements for each active learning problem are the same. This subdivision into a static and a transient setting allows a more problem-specific perspective on the two cases and thus enables the creation of specially adapted active learning algorithms. Our novel approaches are then investigated for different applications, where a favorable trade-off between safety and exploration is always realized. Theoretical results support these evaluations and provide solid knowledge about the derived model-based active learning schemes. For example, an upper bound for the probability of failure of the presented active learning methods is derived under reasonable assumptions. Finally, the thesis concludes with a summary of the investigated machine learning problems and motivates some future research directions.
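    To illustrate the kind of sparse approximation surveyed in the thesis, here is a minimal subset-of-regressors sketch with m inducing inputs, which reduces the O(n^3) cost of exact GP inference to O(n m^2); the RBF kernel, the fixed hyperparameters, and the toy data are assumptions for the demo, not the thesis's models.

    import numpy as np

    def rbf(a, b, ls=0.5):
        # squared-exponential kernel between 1-D input arrays a and b
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

    rng = np.random.default_rng(1)
    X = np.sort(rng.uniform(-3, 3, 200))            # n = 200 training inputs
    y = np.sin(X) + 0.1 * rng.standard_normal(X.size)
    Z = np.linspace(-3, 3, 15)                      # m = 15 inducing inputs
    noise = 0.1 ** 2

    Kmm = rbf(Z, Z) + 1e-8 * np.eye(Z.size)
    Kmn = rbf(Z, X)
    # Subset-of-regressors predictive mean:
    #   mu(x*) = K*m (noise * Kmm + Kmn Knm)^(-1) Kmn y
    A = noise * Kmm + Kmn @ Kmn.T
    Xs = np.linspace(-3, 3, 5)
    mu = rbf(Xs, Z) @ np.linalg.solve(A, Kmn @ y)
    print(np.round(mu, 2))  # approximates sin at the 5 test inputs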
  • Verwaltung von zeitbezogenen Daten und Sensordatenströmen [Management of time-related data and sensor data streams] (Open Access)
    (2013) Hönle, Nicola Anita Margarete; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
    So-called location-based applications interpret the user's spatial position as the most important piece of context information and adapt their behavior to it. Within the Nexus project (SFB 627), concepts for supporting location-based applications are investigated, and the results are integrated into the so-called Nexus platform. The user's context is, however, also influenced by time, since time is an essential part of our lives and virtually every piece of information has a temporal reference. Integrating time extends the Nexus platform from location-based support towards a more general context-aware system. Since unrestricted support for time is too large a topic in the general case, a use-case analysis identified requirements of particular relevance to the Nexus project. These requirements and their realization are described in this thesis. The storage of time periods and time instants is based on the GML temporal data type, so that time values are represented in the ISO 8601 standard format. With this base data type, temporal attributes can be defined in the Nexus data model. For formulating queries, the new predicate temporalIntersects is introduced, which expresses an arbitrary overlap between a temporal attribute and a given time period. Since the query criteria should not be restricted in advance, the minimal set of temporal base predicates is also described, from which all relations of Allen's interval algebra can be formulated. The valid time indicates at which times a given value correctly models the actual real-world state. For annotating data with valid times, but also with other metadata, a general metadata concept for the Nexus data model is described. Metadata can then be used to specify valid times of objects and attributes, so that histories of arbitrary attributes can be modeled in a simple way. Interpolation functions enable a more precise and compressed representation of frequently changing data with continuous value curves, such as sensor data histories. Therefore, the base data types for floating-point numbers and spatial values are modified so that linear interpolation functions can model the continuous change of values over time. For storage, the implementation of a history server that can process interpolable base data types is described. Sensor measurements usually consist of discrete (value, timestamp) tuples. Since permanently storing sensor data can quickly produce a large volume of data, it makes sense to compress the data beforehand. This thesis presents both stream-based and conventional approaches for compressing sensor data streams: simple approximation methods, approximation by linear least-squares fitting, and polyline simplification methods, as well as a map-based approach specifically for position data. To classify the approaches, various properties of compression algorithms are presented. For the aging of compressed sensor data, the new concept of error-bounded aging is introduced. The algorithms are classified accordingly and evaluated with GPS test data sets from car trips. The successful integration of the temporal aspects is demonstrated with the Messetagebuch (trade fair diary), an example application for recording and analyzing user activities. A further application example is the use of the NexusDS data stream management system for capturing, integrating, and recording histories of data streams of different origins in a so-called Smart Factory.
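    A minimal sketch of the temporalIntersects predicate described above: it tests whether a temporal attribute overlaps a given query interval in any way, with time values given as ISO 8601 strings. The function name mirrors the predicate, but the signature and parsing are illustrative assumptions, not the Nexus platform's actual API.

    from datetime import datetime

    def temporal_intersects(a_start, a_end, q_start, q_end):
        """True if attribute interval [a_start, a_end] overlaps query interval
        [q_start, q_end] in any way (ISO 8601 strings)."""
        a0, a1 = datetime.fromisoformat(a_start), datetime.fromisoformat(a_end)
        q0, q1 = datetime.fromisoformat(q_start), datetime.fromisoformat(q_end)
        # Any of Allen's interval relations except strictly before/after:
        return a0 <= q1 and q0 <= a1

    print(temporal_intersects("2013-01-01T00:00", "2013-06-30T00:00",
                              "2013-06-01T00:00", "2013-12-31T00:00"))  # True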
  • Distributed stream processing in a global sensor grid for scientific simulations (Open Access)
    (2015) Benzing, Andreas; Rothermel, Kurt (Prof. Dr. rer. nat.)
    With today's large number of sensors available all around the globe, an enormous number of measurements has become available for integration into applications. Scientific simulations of environmental phenomena in particular can greatly benefit from detailed information about the physical world. The challenge in integrating sensor data into simulations is to automate the monitoring of geographical regions for interesting data and the provision of continuous data streams from the identified regions. Current simulation setups use hard-coded information about sensors or even manual data transfer via external storage media to bring data from sensors to simulations. This solution is very robust, but adding new sensors to a simulation requires manual setup of the sensor interaction and changes to the simulation's source code, thereby incurring extremely high cost. Manual transmission allows an operator to drop obvious outliers but prohibits real-time operation due to the long delay between measurement and simulation. For more generic applications that operate on sensor data, these problems have been partially solved by approaches that decouple the sensing from the application, thereby allowing the sensing process to be automated. However, these solutions focus on small-scale wireless sensor networks rather than the global scale and therefore optimize for the lifetime of these networks instead of providing high-resolution data streams. In order to provide sensor data for scientific simulations, two tasks are required: i) continuous monitoring of sensors to trigger simulations and ii) high-resolution measurement streams of the simulated area during the simulation. Since a simulation is not aware of the deployed sensors, the sensing interface must work without an explicit specification of individual sensors. Instead, the interface must operate only on the geographical region, sensor type, and resolution used by the simulation. The challenges in these tasks are to efficiently identify relevant sensors from the large number of sources around the globe, to detect when the current measurements are of relevance, and to scale data stream distribution to a potentially large number of simulations. Furthermore, the process must adapt to complex network structures and dynamic network conditions as found in the Internet. The Global Sensor Grid (GSG) presented in this thesis attempts to close this gap by approaching three core problems: First, a distributed aggregation scheme has been developed that allows geographic areas to be monitored for sensor data of interest. The reuse of partial aggregates thereby ensures highly efficient operation and relieves the sensor sources of individually providing numerous clients with measurements. Second, the distribution of data streams at different resolutions is achieved by using a network of brokers that preprocess raw measurements to provide the requested data. The load of high-resolution streams is thereby spread across all brokers in the GSG to achieve scalability. Third, network usage is actively minimized by adapting to the structure of the underlying network. This optimization enables the reduction of redundant data transfers on physical links and a dynamic modification of the data streams to react to changing load situations.
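    The partial-aggregate reuse idea can be sketched as follows: a broker caches per-cell aggregates so that overlapping region subscriptions do not repeatedly pull raw measurements from the sensor sources. The grid cells, the mean aggregate, and the cache layout are assumptions made for this example, not the GSG's actual design.

    from collections import defaultdict

    class Broker:
        def __init__(self):
            self.readings = defaultdict(list)  # cell -> raw measurements
            self.cache = {}                    # cell -> cached partial aggregate

        def ingest(self, cell, value):
            self.readings[cell].append(value)
            self.cache.pop(cell, None)         # invalidate the stale aggregate

        def cell_mean(self, cell):
            if cell not in self.cache:         # compute once, reuse for all clients
                vals = self.readings[cell]
                self.cache[cell] = sum(vals) / len(vals)
            return self.cache[cell]

        def region_mean(self, cells):
            # unweighted mean over cell aggregates, for brevity
            parts = [self.cell_mean(c) for c in cells if self.readings[c]]
            return sum(parts) / len(parts)

    b = Broker()
    for cell, v in [("a", 1.0), ("a", 3.0), ("b", 5.0)]:
        b.ingest(cell, v)
    print(b.region_mean(["a", "b"]))  # 3.5; cell "a"'s aggregate is now cached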
  • Models for data-efficient reinforcement learning on real-world applications (Open Access)
    (2021) Dörr, Andreas; Toussaint, Marc (Prof. Dr.)
    Large-scale deep Reinforcement Learning has contributed strongly to many recently published success stories of Artificial Intelligence. These techniques enabled computer systems to autonomously learn and master challenging problems, such as playing the game of Go or complex strategy games such as StarCraft at or above human level. Naturally, the question arises as to which problems could be addressed with these Reinforcement Learning technologies in industrial applications. So far, machine learning technologies based on (semi-)supervised learning create the most visible impact in industrial applications. For example, image, video, or text understanding are primarily dominated by models trained and derived autonomously from large-scale data sets with modern (deep) machine learning methods. Reinforcement Learning, in contrast, deals with temporal decision-making problems and is much less commonly found in the industrial context. In these problems, current decisions and actions inevitably influence the outcome and success of a process much further down the road. This work strives to address some of the core problems that prevent the effective use of Reinforcement Learning in industrial settings. Autonomous learning of new skills is always guided by existing priors that allow for generalization from previous experience. In some scenarios, non-existing or uninformative prior knowledge can be compensated by vast amounts of experience for a particular task at hand. Typical industrial processes are, however, operated in very restricted, tightly calibrated operating points. Naively exploring the space of possible actions or changes to the process in search of improved performance tends to be costly or even prohibitively dangerous. Therefore, one recurring subject throughout this work is the emergence of priors and model structures that allow for efficient use of all available experience data. A promising direction is Model-Based Reinforcement Learning, which is explored in the first part of this work. This part derives an automatic tuning method for one of the most common industrial control architectures, the PID controller. By leveraging all available data about the system's behavior in learning a system dynamics model, the derived method can efficiently tune these controllers from scratch. Although we can easily incorporate all data into dynamics models, real systems pose additional problems for the dynamics modeling and learning task. Characteristics such as non-Gaussian noise, latent states, feedback control, or non-i.i.d. data regularly prevent the use of off-the-shelf modeling tools. Therefore, the second part of this work is concerned with the derivation of modeling solutions that are particularly suited for the reinforcement learning problem. Despite the predominant focus on model-based reinforcement learning as a promising, data-efficient learning tool, this work's final part revisits model assumptions in a separate branch of reinforcement learning algorithms. Again, generalization, and therefore efficient learning, in model-based methods is primarily driven by the incorporated model assumptions (e.g., smooth dynamics), which real, discontinuous processes might heavily violate. To this end, a model-free reinforcement learning method is presented that carefully reintroduces prior model structure to facilitate efficient learning without the need for strong dynamics model priors. The methods and solutions proposed in this work are grounded in the challenges experienced when operating real-world hardware systems. With applications on a humanoid upper-body robot and an autonomous model race car, the proposed methods are demonstrated to successfully model and master their complex behavior.
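    A minimal sketch of the model-based PID tuning loop described above: candidate gains are rolled out on a surrogate dynamics model, and the gains with the lowest tracking cost are kept. In the thesis the surrogate is learned from system data; the linear toy plant, the gain grid, and the quadratic cost below are stand-in assumptions.

    import itertools

    def rollout_cost(kp, ki, kd, steps=200, dt=0.05, target=1.0):
        # Simulate a PID loop on a damped double-integrator surrogate plant.
        x = v = integ = prev_err = 0.0
        cost = 0.0
        for _ in range(steps):
            err = target - x
            integ += err * dt
            u = kp * err + ki * integ + kd * (err - prev_err) / dt
            prev_err = err
            v += (u - 0.5 * v) * dt
            x += v * dt
            cost += err ** 2        # quadratic tracking cost
        return cost

    grid = [0.5, 1.0, 2.0, 4.0]
    best = min(itertools.product(grid, grid, grid), key=lambda g: rollout_cost(*g))
    print("best (kp, ki, kd):", best)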
  • Supporting multi-tenancy in Relational Database Management Systems for OLTP-style software as a service applications (Open Access)
    (2015) Schiller, Oliver; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
    The consolidation of multiple tenants onto a single relational database management system (RDBMS) instance, commonly referred to as multi-tenancy, has turned out to be beneficial, since it improves the provider's profit margin and allows lowering service fees, whereby the service attracts more tenants. So far, existing solutions create the required multi-tenancy support on top of a traditional RDBMS implementation, i.e., they implement data isolation between tenants, per-tenant customization, and further tenant-centric data management features in application logic. This is complex, error-prone, and often reimplements functionality the RDBMS already offers. Moreover, this approach precludes some optimization opportunities in the RDBMS and represents a conceptual misstep with Separation of Concerns in mind. For these reasons, an RDBMS that provides support for the development and operation of a multi-tenant software as a service (SaaS) offering is compelling. In this thesis, we contribute to a multi-tenant RDBMS for OLTP-style SaaS applications by extending a traditional disk-oriented RDBMS architecture with multi-tenancy support. For this purpose, we primarily extend an RDBMS by introducing tenants as first-class database objects and establishing tenant contexts to isolate tenants logically. Using these extensions, we address tenant-aware schema management, for which we present a schema inheritance concept that is tailored to the needs of multi-tenant SaaS applications. Thereafter, we evaluate different storage concepts for a tenant's tuples with respect to their scalability. Next, we contribute an architecture of a multi-tenant RDBMS cluster for OLTP-style SaaS applications. Here, we focus on a partitioning solution that is aligned to tenants and yields independently manageable pieces. To balance load in the proposed cluster architecture, we present a live database migration approach whose design favors low migration overhead and provides minimal interruption of service.
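    The tenant-context idea can be sketched as follows: every statement is implicitly scoped to the current tenant, so the application cannot forget the isolation predicate. The thesis realizes this inside the RDBMS, with tenants as first-class database objects; the SQLite schema and application-level wrapper below are stand-ins for illustration only.

    import sqlite3

    class TenantContext:
        def __init__(self, conn, tenant_id):
            self.conn, self.tenant_id = conn, tenant_id

        def insert_order(self, item):
            self.conn.execute("INSERT INTO orders(tenant_id, item) VALUES (?, ?)",
                              (self.tenant_id, item))

        def orders(self):
            cur = self.conn.execute("SELECT item FROM orders WHERE tenant_id = ?",
                                    (self.tenant_id,))  # isolation enforced here
            return [row[0] for row in cur]

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders(tenant_id INTEGER, item TEXT)")
    a, b = TenantContext(conn, 1), TenantContext(conn, 2)
    a.insert_order("widget")
    b.insert_order("gadget")
    print(a.orders())  # ['widget']; tenant 1 never sees tenant 2's rows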
  • Emulation von Rechnernetzen zur Leistungsanalyse von verteilten Anwendungen und Netzprotokollen [Emulation of computer networks for performance analysis of distributed applications and network protocols] (Open Access)
    (2005) Herrscher, Daniel J.; Rothermel, Kurt (Prof. Dr. rer. nat. Dr. h. c.)
    To analyze the performance of distributed applications and network protocols as a function of the properties of the underlying networks, a test environment is needed that can reliably reproduce ("emulate") network properties. Such a test environment is called an emulation system. Due to their architecture, previously existing emulation systems are either suitable only for very small scenarios, or they can only reproduce independent network links, thereby excluding all network technologies with shared media. This thesis first presents and evaluates several architectural variants for realizing an emulation system. For the variant with central control and distributed emulation tools, the functionality of an emulation system and its essential components are then described in detail. The emulation approach developed in this thesis intervenes in the communication stack at the logical level of the data link layer. At this level, the two base effects, frame loss and delay, are reproduced by distributed emulation tools. All other network properties can be reduced to these base effects. To reproduce network technologies with shared media using distributed tools, the concept of a virtual carrier signal is additionally introduced. Here, the properties of a broadcast medium are reproduced by cooperative emulation tools that use broadcasts to signal a carrier. Thus, each tool instance can locally maintain an up-to-date model of the emulated shared medium. On this basis, the behavior of medium access protocols can also be reproduced. The thesis also covers the essential implementation aspects of an emulation system. Extensive measurements show that the developed system is very well suited for reproducing network scenarios, even when the parameters to be reproduced change dynamically. The developed tools are able to realistically reproduce network properties over a wide parameter range. This system provides an ideal test environment for performance measurements of distributed applications and network protocols as a function of network properties.
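    A minimal sketch of the two base effects named above, to which all other emulated network properties are reduced: each frame is either dropped (frame loss) or delivered with a delay. The probabilistic loss model and the fixed-plus-jitter delay are illustrative assumptions, not the actual tool implementation.

    import random

    def emulate_link(frames, loss_prob=0.1, delay_ms=20.0, jitter_ms=5.0, seed=42):
        """Return (frame, delay_ms) pairs for the frames that survive the link."""
        rng = random.Random(seed)
        delivered = []
        for f in frames:
            if rng.random() < loss_prob:    # base effect 1: frame loss
                continue
            delay = delay_ms + rng.uniform(-jitter_ms, jitter_ms)  # effect 2: delay
            delivered.append((f, round(delay, 1)))
        return delivered

    print(emulate_link(range(5)))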
  • Data-integrated methods for performance improvement of massively parallel coupled simulations (Open Access)
    (2022) Totounferoush, Amin; Schulte, Miriam (Prof. Dr.)
    This thesis presents data-integrated methods to improve the computational performance of partitioned multi-physics simulations, particularly on highly parallel systems. Partitioned methods allow using available single-physics solvers and well-validated numerical methods for multi-physics simulations by decomposing the domain into smaller sub-domains. Each sub-domain is solved by a separate solver, and an external library is incorporated to couple the solvers. This significantly reduces the software development cost and enhances flexibility, while it introduces new challenges that must be addressed carefully. These challenges include, but are not limited to, efficient data communication between sub-domains, data mapping between non-matching meshes, inter-solver load balancing, and equation coupling. In the current work, inter-solver communication is improved by introducing a two-level communication initialization scheme to the coupling library preCICE. The new method significantly speeds up the initialization and removes memory bottlenecks of the previous implementation. In addition, a data-driven inter-solver load balancing method is developed to efficiently distribute available computational resources between coupled single-physics solvers. This method employs both regression models and deep neural networks (DNNs) for modeling the performance of the solvers and derives and solves an optimization problem to distribute the available CPU and GPU cores among the solvers. To accelerate the equation coupling between strongly coupled solvers, a hybrid framework is developed that integrates DNNs and classical solvers. The DNN computes a solution estimate for each time step, which the classical solvers use as a first guess to compute the final solution. To preserve the DNN's effectiveness during the simulation, a dynamic re-training strategy is introduced that updates the DNN's weights on the fly. The cheap but accurate solution estimate by the DNN surrogate solver significantly reduces the number of subsequent classical iterations necessary for solution convergence. Finally, a highly scalable simulation environment is introduced for fluid-structure interaction problems. The environment consists of highly parallel numerical solvers and an efficient and scalable coupling library. This framework is able to efficiently exploit both CPU-only and hybrid CPU-GPU machines. Numerical performance investigations using a complex test case demonstrate very high parallel efficiency on a large number of CPUs and a significant speed-up due to GPU acceleration.
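    The inter-solver load balancing can be sketched as a small optimization problem: given performance models t_i(cores) for the coupled solvers, choose the core split that minimizes max(t_1, t_2), since a coupled time step waits for the slower solver. The perfectly scalable 1/cores models below are stand-in assumptions; the thesis fits regressions and DNNs to measured data instead.

    def step_time(c1, c2, w1=120.0, w2=80.0):
        # stand-in performance models: work w_i spread perfectly over c_i cores
        return max(w1 / c1, w2 / c2)

    def best_split(total_cores):
        splits = ((c, total_cores - c) for c in range(1, total_cores))
        return min(splits, key=lambda s: step_time(*s))

    print(best_split(32))  # (19, 13): the more expensive solver gets more cores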