Universität Stuttgart

Permanent URI for this community: https://elib.uni-stuttgart.de/handle/11682/1

Search Results

Now showing 1 - 10 of 296
  • Item (Open Access)
    Generic templates for monitoring agents
    (2018) Weise, Marc
    This thesis presents an agent-centric approach for monitoring IT resources. It enables preprocessing and aggregation steps to run directly on the target systems, which limits data transfers to a central server and allows events to be detected and handled locally. To keep the definition of agent behavior as simple as possible, an extendable template model is introduced that defines Monitoring Pipelines by chaining individual processing steps. Furthermore, this work demonstrates how to implement a graphical editor that also allows non-experts in the field of monitoring to create and modify Monitoring Templates.
  • Item (Open Access)
    Personenbezogene Daten im Data Lake [Personal data in the data lake]
    (2018) Ebinger, Felix
    Big data analyses offer competitive advantages, enable innovation, and can contribute to higher quality of products or services. In particular, analyzing customer data and customer behavior opens up many ways to present customers with tailored offers and thereby contribute to higher revenue and greater customer satisfaction. The data required for this calls for suitable storage systems, and the data lake is one such system. In addition to scalable and inexpensive data storage, the evaluation of data by means of exploratory analyses is already built into its design. At the same time, the protection of individual privacy, or more precisely the lack of it, in big data processing is at the center of public attention and criticism. In particular, there are warnings about the resulting "transparent human" and the societal consequences that follow. The resulting questions of to what extent and in what manner personal data may be processed require, besides an ethical and moral answer, above all a legal one. The European General Data Protection Regulation provides the legal framework within which personal data may be processed. This thesis compares the legal requirements with the data lake concept and shows where challenges arise in the design and implementation of a data lake (e.g. transparency, purpose limitation, right to erasure). In addition, approaches to solving these challenges are developed and presented. From the individual approaches, two solution concepts are developed for some of the identified challenges. One of these concepts, a metadata model, is implemented as a prototype and tested by way of example using use cases.
  • Item (Open Access)
    Scheduling with uncertainty for Time-Sensitive Networking using robust optimization techniques and integer linear programming
    (2024) Bauer, Florian
    Application services depend on the network to guarantee reliability, which is critical for safety and correct operation. Time-Sensitive Networking is a technology for reliable real-time communication of time-sensitive applications. While many schedulers exist that provide reliability for wired Time-Sensitive Networks (TSN) with the assumption of deterministic packet delays, scheduling for wireless TSN with uncertain packet delays has received significantly less attention. This work leverages the methodology of Robust Optimization (RO) to propose a robust scheduling approach that ensures provable reliability for both wired and wireless TSN. An uncertainty set defines the range of possible values, ensuring that the schedule remains feasible under all possible realizations within this set. As uncertainty sets are a key component in RO, we introduce methods to compute boxed and polytope uncertainty sets containing possible packet delays based on a set of given reliability requirements. A scheduler is deemed robust if it satisfies the given reliability constraints for all possible packet delays within the computed uncertainty set. Although robustness can be achieved through strict isolation and conservative filtering of packets, we demonstrate that several limitations prevent known robust schedulers from fully exploiting arbitrary uncertainty set shapes. As certain problem instances are unsolvable using simple boxed uncertainty sets, we indicate the need for schedulers that can utilize complex shapes of uncertainty sets rather than boxes. In response to this challenge, we introduce Uncertain No-Wait Packet Scheduling (UNWPS), a scheduler capable of computing robust schedules, and prove that UNWPS is robust against arbitrary upper-bounded boxed and polytope uncertainty sets. 
    We assess the influence of uncertainty sets on the quality of the resulting UNWPS schedules, compare their performance with that of other robust scheduling approaches across various exemplary TSN networks and message stream configurations, and run simulations in the DetCom simulation framework to validate the robustness of UNWPS empirically.
  • Item (Open Access)
    Development and analysis of a window manager concept for consolidated 3D rendering on an embedded platform
    (2015) Zhao, Han
    With information technology developing rapidly, automotive display systems use a growing number of 2D and 3D graphics to provide vehicle information, driving assistance, and more. Because 3D models need to interact with each other, an implementation must be capable of 3D compositing, which traditional 2D compositing implementations are not. This thesis therefore develops a window manager concept that composites 3D graphics efficiently on an embedded platform. Specifically for automotive platforms, virtualization is used to unify multiple Electronic Control Units (ECUs) into a single ECU platform. On this platform, a server and multiple clients are implemented in dedicated Virtual Machines (VMs). The server handles the rendering tasks requested by the clients. On this basis, a 3D compositing concept is implemented that efficiently handles multiple 3D applications using off-screen rendering. A server-side virtualization is also implemented by replacing certain client-side commands during command forwarding. With this virtualization, multiple applications run simultaneously while accessing only a single 3D GPU. Moreover, this implementation makes possible monolithic rendering operations that affect all applications, e.g. a uniform lighting operation.
  • Item (Open Access)
    Location-history partitioning algorithms for privacy in non-trusted geo-social networks
    (2017) Zhang, Qi
    Due to the rapid development of mobile device technology over the past couple of decades, mobile devices play an increasingly important part in our daily lives. Many mobile services have become integrated into our activities or have even reshaped our lifestyle. Location services are among the most widely used mobile services. Users can share their location to obtain information about nearby places and can share it with friends on social media. New mobile applications are appearing at a remarkable pace, and with them the use of location data becomes a privacy threat. If too much information is shared, the user's movements could be predicted, and highly privacy-sensitive locations, such as the user's home, could be leaked. Many location-based applications, such as geo-social networks (GSNs), use location servers to store user position information. However, since GSN providers may not be fully trustworthy or may not be able to protect user data, users may not want to store all of their privacy-sensitive location information with a single provider. Therefore, this thesis focuses on developing and evaluating methods that partition location data among multiple servers to protect privacy, as similarly attempted in other approaches. We have studied a range of mobility modeling methods that consider the different fundamental dimensions of location data, i.e., spatial, temporal, and semantic, as well as their combinations. Inspired by those methods, we propose partitioning methods to increase privacy protection. Furthermore, several other partitioning methods that combine the spatial, temporal, and semantic dimensions are implemented. Finally, all partitioning methods are evaluated on our data.
  • Item (Open Access)
    Hybrid parallel computing beyond MPI & OpenMP - introducing PGAS & StarSs
    (2011) Sethi, Muhammad Wahaj
    High-performance architectures are becoming increasingly complex. These large-scale, heterogeneous architectures and multi-core systems are difficult to program. New programming models are required to make parallelism easier to express while keeping developer productivity high. Partitioned Global Address Space (PGAS) languages such as UPC emerged to increase developer productivity on distributed-memory systems. UPC provides a simpler, shared-memory-like model with user control over data layout, but it remains the developer's responsibility to ensure data locality by choosing appropriate data layouts. The SMPSs/StarSs programming model aims to simplify parallel programming on multicore architectures. It offers task-level parallelism, where dependencies among tasks are determined at run time. In addition, the runtime takes care of data locality when scheduling tasks. This yields a twofold productivity improvement: first, automatic dependency detection saves the developer from hard-coding dependencies; second, it saves cache optimization time, because the runtime takes care of data locality. The purpose of this thesis is to combine the PGAS programming model (e.g. UPC) across different nodes with the shared-memory, task-based parallelization model (i.e. StarSs) to take advantage of multi-core systems, and to contrast this approach with the legacy MPI and OpenMP combination. Both performance and programmability are considered in the evaluation. The combination UPC + SMPSs results in approximately the same execution time as MPI and OpenMP. The current lack of features such as multi-dimensional data distribution or virtual topologies in UPC makes the hybrid UPC + SMPSs/StarSs programming model less programmable than MPI + OpenMP for the application studied in this thesis.
  • Item (Open Access)
    Large-scale data mining analytics based on MapReduce
    (2014) Ranjan, Sunny
    In this work, we search for possible approaches to large-scale data mining analytics. We explore existing MapReduce and MapReduce-like frameworks for distributed data processing, as well as distributed file systems for distributed data storage. We study the Hadoop Distributed File System (HDFS) and the Hadoop MapReduce software framework in detail. We analyse the benefits of the newer version of the Hadoop software framework, which scales better by separating cluster resource management from the MapReduce framework. This version is called YARN and flexibly supports kinds of distributed data processing other than the batch-mode processing of MapReduce. We also examined various implementations of data mining algorithms based on MapReduce to derive a comprehensive concept for developing such algorithms, and we looked for tools that provide MapReduce-based scalable data mining algorithms. The only tool we found that is specifically based on Hadoop MapReduce was Mahout. However, its developer team decided to stop using Hadoop MapReduce and to use Apache Spark as the underlying execution engine instead. WEKA also has a very small subset of data mining algorithms implemented using MapReduce, but it is not properly maintained or supported by the developer team. Subsequently, we found that Apache Spark, apart from providing an optimised and faster execution engine for distributed processing, also provides an accompanying library of machine learning algorithms, the Machine Learning library (MLlib). Apache Spark claims to be much faster than Hadoop MapReduce because it exploits in-memory computation, which is particularly beneficial for the iterative workloads of data mining. Spark is designed to run on a variety of clusters, YARN being one of them, and to process Hadoop data.
    We chose a particular data mining task: decision tree learning for classification and regression. We stored properly labelled training data for predictive mining tasks in HDFS. We set up a YARN cluster and ran Spark's MLlib applications on it. These applications use the cluster management capabilities of YARN and the distributed execution framework of Spark's core services. We performed several experiments to measure the performance gains, speed-up, and scale-up of the decision tree learning implementations in Spark's MLlib. The results were much better than expected: we achieved a much higher than ideal speed-up when we increased the number of nodes, and the scale-up is also excellent. Increasing the number of nodes significantly decreases the run time for training decision tree models. This demonstrates that Spark's MLlib decision tree learning algorithms for classification and regression analysis are highly scalable.
  • Item (Open Access)
    A deep learning approach for large-scale groundwater heat pump temperature prediction
    (2022) Scheurer, Stefania
    Heating and cooling buildings is one of the most energy-intensive aspects of modern life. To minimize the impact on global warming and decelerate climate change, more efficient technologies with lower carbon emissions, such as open-loop groundwater heat pumps (GWHPs) for heating and cooling buildings, are being adopted quickly. To guarantee their optimal use and prevent negative interactions, city planners need to optimize their placement in the urban landscape. This optimization process requires fast models that simulate the effect of a GWHP on the groundwater temperature. Considering a large domain with multiple GWHPs, this work introduces a framework for groundwater temperature prediction. A learned local surrogate model, a convolutional neural network, first predicts the local temperature field around every single GWHP. A physics-informed neural network (PINN) is then employed to correct the global initial solution obtained by stitching together the local predictions. Because violations of the physical laws described by the underlying partial differential equation(s) are unevenly distributed in space, two different methods for drawing the sampling points on which the PINN is trained are investigated and compared. This work shows that a PINN can correct the global initial solution of stitched-together local predictions in a domain with multiple GWHPs. However, there are still opportunities to improve the quality and decrease the computational time of the presented framework. The best method for drawing sampling points depends on the scenario and the placement of the GWHPs, so no general statement can be made as to which of the two methods is more suitable. This work provides a good basis for further investigation of the presented framework.
  • Item (Open Access)
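    Entwicklung von Algorithmen zur Planung der Wege von fahrerlosen Transportsystemen in einem Logistik-Warehouse [Development of algorithms for path planning of automated guided vehicle systems in a logistics warehouse]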
    (2017) Braunschweiger, Dirk
    In the automotive industry, the demands on logistics warehouses have risen in recent years. The reason is the increasing individualization of vehicles. To meet these demands, goods in modern logistics warehouses are transported by automated guided vehicles. Many algorithms exist for computing the shortest path for a single vehicle. These cannot be used in warehouses with many vehicles, since congestion, deadlocks, or collisions can occur. Algorithms that attempt to solve these problems already exist, but few of them have been tested for practical suitability so far; they are often tested with few vehicles or on small road networks. This thesis presents algorithms for computing paths for multiple vehicles and then analyzes them. The performance of the algorithms is measured using real scenarios from the automotive industry. To this end, road networks based on real warehouses are created first. The performance of selected algorithms is then compared in various benchmarks. Based on the best algorithms, a new algorithm is developed and compared with existing algorithms. The new algorithm requires less computation time and computes shorter paths. Finally, the results are validated using simulation software.
  • Item (Open Access)
    Measurement of the quality of structured and unstructured data accumulating in the product life cycle in a data quality dashboard
    (2017) Chellathurai Saroja, Shalini
    This thesis provides an overview of existing data quality metrics for structured and unstructured data, as well as of existing data quality dashboards for measuring the quality of such data. Open research questions on interpreting data quality are discussed. The metrics percentage of null values, percentage of duplicate values, and percentage of non-domain values were selected and implemented as REST-based web services. Furthermore, a web application was developed to enable (1) upload of the data file whose quality is to be assessed, in one of the two standard formats JSON and CSV, and (2) flexible integration of various data quality metrics. The latter is enabled by an interface. To illustrate the functionality of this interface, the metric percentage of spelling mistakes, provided by the supervisor of the thesis, is integrated into the web application. Data quality is indicated as a percentage in the range from 0 to 100 and is also color-coded, both for the whole dataset and for each column. Donut chart or pie chart visualizations are implemented for the chosen data quality metrics. The implemented web application and metrics were evaluated with example datasets, provided by the supervisor, of data accumulating in the product life cycle. Finally, the dashboard is compared with existing data quality dashboards and the results are tabulated.
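
The three column-level metrics named in the last abstract (percentage of null, duplicate, and non-domain values) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the set of null tokens, the sample data, and the exact metric definitions (for instance, counting a duplicate as any non-null value that repeats an earlier one, and dividing by the total row count) are assumptions made for illustration only.

```python
import csv
import io
from collections import Counter

# Assumption: these spellings count as "null"; the thesis may define it differently.
NULL_TOKENS = {"", "null", "none", "na", "n/a"}


def is_null(value):
    """True for missing cells or cells holding one of the assumed null tokens."""
    return value is None or value.strip().lower() in NULL_TOKENS


def pct(part, whole):
    """Percentage in the range 0..100; an empty column scores 0."""
    return 100.0 * part / whole if whole else 0.0


def pct_null(values):
    return pct(sum(is_null(v) for v in values), len(values))


def pct_duplicate(values):
    """Share of rows whose non-null value repeats an earlier value (assumed definition)."""
    counts = Counter(v for v in values if not is_null(v))
    repeats = sum(c - 1 for c in counts.values())
    return pct(repeats, len(values))


def pct_non_domain(values, domain):
    """Share of rows whose non-null value lies outside the allowed domain."""
    outside = sum(1 for v in values if not is_null(v) and v not in domain)
    return pct(outside, len(values))


def column_report(csv_text, domains=None):
    """Compute the metrics for every column of a CSV document given as text."""
    domains = domains or {}
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    report = {}
    for name in (rows[0].keys() if rows else []):
        values = [row[name] for row in rows]
        metrics = {"null": pct_null(values), "duplicate": pct_duplicate(values)}
        if name in domains:
            metrics["non_domain"] = pct_non_domain(values, domains[name])
        report[name] = metrics
    return report


# Hypothetical sample data, not taken from the thesis.
SAMPLE = "part_id,status\nP1,ok\nP2,\nP2,broken\nP3,ok\n"
report = column_report(SAMPLE, domains={"status": {"ok", "failed"}})
for column, metrics in report.items():
    print(column, metrics)
```

Each metric is a pure function over one column, so a further metric could be plugged in by adding one more function with the same shape, loosely mirroring the flexible metric integration the abstract describes.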