05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Permanent URI for this collectionhttps://elib.uni-stuttgart.de/handle/11682/6

Browse

Search Results

Now showing 1 - 10 of 135
  • Thumbnail Image
    ItemOpen Access
    Distributed Deep Reinforcement Learning for Learn-to-optimize
    (2023) Mayer, Paul
    In the context of increasingly complex applications, e.g., robust performance tuning in Integrated Circuit Design, conventional optimization methods have difficulties in achieving satisfactory results while keeping to a limited time budget. Therefore, learning optimization algorithms becomes more and more interesting, replacing the established way of hand-crafting or tweaking algorithms. Learned algorithms reduce the amount of assumptions and expert knowledge necessary to create state-of-the-art solvers by decreasing the need of hand-crafting heuristics and hyper-parameter tuning. First advancements using Reinforcement Learning have shown great success in outperforming typical zeroth- and first-order optimization algorithms, especially with respect to generalization capabilities. However, training still is very time consuming. Especially challenging is training models on functions with free parameters. Changing these parameters (that could represent, e.g., conditions in a real world example) affects the underlying objective function. Robust solutions therefore depend on thorough sampling, which tends to be the bottleneck considering time consumption. In this thesis we identified the runtime bottleneck of the Reinforcement Learning Algorithm and were able to decrease runtime drastically by distributing data collection. Additionally, we studied the effects of combining sampling strategies in regards to generalization capabilities of the learned algorithm.
  • Thumbnail Image
    ItemOpen Access
    Second-order projection-based mapping methods for coupled multi-physics simulations
    (2022) Ariguib, Boshra
    Data mapping describes the exchange of variables between different, usually non-matching grids for storing data. As different physics require different physical constraints, so do different simulation require different mesh properties. This makes data mapping a crucial part when coupling single-physics simulations into a multi-physics simulation. However, the tradeoff for the available computationally efficient methods is usually low accuracy order. Such a method is the nearest-neighbor mapping method, which relies on a computationally inexpensive mapping algorithm and shows a first-order accuracy, as it is based on a constant interpolation. A second-order projection-based mapping method nearest-neighbor-gradient aims to improve the accuracy order of the nearest-neighbor mapping, while preserving the low computational costs. This is achieved through the extension of the existing method by considering additional gradient data information and applying a Hermite interpolation, in order to balance out both the computational efficiency and the accuracy of the mapping. In this thesis, we implemented this method by extending the coupling library preCICE, which uses state-of-the-art algorithms for coupling partitioned multi-physics simulations in a black-box manner. We confirmed the theoretical observations of the expected second-order accuracy of the method and we found that the method shows the best convergence order by contrast with the existing mapping methods, including radial basis function mappings. It also performs just as well as the existing projection-based method in terms of computational cost and outperforms the radial basis function mapping in respect of runtime costs.
  • Thumbnail Image
    ItemOpen Access
    A data plane interface for resource-constrained microcontrollers in time-sensitive networking
    (2026) Kupka, Bastian
    Time-Sensitive Networking (TSN) enables deterministic Ethernet communication through coordinated transmission control. While there are dedicated TSN switches and network interface cards, they are rarely available on resource-constrained microcontrollers. In particular, transmit-time-based scheduling approaches such as the Earliest TxTime First (ETF) queuing discipline are typically not supported on embedded platforms. This thesis investigates the feasibility of implementing ETF-like transmit-time scheduling on a microcontroller using Zephyr RTOS. The implementation targets the NXP i.MX RT1062, which provides a precise PTP hardware clock and compare interrupt mechanism but lacks dedicated hardware support for traffic shaping. The proposed solution integrates transmit-time scheduling into the Zephyr networking stack by combining hardware-triggered interrupts with software-based buffer management inside the network driver. Scheduled frames are prepared in advance and transmitted using a PTP compare event to trigger the transmission routine close to the target time, followed by a short busy-wait phase to improve precision. Best-effort traffic is handled through a driver-level guard band to avoid interference with scheduled transmissions. Experimental evaluation shows that, for periodic traffic with fixed inter-packet gaps, the implementation achieves a bounded transmission window of approximately 3 µs. For variable inter-packet gaps, a larger timing spread is observed. The results demonstrate that ETF-like transmit-time scheduling can be realized on a low-cost microcontroller for certain traffic patterns by leveraging existing PTP hardware features.
  • Thumbnail Image
    ItemOpen Access
    Dynamic workload balancing for heterogeneous systems
    (2020) Strack, Alexander
    During the last two decades, GPUs developed into powerful and massively parallel processors. That rose the attention of scientist who started using GPUs for large scale scientific computing, e.g. simulations. However, the architecture of GPUs is different from CPUs. Furthermore, graphic processors have their now fast access memory. Computing in a heterogeneous system consisting of a CPU and multiple GPUs has various challenges. In this work, we focus on how to distribute the load among the different components. We consider an iterative load that can be redistributed after each iteration. The goal of our scheduling methods is to minimise the computation time of the next iteration by estimating the performance of each component. After a short introduction to load balancing, we specify the iterative workload scenario and differentiate it from the typical task-based scenario often found in the literature. Then, we show the basics of GPU programming with the help of NVIDIAs CUDA API. Furthermore, we introduce the different kernels we use for our test and derive multiple schedulers. Our dynamic schedulers use the time each component took to compute its assigned workload in the last iteration as a basis of the performance estimation. After investigating the influence of previous run-time data on the scheduling decisions, we turn our attention towards the properties of the workloads and therefore compare different types of memory management.
  • Thumbnail Image
    ItemOpen Access
    Weighted independent colorful sets in large vertex colored conflict graphs for timetriggered flow scheduling
    (2022) Rönsch, Lorenz
    The need for time sensitive communication on networks is increasing more and more, especially due to Industrial internet of things and Industry 4.0. With the appearance of graphics-based network participants in time-critical networks, such as VR glasses, the absolute amount of traffic that needs to be scheduled over the network increases strongly. The most common method to realize real time communication is using the IEEE Time-Sensitive Network (TSN) and the Time-Aware Shaper (TAS). However, the TSN schedule calculation is not standardized. There are several approaches, such as SMT solver, integer linear programming and calculating a conflict graph to calculate time-triggered flow schedules. But none of them are tackling the problem of maximizing traffic. In our work, we extend the time-triggered flow scheduling problem to include the component of maximum traffic. For this purpose, we modify an existing heuristic, called Greedy Flow Heap Heuristic, so that we can adapt the scheduling to our problem. The results that our version provides compared to the original heuristic are very promising. On all our evaluation data, we achieved an average improvement of 81.91% in terms of maximum network traffic. We also developed an alternative non-deterministic approach based on a genetic algorithm. In our work we investigate different variants of the algorithm with the goal to provide better results with different adaptations of the algorithm. In our repair version, we manage to beat our benchmark algorithm the Greedy Flow Heap Heuristic on every circle based conflict graph.
  • Thumbnail Image
    ItemOpen Access
    Investigation of self-learned zeroth-order optimization algorithms
    (2022) Schüttler, Kilian
    Designing optimization algorithms manually is a laborious process. In Addition, many optimization algorithms rely on hand-crafted heuristics and perform poorly in applications for which they are not specifically designed. Thus, automating the algorithm design process is very appealing. Moreover, learned algorithms minimize the amount of a priori assumptions and do not rely on hyperparameters after training. Several works exist that present methods to learn an optimization algorithm. In this project, we focus on the reinforcement learning perspective. Therefore, any particular optimization algorithm is represented as a policy. Evaluation of the existing methods shows, learned algorithms outperform existing algorithms in terms of convergence speed and final objective value on particular training tasks. However, the inner mechanisms of learned algorithms largely remain a mystery. A first work has discovered that learned first-order algorithms show a set of intuitive mechanisms that are tuned to the training task. We aim to explore the inner workings of learned zeroth-order algorithms and compare our discoveries to previous works. To address this issue, we study properties of learned zeroth-order algorithms to understand the relationship between what is learned and the quantitative and qualitative properties, e.g., curvature or convexity of the objective function. Furthermore, we study the generalization in relation to these properties. Moreover, we explore the feasibility of finetuning a learned zeroth-order optimization algorithm to a related objective function. Finally we provide guidelines for training and application of learned zeroth-order optimization algorithms.
  • Thumbnail Image
    ItemOpen Access
    Implementing a Cholesky decomposition using SYCL
    (2025) Bloch, Michal
    PLSSVM, an LS-SVM implementation, now only uses the Conjugate Gradient algorithm for solving a set of linear equations. However, for an ill-conditioned matrix, it especially gets into trouble, as the converged solution drifts away from the actual solution due to rounding errors. Therefore, this thesis implements a different solver, e.g., the Cholesky Decomposition, which will be implemented in SYCL. We will implement multiple variations of the Cholesky Decomposition algorithm, including a blocked version, and utilize many different features of SYCL. The focus will primarily be on the fastest implementations. In the end, the fastest implementation will be integrated into PLSSVM alongside a Forward and Backward Substitution implementation for solving the set of linear equations. We will conclude with a runtime comparison between the implementations, a comparison of our best Cholesky Decomposition with the Conjugate Gradient using a dataset and a small discussion about numerical errors.
  • Thumbnail Image
    ItemOpen Access
    Polynomial Chaos Expansion mit räumlich adaptiven Sparse Grids
    (2020) Albrecht, Thomas
    Die Polynomial Chaos Expansion (generalized Polynomial Chaos) ist eine Methode aus der Uncertainty Quantification. Mit ihr können die stochastischen Momente einer Funktion R, deren Parameter gemäß Verteilungsfunktionen verteilt sind, schnell berechnet werden. Außerdem liefert die Methode ein akkurates Surrogat das effizient ausgewertet werden kann. Dazu werden abhängig von den Verteilungsfunktionen in jeder Dimension bestimmte orthogonale Polynombasen verwendet. Die verwendeten Polynome sind jeweils genau bezüglich der Dichtefunktion der jeweiligen Verteilung orthogonal. Die Koeffizienten der Expansion werden mithilfe eines Spectral Projection Ansatzes berechnet. Bei diesem müssen hochdimensionale Integrale gelöst werden. In dieser Arbeit wird die Polynomial Chaos Expansion implementiert. Da traditionelle Integrationsmethoden unter dem sogenannten Fluch der Dimensionalität leiden, werden die Integrale mithilfe von räumlich adaptiven Sparse Grids (dünnen Gittern) gelöst. Die Ergebnisse der Polynomial Chaos Expansion mit räumlich adaptiven Sparse Grids werden mit denen einer Polynomial Chaos Expansion mit regulären Sparse Grids verglichen. Außerdem werden die Ergebnisse noch mit einer weiteren Methode aus der Uncertainty Quantification, der Stochastic Collocation, verglichen.
  • Thumbnail Image
    ItemOpen Access
    Parameter-dependent self-learning optimization
    (2022) Abu El Komboz, Tareq
    Manually developing optimization algorithms is a time-consuming task requiring expert knowledge. Therefore, it makes a lot of sense to automate the design process of such algorithms. Additionally, learned optimization algorithms reduce the number of a priori assumptions made about the characteristics of the underlying objective function. Numerous works discuss possibilities for learning optimization algorithms. This field of study is called learn-to-optimize. In this bachelor’s thesis, we concentrate on the reinforcement learning perspective. Consequently, optimization algorithms are represented as policies. The comparison of learned algorithms to current state-of-the-art algorithms for particular applications reveals that learned algorithms manage to perform better concerning convergence speed and final objective function value. However, most existing approaches only consider fixed sets of parameters to be optimized. Because of this, it is challenging to adapt the learned optimization algorithm to other objective functions. More importantly, it is impossible to optimize when explicit constraints on so-called “free” optimization parameters are given. We investigated the learn-to-optimize approach under various optimization parameter sets and conditions on “free” parameters to solve this problem. Furthermore, we studied the performance of learned optimizers in high-dimensional setups.
  • Thumbnail Image
    ItemOpen Access
    Advanced object localization pipeline for robot manipulation
    (2020) Schäfer, Tim
    The thesis objective is to develop a robust tracking pipeline for a Baxter robot system. The tracking pipeline includes the applications of robot tracking and object tracking, which localizes the surrounding objects for robot manipulation. By combining, extending, and comparing state-of-theart approaches for object tracking and image segmentation, this thesis estimates the initial poses of the objects such that their pose can be used to auto-initialize object tracking algorithms. The pipeline is able to handle unfamiliar and dynamic environments. The system was implemented in simulation and can be applied to a real robotic system. The resulting pipeline uses an RGB image, a depth image, and the 3D object models as input. The outputs are the tracked object poses in real time. A dataset was generated to train the instance segmentation network. Furthermore, the pipeline was evaluated with several conditions to test the robustness of the tracking. A comparison to other 6D pose estimation approaches is provided in the results. The code of the pipeline is available on GitHub: https://github.com/timschaeferde/rai_baxter