Universität Stuttgart

Permanent URI for this community: https://elib.uni-stuttgart.de/handle/11682/1

Search Results

Now showing 1 - 10 of 28
  • Item (Open Access)
    iWindow - Intelligentes Maschinenfenster
    (Düsseldorf : VDI Verlag, 2018) Sommer, Philipp; Verl, Alexander; Kiefer, Manuel; Rahäuser, Raphael; Müller, Sebastian; Brühl, Jens; Gras, Michael; Berckmann, Eva; Stautner, Marc; Schäfer, D.; Schotte, Wolfgang; Do-Khac, Dennis; Neyrinck, Adrian; Eger, Ulrich
    The joint research project iWindow: Intelligent Machine Window addressed visual support for operators of machine tools. Until now, operators have had few or no systems that support them in their daily tasks directly at the machine tool. The project connects the real and virtual worlds inside the machine tool through technologies such as virtual and augmented reality, the digital twin, simulation, and value-added services. By using services matched to the current work situation, employees are enabled to adapt to increasing product individualization and more flexible production. Customers and business partners are integrated into the value-creation process through the opportunity to develop their own value-adding services and make them available to other users. This publication presents the results of the research project with regard to the technologies and developments required for an intelligent machine window.
  • Item (Open Access)
    Tensorgesteuerte Entwicklung biokompatibler Strukturen
    (2021) Däges, Johannes-Maximilian
    This thesis pursues a topology-optimization approach to create cube-shaped structures with a prescribed elastic behavior. The homogenization method of G. P. Steven is applied to derive constraints for a topology optimization from a given stiffness tensor. In addition, Mandel notation was incorporated into the homogenization. The long-term goal is to assemble femoral nails from many individual structures and thereby reduce the stress shielding that occurs in the femur. The results of various configurations are quite promising and support further investigation of the approach.
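    A quick illustration of the Mandel notation mentioned above (a standard definition, not taken from the thesis itself): the symmetric stress and strain tensors are written as 6-vectors whose shear components carry a factor of √2, so that the 6×6 stiffness matrix preserves the tensor inner product.

    ```latex
    % Mandel (normalized Voigt) form of Hooke's law, sigma = C : epsilon.
    % The sqrt(2) scaling makes the 6x6 matrix representation orthonormal.
    \hat{\boldsymbol{\sigma}} =
    \begin{pmatrix}
      \sigma_{11} \\ \sigma_{22} \\ \sigma_{33} \\
      \sqrt{2}\,\sigma_{23} \\ \sqrt{2}\,\sigma_{13} \\ \sqrt{2}\,\sigma_{12}
    \end{pmatrix},
    \qquad
    \hat{\boldsymbol{\varepsilon}} =
    \begin{pmatrix}
      \varepsilon_{11} \\ \varepsilon_{22} \\ \varepsilon_{33} \\
      \sqrt{2}\,\varepsilon_{23} \\ \sqrt{2}\,\varepsilon_{13} \\ \sqrt{2}\,\varepsilon_{12}
    \end{pmatrix},
    \qquad
    \hat{\boldsymbol{\sigma}} = \hat{\mathbf{C}}\,\hat{\boldsymbol{\varepsilon}},
    \quad \hat{\mathbf{C}} \in \mathbb{R}^{6\times 6}.
    ```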
  • Item (Open Access)
    Energieeffizienz von Prozessoren in High Performance Computinganwendungen der Ingenieurwissenschaften
    (Stuttgart : Höchstleistungsrechenzentrum, Universität Stuttgart, 2018) Khabi, Dmitry; Resch, Michael M. (Prof. Dr.-Ing. Dr. h.c. Dr. h.c. Prof. E.h.)
    This thesis centers on the question of energy efficiency in high-performance computing (HPC), with a focus on the relationship between the electrical power drawn by processors and their computational performance. Chapter 1, the introduction to the treatise that follows, explains the motivation and the state of the art in power measurement and in the energy efficiency of HPC systems and their components. Chapters 2 and 3 discuss in detail a measurement technique developed at the High-Performance Computing Center Stuttgart (HLRS), which is used for the current measurements in the test cluster. The measurement procedure for the various hardware components and the dependency between their power supply, measurement accuracy, and sampling frequency are presented. In Chapter 4 I describe how a processor's power consumption relates to its configuration and to the algorithms executed on it. The focus lies on the relationships between CPU frequency, degree of parallelization, computational performance, and electrical power. To compare the efficiency of processors and algorithms, I use a method based on an analytical approximation of the processors' computational and electrical power. The chapter also shows that the approximation coefficients, which give several hints about software and hardware properties, can serve as the basis for an extended model. As shown later, existing models of computational and electrical power only partially account for the different frequency domains of the hardware components. Chapter 5 presents an extension of the existing performance model that can partially explain the corresponding new properties of the CPU architecture. The insights gained are intended to help develop a model that describes both computational and electrical power. In Chapter 6 I describe the problem of the energy efficiency of a high-performance computer; among other things, the methods developed in this thesis are evaluated on an HPC platform.
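    The analytical approximation is not spelled out in this abstract; a common functional form for such fits (an assumption for illustration, not the thesis's actual model) expresses electrical power P and compute rate R as low-order polynomials in clock frequency f and active core count n, from which an efficiency metric follows:

    ```latex
    % Illustrative fit only: the coefficients p_i, r_i would be obtained by
    % regression against measured power and performance samples.
    P(f, n) \approx p_0 + p_1\, n f + p_2\, n f^{3}, \qquad
    R(f, n) \approx r_0 + r_1\, n f, \qquad
    E(f, n) = \frac{R(f, n)}{P(f, n)}.
    ```

    The cubic term mirrors the usual dynamic-power scaling when the supply voltage rises with frequency; fitted coefficients of this kind are what can hint at hardware and software properties.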
  • Item (Open Access)
    Über die Lösung der Navier-Stokes-Gleichungen mit Hilfe der Moore-Penrose-Inversen des Laplace-Operators im Vektorraum der Polynomkoeffizienten
    (2024) Große-Wöhrmann, Bärbel; Resch, Michael (Prof. Dr.-Ing.)
    Standard numerical methods for solving partial differential equations are based on a spatial discretization of the computational domain. Their performance and scalability on modern massively parallel high-performance computers depend on the availability of efficient numerical methods for solving linear systems of equations. In view of fundamental challenges, developing new solution approaches appears worthwhile. In this thesis I present a polynomial approach for solving partial differential equations that does not rely on a spatial discretization and that uses the Moore-Penrose inverse of the Laplace operator to decouple the Navier-Stokes equations. The degree of the polynomials is not fundamentally bounded, so a high spatial resolution can be achieved.
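    To make the coefficient-space idea concrete, here is a minimal one-dimensional sketch (my own illustration with a hypothetical degree-6 ansatz; the thesis treats the full Navier-Stokes setting): the second derivative is a linear map on monomial coefficients, so a particular solution of u'' = f is obtained from the Moore-Penrose pseudoinverse of that map.

    ```python
    # 1-D sketch: represent d^2/dx^2 on monomial coefficients and invert it
    # with the Moore-Penrose pseudoinverse (illustrative, not thesis code).
    import numpy as np

    deg = 6  # polynomial degree of the ansatz (hypothetical choice)

    # D2[j, k] maps the coefficient of x^k to the coefficient of x^(k-2) in u''.
    D2 = np.zeros((deg - 1, deg + 1))
    for k in range(2, deg + 1):
        D2[k - 2, k] = k * (k - 1)

    D2_pinv = np.linalg.pinv(D2)  # Moore-Penrose inverse of the operator

    f = np.zeros(deg - 1)
    f[0] = 2.0                    # right-hand side f(x) = 2
    u = D2_pinv @ f               # minimum-norm solution: u(x) = x^2
    print(np.round(u, 10))        # -> [0. 0. 1. 0. 0. 0. 0.]
    ```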
  • Item (Open Access)
    A light weighted semi-automatically I/O-tuning solution for engineering applications
    (Stuttgart : Höchstleistungsrechenzentrum, Universität Stuttgart, 2017) Wang, Xuan; Resch, Michael M. (Prof. Dr.-Ing. Dr. h.c. Dr. h.c. Prof. E.h.)
    Today’s engineering applications running on high performance computing (HPC) platforms generate ever more diverse data simultaneously and require large storage systems as well as extremely high data transfer rates to store their data. To achieve a high data transfer rate (I/O performance), computer scientists together with HPC manufacturers have developed many innovative solutions. However, transferring the knowledge behind these solutions to engineers and scientists has become one of the largest barriers. Since engineers and scientists are experts in their own professional areas, they may not be able to tune their applications to the optimal level; sometimes they may even degrade I/O performance by mistake. The basic training courses provided by computing centers like HLRS do not seem sufficient to transfer the required know-how. To overcome this barrier, I have developed a semi-automatic I/O-tuning solution (SAIO) for engineering applications. SAIO, a lightweight and intelligent framework, is designed to be compatible with as many engineering applications as possible, scalable to large engineering applications, usable by engineers and scientists with little knowledge of parallel I/O, and portable across multiple HPC platforms. Building on the MPI-IO library allows SAIO to be compatible with MPI-IO based high-level I/O libraries, such as parallel HDF5 and parallel NetCDF, as well as with proprietary and open-source software such as Ansys Fluent and the WRF Model. In addition, SAIO follows the current MPI standard, which makes it portable across many HPC platforms and scalable. SAIO is implemented as a dynamic library and loaded dynamically, so it requires neither recompiling nor changing an application's source code. By simply adding several export directives to their job submission scripts, engineers and scientists can run their jobs more efficiently. Furthermore, an automated SAIO training utility keeps the optimal configurations up to date without any manual effort from the user.
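    The "export directives" themselves are not listed in this abstract; the following sketch only illustrates the general preload pattern, with hypothetical paths and variable names:

    ```python
    # Hypothetical launcher sketch: preload a tuning library and pass it a
    # configuration via the environment, then start the MPI job unchanged.
    import os
    import subprocess

    env = os.environ.copy()
    env["LD_PRELOAD"] = "/opt/saio/libsaio.so"   # assumed install path
    env["SAIO_CONFIG"] = "/opt/saio/tuned.conf"  # hypothetical variable name

    # The application binary itself is neither recompiled nor modified.
    subprocess.run(["mpirun", "-np", "64", "./solver"], env=env, check=True)
    ```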
  • Item (Open Access)
    Optimizing I/O performance with machine learning supported auto-tuning
    (Stuttgart : Höchstleistungsrechenzentrum, Universität Stuttgart, 2023) Bağbaba, Ayşe; Resch, Michael M. (Prof. Dr.-Ing. Dr. h.c. Dr. h.c. Prof. E.h.)
    Data access is a considerable challenge because of the scalability limitations of I/O. In addition, some applications spend most of their total execution time in I/O, which causes a massive slowdown and wastes useful computing resources. Unfortunately, there is no one-size-fits-all solution to these I/O problems, so I/O becomes a limiting factor for such applications. Parallel I/O is an essential technique for scientific applications running on high-performance computing systems. Typically, parallel I/O stacks offer many parameters that need to be tuned to achieve the best possible I/O performance. Unfortunately, there is no default best configuration of these parameters; in practice, the best settings differ not only between systems but often also from one application use case to another. However, scientific users may not have the time or the experience to explore the parameter space sensibly and choose a proper configuration for each application use case. I present a line of solutions to this problem, centered on a machine-learning-supported auto-tuning system that uses performance modelling to optimize I/O performance. I demonstrate the value of these solutions across applications and at scale.
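    As a rough sketch of what machine-learning-supported auto-tuning with a performance model can look like (my illustration with made-up parameter names and sample values, not the author's pipeline): train a regressor on a few measured runs, then query it instead of executing every configuration.

    ```python
    # Illustrative surrogate-model tuner: the regressor stands in for
    # exhaustive benchmarking of the I/O parameter space.
    import itertools
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Measured samples (fabricated for illustration):
    # (stripe_count, stripe_size_MiB, cb_nodes) -> bandwidth in MiB/s
    X_seen = np.array([[4, 1, 2], [8, 4, 4], [16, 8, 8], [32, 16, 4]])
    y_seen = np.array([310.0, 540.0, 880.0, 760.0])

    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X_seen, y_seen)

    # Predict bandwidth over the whole candidate space, pick the best config.
    candidates = np.array(list(itertools.product(
        [4, 8, 16, 32], [1, 4, 8, 16], [2, 4, 8])))
    best = candidates[np.argmax(model.predict(candidates))]
    print("predicted best (stripe_count, stripe_size, cb_nodes):", best)
    ```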
  • Item (Open Access)
    Model-centric task debugging at scale
    (Stuttgart : Höchstleistungsrechenzentrum, Universität Stuttgart, 2017) Nachtmann, Mathias; Resch, Michael (Prof. Dr.-Ing. Dr. h.c. Dr. h.c. Prof. E.h.)
    Chapter 1, Introduction, presents state-of-the-art debugging techniques in high-performance computing. The lack of programming-model information from which these traditional debugging tools suffer motivated the model-centric debugging approach. Chapter 2, Technical Background: Parallel Programming Models & Tools, describes the programming models used in the scope of my work. The differences between those models are illustrated, and examples are included for the most popular programming models in HPC. The chapter also describes Temanejo, the toolchain's front-end, which supports application developers during their work. Chapter 4, Design: Events & Requests in Ayudame, states the theory of "task" and "dependency" representation. The chapter includes the design of the different information types that are later used for the communication between a programming model and the model-centric debugging approach. Chapter 5, Design: Communication Back-end Ayudame, describes the design of the back-end tool infrastructure in detail, including the problems that occurred during the design process and their specific solutions. The concept of a multi-process environment and the simultaneous use of different programming models are also part of this chapter. Chapter 6, Instrumentation of Runtime Systems, briefly describes the information exchange between a programming model and the model-centric debugging approach; the different ways of monitoring and controlling an application through its programming model are illustrated. In Chapter 7, Case Study: Performance Debugging, the model-centric debugging approach is used to optimise an application. All necessary optimisation steps are described in detail with the help of mock-ups, and a description of the different optimised versions is included. The evaluation, done on different hardware architectures, is presented and discussed; this includes not only the behaviour of the versions on different platforms but also architecture-specific issues.
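    The event-based design is only named in this outline; as a generic illustration of how a debugger can mirror a runtime's task graph from such events (my own sketch, not Ayudame's actual interface):

    ```python
    # Generic task/dependency event stream consumed by a model-centric
    # debugger (names are illustrative, not Ayudame's API).
    from dataclasses import dataclass

    @dataclass
    class TaskEvent:
        kind: str          # "added", "running", "finished", ...
        task_id: int
        deps: tuple = ()   # ids of tasks this task depends on

    class TaskGraph:
        def __init__(self):
            self.state, self.deps = {}, {}

        def consume(self, ev: TaskEvent):
            # Mirror the runtime's task graph from the event stream.
            if ev.kind == "added":
                self.deps[ev.task_id] = set(ev.deps)
            self.state[ev.task_id] = ev.kind

    graph = TaskGraph()
    for ev in [TaskEvent("added", 1), TaskEvent("added", 2, (1,)),
               TaskEvent("running", 1), TaskEvent("finished", 1)]:
        graph.consume(ev)
    print(graph.state)  # {1: 'finished', 2: 'added'}
    ```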
  • Item (Open Access)
    A unified research data infrastructure for catalysis research : challenges and concepts
    (2021) Wulf, Christoph; Beller, Matthias; Boenisch, Thomas; Deutschmann, Olaf; Hanf, Schirin; Kockmann, Norbert; Kraehnert, Ralph; Oezaslan, Mehtap; Palkovits, Stefan; Schimmler, Sonja; Schunk, Stephan A.; Wagemann, Kurt; Linke, David
    Modern research methods produce large amounts of scientifically valuable data. Tools to process and analyze such data have advanced rapidly. Yet access to large amounts of high-quality data remains limited in many fields, including catalysis research. Implementing the concept of FAIR data (Findable, Accessible, Interoperable, Reusable) in the catalysis community would improve this situation dramatically. The German NFDI initiative (National Research Data Infrastructure) aims to create a unique research data infrastructure covering all scientific disciplines. One of the consortia, NFDI4Cat, proposes a concept that serves all aspects and fields of catalysis research. We present a perspective on the challenging path ahead. Starting from the current state, research needs are identified. A vision for integrating all research data along the catalysis value chain, from molecule to chemical process, is developed. The respective core development topics are discussed, including ontologies, metadata, the required infrastructure, intellectual property, and the embedding into the research community. This concept paper aims not only to inspire researchers in the catalysis field, but also to spark similar efforts in other disciplines and at an international level.
  • Item (Open Access)
    Performance comparison of CFD microbenchmarks on diverse HPC architectures
    (2024) Galeazzo, Flavio C. C.; Garcia-Gasulla, Marta; Boella, Elisabetta; Pocurull, Josep; Lesnik, Sergey; Rusche, Henrik; Bnà, Simone; Cerminara, Matteo; Brogi, Federico; Marchetti, Filippo; Gregori, Daniele; Weiß, R. Gregor; Ruopp, Andreas
    OpenFOAM is CFD software widely used in both industry and academia. The exaFOAM project aims at enhancing the HPC scalability of OpenFOAM while identifying its current bottlenecks and proposing ways to overcome them. For assessing software components and profiling the code during development, lightweight but meaningful benchmarks are needed; the answer was to develop microbenchmarks with a small memory footprint and short runtime. The name microbenchmark does not mean that they are the smallest possible test cases: they were developed to fit in a single compute node, which usually has dozens of compute cores. The microbenchmarks cover a broad range of applications: incompressible and compressible flow, combustion, viscoelastic flow, and adjoint optimization. All benchmarks are part of the OpenFOAM HPC Technical Committee repository and are fully accessible. The performance on HPC systems with Intel and AMD processors (x86_64 architecture) and Arm processors (aarch64 architecture) has been benchmarked. For the workloads in this study, the mean performance with the AMD CPU is 62% higher than with Arm and 42% higher than with Intel. The AMD processor thus seems particularly well suited, resulting in an overall shorter time-to-solution.
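    The abstract does not say how the mean performance ratios were computed; a common convention (an assumption here, with fabricated runtimes) is the geometric mean of per-benchmark speedups:

    ```python
    # Hypothetical illustration of reducing per-benchmark runtimes to one
    # mean speedup figure; the numbers below are made up.
    import math

    runtimes = {"amd": [10.0, 42.0, 7.0],    # seconds per microbenchmark
                "arm": [17.0, 65.0, 11.5]}

    ratios = [arm / amd for amd, arm in zip(runtimes["amd"], runtimes["arm"])]
    geo_mean = math.prod(ratios) ** (1 / len(ratios))
    print(f"AMD is {100 * (geo_mean - 1):.0f}% faster than Arm on average")
    ```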
  • Item (Open Access)
    Improving the MPI-IO performance of applications with genetic algorithm based auto-tuning
    (2021) Bagbaba, Ayse; Wang, Xuan
    Parallel I/O is an essential part of scientific applications running on high-performance computing systems. Understanding an application’s parallel I/O behavior and identifying the sources of performance bottlenecks require a multi-layer view of the I/O. The layers of a typical parallel I/O stack offer many tunable parameters that affect the achievable I/O performance. However, scientific users often have neither the time nor the experience to investigate the proper combination of these parameters for each application use case. Auto-tuning can help users by automatically tuning I/O parameters at various layers transparently. A naive auto-tuning strategy, running the application with all possible combinations of tunable parameters for all layers of the I/O stack to find the best settings, amounts to an exhaustive search through a huge parameter space and is infeasible because of the long execution times of the trial runs. In this paper, we propose a genetic algorithm-based parallel I/O auto-tuning approach that hides the complexity of the I/O stack from users and auto-tunes a set of parameter values for an application on a given system to improve I/O performance. In particular, our approach tests a set of parameters and then modifies the combination of these parameters for further testing based on the measured I/O performance. We have validated our model using two I/O benchmarks, namely IOR and MPI-Tile-IO. We achieved an increase in I/O bandwidth of up to 7.74× over the default parameters for IOR and 5.59× over the default parameters for MPI-Tile-IO.
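    The following sketch shows the genetic-algorithm loop in miniature (illustrative only: the parameter names are typical Lustre/MPI-IO knobs, and the fitness function is a stand-in for an actual measured run):

    ```python
    # Toy GA over (stripe_count, stripe_size_MiB, cb_nodes); in the real
    # system, fitness would run the application or a benchmark such as IOR.
    import random

    POOL = {"stripe_count": [4, 8, 16, 32],
            "stripe_size": [1, 4, 8, 16],
            "cb_nodes": [2, 4, 8]}

    def fitness(cfg):
        # Placeholder objective; replace with measured bandwidth (MiB/s).
        return -(cfg["stripe_count"] - 16) ** 2 - (cfg["stripe_size"] - 8) ** 2

    def random_cfg():
        return {k: random.choice(v) for k, v in POOL.items()}

    def crossover(a, b):
        return {k: random.choice((a[k], b[k])) for k in POOL}

    def mutate(cfg, rate=0.2):
        return {k: random.choice(POOL[k]) if random.random() < rate else v
                for k, v in cfg.items()}

    population = [random_cfg() for _ in range(12)]
    for _ in range(10):
        population.sort(key=fitness, reverse=True)
        parents = population[:4]                    # selection (elitism)
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(len(population) - len(parents))]
        population = parents + children
    print("best configuration found:", max(population, key=fitness))
    ```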