13 Zentrale Universitätseinrichtungen

Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/14

Search Results

Now showing 1 - 4 of 4
  • Lustre I/O performance investigations on Hazel Hen: experiments and heuristics
    Open Access
    (2021) Seiz, Marco; Offenhäuser, Philipp; Andersson, Stefan; Hötzer, Johannes; Hierl, Henrik; Nestler, Britta; Resch, Michael
    With ever-increasing computational power, larger computational domains are employed, and the data output grows accordingly. Writing this data to disk can become a significant part of the runtime if done serially. Even if the output is done in parallel, e.g., via MPI I/O, there are many user-space parameters for tuning the performance. This paper focuses on the available parameters for the Lustre file system and the Cray MPICH implementation of MPI I/O. Experiments were conducted on the Cray XC40 Hazel Hen using a Cray Sonexion 2000 Lustre file system. In the experiments, the core count, the block size, and the striping configuration were varied. Based on these parameters, heuristics for the striping configuration in terms of core count and block size were determined, yielding up to a 32-fold improvement in write rate compared to the default. This corresponds to 85 GB/s out of a peak bandwidth of 202.5 GB/s. The heuristics are shown to be applicable to a small test program as well as a complex application.
  • Container orchestration on HPC systems through Kubernetes
    Open Access
    (2021) Zhou, Naweiluo; Georgiou, Yiannis; Pospieszny, Marcin; Zhong, Li; Zhou, Huan; Niethammer, Christoph; Pejak, Branislav; Marko, Oskar; Hoppe, Dennis
    Containerisation has demonstrated its efficiency for application deployment in Cloud Computing. Containers can encapsulate complex programs with their dependencies in isolated environments, making applications more portable, and are therefore being adopted in High Performance Computing (HPC) clusters. Singularity, initially designed for HPC systems, has become their de facto standard container runtime. Nevertheless, conventional HPC workload managers lack micro-service support and deeply integrated container management, as opposed to container orchestrators. We introduce a Torque-Operator which serves as a bridge between the HPC workload manager (TORQUE) and the container orchestrator (Kubernetes). We propose a hybrid architecture that integrates HPC and Cloud clusters seamlessly with little interference to HPC systems, where container orchestration is performed on two levels.
  • Fourth-order paired-explicit Runge-Kutta methods
    Open Access
    (2025) Doehring, Daniel; Christmann, Lars; Schlottke-Lakemper, Michael; Gassner, Gregor; Torrilhon, Manuel
    In this paper, we extend the Paired-Explicit Runge-Kutta (P-ERK) schemes by Vermeire et al. (J Comput Phys 393:465-483, 2019) and Nasab and Vermeire (J Comput Phys 468:111470, 2022) to fourth order of consistency. Based on the order conditions for partitioned Runge-Kutta methods, we motivate a specific form of the Butcher arrays which leads to a family of fourth-order accurate methods. The employed form of the Butcher arrays results in a special structure of the stability polynomials, which needs to be adhered to for an efficient optimization of the domain of absolute stability. We demonstrate that the constructed fourth-order P-ERK methods satisfy linear stability, internal consistency, designed order of convergence, and conservation of linear invariants. At the same time, these schemes are seamlessly coupled for codes employing a method-of-lines approach, in particular without any modifications of the spatial discretization. We demonstrate speedup for single-threaded program executions, shared-memory parallelism (i.e., multi-threaded executions), and distributed-memory parallelism with MPI. We apply the multirate P-ERK schemes to inviscid and viscous problems with locally varying wave speeds, which may be induced by non-uniform grids or multiscale properties of the governing partial differential equation. Compared to state-of-the-art optimized standalone methods, the multirate P-ERK schemes allow significant reductions in right-hand-side evaluations and wall-clock time, by factors of up to more than four. A reproducibility repository is provided which enables the reader to examine all results presented in this work.
  • Governance of high-risk AI systems in healthcare and credit scoring
    Open Access
    (2025) Bartsch, Sebastian; Behn, Oliver; Benlian, Alexander; Brownsword, Roger; Bücker, Sebastian; Düwell, Marcus; Formánek, Nico; Jungtäubl, Marc; Leyer, Michael; Richter, Alexander; Schmidt, Jan-Hendrik; Will-Zocholl, Mascha
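
The figures quoted in the Lustre I/O abstract in this listing can be related with a short calculation. The sketch below is only an illustration: the variable and function names are hypothetical, and the implied default write rate rests on the assumption that the 32-fold improvement and the 85 GB/s figure refer to the same configuration, which the abstract does not state explicitly.

```python
# Figures quoted in the Lustre I/O abstract (Hazel Hen, Cray Sonexion 2000).
achieved_gb_s = 85.0   # best write rate reported in the abstract
peak_gb_s = 202.5      # peak bandwidth of the file system
max_speedup = 32.0     # "up to a 32-fold improvement" over the default


def fraction_of_peak(achieved: float, peak: float) -> float:
    """Share of the peak bandwidth that the achieved write rate represents."""
    return achieved / peak


def implied_default_rate(achieved: float, speedup: float) -> float:
    """Write rate of the default configuration implied by the speedup.

    Assumption (not stated in the abstract): the speedup and the achieved
    rate were measured for the same run.
    """
    return achieved / speedup


peak_share = fraction_of_peak(achieved_gb_s, peak_gb_s)
default_rate = implied_default_rate(achieved_gb_s, max_speedup)

print(f"Share of peak bandwidth: {peak_share:.1%}")
print(f"Implied default write rate: {default_rate:.2f} GB/s")
```

Under these assumptions, the tuned configuration reaches roughly 42% of the peak bandwidth.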