13 Zentrale Universitätseinrichtungen (Central University Facilities)

Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/14

Search Results

  • Item (Open Access)
    Improving the MPI-IO performance of applications with genetic algorithm based auto-tuning
    (2021) Bagbaba, Ayse; Wang, Xuan
    Parallel I/O is an essential part of scientific applications running on high-performance computing systems. Understanding an application’s parallel I/O behavior and identifying sources of performance bottlenecks require a multi-layer view of the I/O. Typical parallel I/O stack layers offer many tunable parameters that can be set to achieve the best possible I/O performance. However, scientific users often have neither the time nor the experience to investigate the proper combination of these parameters for each application use case. Auto-tuning can help users by automatically and transparently tuning I/O parameters at various layers. A naive auto-tuning strategy runs the application with every possible combination of tunable parameters across all layers of the I/O stack, which amounts to an exhaustive search through a huge parameter space. This strategy is infeasible because of the long execution times of trial runs. In this paper, we propose a genetic algorithm-based parallel I/O auto-tuning approach that can hide the complexity of the I/O stack from users and auto-tune a set of parameter values for an application on a given system to improve the I/O performance. In particular, our approach tests a set of parameters and then modifies the combination of these parameters for further testing based on the I/O performance. We have validated our model using two I/O benchmarks, namely IOR and MPI-Tile-IO. We achieved an increase in I/O bandwidth of up to 7.74× over the default parameters for IOR and 5.59× over the default parameters for MPI-Tile-IO.
  • Item (Open Access)
    Urban digital twins for smart cities and citizens : the case study of Herrenberg, Germany
    (2020) Dembski, Fabian; Wössner, Uwe; Letzgus, Mike; Ruddat, Michael; Yamu, Claudia
    Cities are complex systems connected to economic, ecological, and demographic conditions and change. They are also characterized by diverging perceptions and interests of citizens and stakeholders. Thus, in the arena of urban planning, we need approaches that not only cope with urban complexity but also allow for participatory and collaborative processes that empower citizens, with the aim of creating democratic cities. Connected to the field of smart cities and citizens, we present in this paper the prototype of an urban digital twin for Herrenberg, a town of 30,000 inhabitants in Germany. Urban digital twins are sophisticated data models allowing for collaborative processes. The prototype presented here comprises (1) a 3D model of the built environment, (2) a street network model using the theory and method of space syntax, (3) an urban mobility simulation, (4) a wind flow simulation, and (5) a body of empirical quantitative and qualitative data using volunteered geographic information (VGI). In addition, the urban digital twin was implemented in a visualization platform for virtual reality and was presented to the general public during diverse public participatory processes, as well as in the framework of the “Morgenstadt Werkstatt” (Tomorrow’s Cities Workshop). The results of a survey indicated that this method and technology could significantly aid participatory and collaborative processes. A better understanding of how urban digital twins support urban planners, urban designers, and the general public as a collaboration and communication tool and for decision support allows us to be more intentional when creating smart and sustainable cities with the help of digital twins. We conclude the paper with a discussion of the presented results and further research directions.
  • Item (Open Access)
    Improving collective I/O performance with machine learning supported auto-tuning
    (2020) Bagbaba, Ayse
    Collective input and output (I/O) is an essential approach in high performance computing (HPC) applications. Achieving effective collective I/O is a nontrivial task because of the complex interdependencies between the layers of the I/O stack. These layers provide the best possible I/O performance through a number of tunable parameters. Unfortunately, the correct combination of parameters depends on the application and the HPC platform. As a configuration space gets larger, it becomes difficult for humans to keep track of the interactions between the configuration options. Engineers have neither the time nor the experience to explore good configuration parameters for each problem because of the long benchmarking phase. In most cases, the default settings are used, often leading to poor I/O efficiency. I/O profiling tools cannot identify the optimal settings without considerable effort spent analyzing the trace results. An auto-tuning solution that optimizes collective I/O requests and provides system administrators or engineers with statistical information is therefore strongly needed. In this paper, a study of machine learning supported collective I/O auto-tuning, including the architecture and software stack, is performed. A random forest regression model is used to develop a performance predictor that can capture parallel I/O behavior as a function of application and file system characteristics. The modeling approach can provide insights into the metrics that significantly impact I/O performance.
  • Item (Open Access)
    Lustre I/O performance investigations on Hazel Hen : experiments and heuristics
    (2021) Seiz, Marco; Offenhäuser, Philipp; Andersson, Stefan; Hötzer, Johannes; Hierl, Henrik; Nestler, Britta; Resch, Michael
    With ever-increasing computational power, larger computational domains are employed and thus the data output grows as well. Writing this data to disk can become a significant part of runtime if done serially. Even if the output is done in parallel, e.g., via MPI I/O, there are many user-space parameters for tuning the performance. This paper focuses on the available parameters for the Lustre file system and the Cray MPICH implementation of MPI I/O. Experiments on the Cray XC40 Hazel Hen using a Cray Sonexion 2000 Lustre file system were conducted. In the experiments, the core count, the block size and the striping configuration were varied. Based on these parameters, heuristics for striping configuration in terms of core count and block size were determined, yielding up to a 32-fold improvement in write rate compared to the default. This corresponds to 85 GB/s, roughly 42% of the peak bandwidth of 202.5 GB/s. The heuristics are shown to be applicable to a small test program as well as a complex application.
  • Item (Open Access)
    Container orchestration on HPC systems through Kubernetes
    (2021) Zhou, Naweiluo; Georgiou, Yiannis; Pospieszny, Marcin; Zhong, Li; Zhou, Huan; Niethammer, Christoph; Pejak, Branislav; Marko, Oskar; Hoppe, Dennis
    Containerisation has demonstrated its efficiency for application deployment in Cloud Computing. Containers can encapsulate complex programs with their dependencies in isolated environments, making applications more portable, and hence are being adopted in High Performance Computing (HPC) clusters. Singularity, initially designed for HPC systems, has become their de facto standard container runtime. Nevertheless, conventional HPC workload managers lack micro-service support and deeply integrated container management, as opposed to container orchestrators. We introduce a Torque-Operator which serves as a bridge between the HPC workload manager (TORQUE) and the container orchestrator (Kubernetes). We propose a hybrid architecture that integrates HPC and Cloud clusters seamlessly with little interference to HPC systems, where container orchestration is performed on two levels.
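
The genetic-algorithm auto-tuning loop summarized in the first abstract of this listing (test a set of parameter combinations, then modify them based on measured I/O performance) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the parameter names, the value ranges, and the synthetic `mock_bandwidth` function are assumptions made purely for illustration. A real tuner would replace `mock_bandwidth` with a timed benchmark run, for example of IOR.

```python
import random

# Hypothetical search space; names and values are illustrative assumptions,
# not the parameters used in the paper.
SEARCH_SPACE = {
    "striping_factor": [1, 2, 4, 8, 16, 32],
    "striping_unit_mib": [1, 2, 4, 8, 16],
    "cb_nodes": [1, 2, 4, 8],
    "collective_buffering": [False, True],
}
KEYS = list(SEARCH_SPACE)


def mock_bandwidth(cfg):
    """Synthetic stand-in for a measured I/O bandwidth (higher is better)."""
    score = 100.0
    score -= abs(cfg["striping_factor"] - 16) * 2.0
    score -= abs(cfg["striping_unit_mib"] - 4) * 3.0
    score -= abs(cfg["cb_nodes"] - 4) * 1.5
    score += 10.0 if cfg["collective_buffering"] else 0.0
    return score


def random_config(rng):
    """Draw one configuration uniformly from the search space."""
    return {k: rng.choice(SEARCH_SPACE[k]) for k in KEYS}


def crossover(a, b, rng):
    """Uniform crossover: each parameter is inherited from either parent."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in KEYS}


def mutate(cfg, rng, rate=0.2):
    """Resample each parameter with probability `rate`."""
    return {
        k: (rng.choice(SEARCH_SPACE[k]) if rng.random() < rate else v)
        for k, v in cfg.items()
    }


def tune(fitness, generations=20, pop_size=12, elite=4, seed=0):
    """Evolve configurations; the `elite` best survive each generation."""
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:elite]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - elite)
        ]
        population = parents + children
    return max(population, key=fitness)


best = tune(mock_bandwidth)
print(best, mock_bandwidth(best))
```

With the defaults above, the sketch evaluates at most 20 × 12 = 240 configurations plus the final selection, compared with 6 × 5 × 4 × 2 = 240 points in this toy search space; the saving over exhaustive search only becomes meaningful when the space is far larger than the evaluation budget, as it is for a real I/O stack.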