13 Zentrale Universitätseinrichtungen

Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/14

Search Results

Now showing 1 - 9 of 9
  • Item, Open Access
    Process migration in a parallel environment
    (Stuttgart: Höchstleistungsrechenzentrum, Universität Stuttgart, 2016) Reber, Adrian; Resch, Michael (Prof. Dr.-Ing. Dr. h.c. Dr. h.c. Prof. E.h.)
    To satisfy the ever-increasing demand for computational resources, high performance computing systems are becoming larger and larger. Unfortunately, the tools supporting system management tasks are only slowly adapting to the increase in components in computational clusters. Virtualization provides concepts which make system management tasks easier to implement by providing more flexibility for system administrators. With the help of virtual machine migration, the point in time for certain system management tasks like hardware or software upgrades no longer depends on the usage of the physical hardware. The flexibility to migrate a running virtual machine without significant interruption to the provided service makes it possible to perform system management tasks at the optimal point in time. In most high performance computing systems, however, virtualization is still not implemented. The reason for avoiding virtualization in high performance computing is that there is still an overhead in accessing the CPU and I/O devices. This overhead is continually decreasing, and there are different kinds of virtualization techniques, like para-virtualization and container-based virtualization, which minimize it further. With the CPU being one of the primary resources in high performance computing, this work proposes to migrate processes instead of virtual machines, thus avoiding the virtualization overhead. Process migration can either be seen as an extension of pre-emptive multitasking over system boundaries or as a special form of checkpointing and restarting. In the scope of this work, process migration is based on checkpointing and restarting, as it is already an established technique in the field of fault tolerance. From the existing checkpointing and restarting implementations, the one best suited for process migration was selected. One of the important requirements of the checkpointing and restarting implementation is transparency. 
Providing transparent process migration is important to enable the migration of any process without prerequisites like re-compilation or running in a specially prepared environment. With process migration based on checkpointing and restarting, the next step towards providing process migration in a high performance computing environment is to support the migration of parallel processes. Using MPI is a common method of parallelizing applications, and therefore process migration has to be integrated with an MPI implementation. The previously selected checkpointing and restarting implementation was integrated into an MPI implementation, thus enabling the migration of parallel processes. With the help of different test cases, the implemented process migration was analyzed, especially with regard to the time required to migrate a process and the benefits of optimizations that reduce the process’s downtime during migration.
  • Item, Open Access
    Increased flexibility and dynamics in distributed applications and processes through resource decoupling
    (2014) Kipp, Alexander; Resch, Michael (Prof. Dr.-Ing.)
    The continuously increasing complexity of products and services requires more and more specialised expertise as well as support by specialised IT tools and services. However, these services require expert knowledge as well, particularly in order to apply and use them in an efficient and optimal way. To this end, this thesis introduces a new virtualisation approach that allows for both the transparent integration of services in abstract process description languages and the role-based integration of human experts in these processes. The developed concept has been realised by:
    - Enhancing the concept of web services with a service virtualisation layer, allowing for the transparent usage, adaptation and orchestration of services
    - Enhancing the developed concept towards a “Dynamic Session Management” environment, enabling the transparent and role-based integration of human experts following the SOA paradigm
    - Developing a collaboration schema that allows for setting up and steering synchronous collaboration sessions between human experts. This enhancement also considers the respective user context and provides the most suitable IT-based tooling support.
    The developed concept has been applied to scientific and economic application fields with a respective reference realisation.
  • Item, Open Access
    Hybrid parallel computing beyond MPI & OpenMP - introducing PGAS & StarSs
    (2011) Sethi, Muhammad Wahaj
    High-performance architectures are becoming more and more complex over time. These large-scale, heterogeneous architectures and multi-core systems are difficult to program. New programming models are required to make the expression of parallelism easier while keeping developer productivity high. Partitioned Global Address Space (PGAS) languages such as UPC appeared in order to increase developer productivity on distributed memory systems. UPC provides a simpler, shared-memory-like model with user control over data layout. However, it is the developer’s responsibility to take care of data locality by using appropriate data layouts. The SMPSs/StarSs programming model tries to simplify parallel programming on multi-core architectures. It offers task-level parallelism, where dependencies among the tasks are determined at run time. In addition, the runtime takes care of data locality while scheduling tasks. It thus provides a two-fold improvement in productivity: first, it saves developer time through automatic dependency detection instead of hard-coded dependencies; second, it saves cache optimization time, as the runtime takes care of data locality. The purpose of this thesis is to combine the PGAS programming model (e.g. UPC) across nodes with the shared-memory, task-based parallelization model (i.e. StarSs) to take advantage of multi-core systems, and to contrast this approach with the legacy combination of MPI and OpenMP. Performance as well as programmability is considered in the evaluation. The combination of UPC and SMPSs results in approximately the same execution time as MPI and OpenMP. The current lack of features such as multi-dimensional data distribution or virtual topologies in UPC makes the hybrid UPC + SMPSs/StarSs programming model less programmable than MPI + OpenMP for the application studied in this thesis.
  • Item, Open Access
    Service level agreements for job submission and scheduling in high performance computing
    (2014) Kübert, Roland; Resch, Michael (Prof. Dr.-Ing.)
    This thesis introduces the concept of long-term service level agreements for offering quality of service in high performance computing. The feasibility of the approach is demonstrated by a proof-of-concept implementation. A simulation tool developed in the scope of this thesis is subsequently used to investigate sensible parameters for quality of service classes in the high performance computing domain.
  • Item, Open Access
    Enhanced SLA management in the high performance computing domain
    (2011) Koller, Bastian; Resch, Michael (Prof. Dr.-Ing. Dr. h.c.)
    This thesis describes a Service Level Agreement schema for the High Performance Computing domain and the corresponding architecture for SLA management, both of which were developed on the basis of three different use cases.
  • Item, Open Access
    MPI-semantic memory checking tools for parallel applications
    (2013) Fan, Shiqing; Resch, Michael (Prof. Dr.-Ing.)
    The Message Passing Interface (MPI) is a language-independent application interface that provides a standard for communication among the processes of programs running on parallel computers, clusters or heterogeneous networks. However, writing correct and portable MPI applications is difficult: parameters may be used inconsistently or incorrectly, and the subtle semantic differences between various MPI calls may be misapplied even by expert programmers. MPI implementations typically perform only minimal sanity checks in order to achieve the highest possible performance. Although many interactive debuggers have been developed or extended to handle the concurrent processes of MPI applications, there are still numerous classes of bugs which are hard or even impossible to find with a conventional debugger. In many cases, memory conflicts or errors, for example overlapping accesses or segmentation faults, do not provide enough useful information for the programmer to solve the problem. The situation is even worse for MPI applications: the flexible and frequent parallel use of memory under the MPI standard makes memory problems more difficult to observe in the traditional way. Currently, no available debugger is specifically helpful for MPI-semantic memory errors, i.e. for detecting memory problems or potential errors according to the standard. For this specific purpose, memory checking tools have been implemented in this dissertation, and the corresponding frameworks in Open MPI for parallel applications based on MPI semantics have been developed, using different existing memory debugging tool interfaces. Developers are able to detect hard-to-find bugs such as memory violations, buffer overruns, inconsistent parameters and so on. The memory checking tool provides detailed, comprehensible error messages that will be most helpful for MPI developers. 
Furthermore, the memory checking frameworks may also help improve the performance of MPI-based parallel applications by detecting whether the communicated data is used or not. The new memory checking tools may also be used in other projects or debuggers to perform different memory checks. They apply not only to MPI parallel applications, but may also be used in other kinds of applications that require memory checking. The technology allows programmers to implement their own memory checking functionality in a flexible way: they may define what information they want to know about the memory and how the memory in the application should be checked and reported. The world of high performance computing is Linux-dominated and open-source based. However, Microsoft is also playing an increasingly important role in this domain, establishing a foothold with Windows HPC Server 2008 R2. In this work, the advantages and disadvantages of these two HPC operating systems are discussed. To improve programmability and portability, we introduce a version of Open MPI for Windows with several newly developed key components. Correspondingly, an implementation of the memory checking tool on Windows is also introduced. This dissertation has five main chapters: after an introduction to the state of the art, the development of Open MPI for the Windows platform is described, including the work on InfiniBand network support. Chapter four presents the methods explored and the opportunities for error analysis of memory accesses. It also describes the two tools implemented for this work, based on Intel PIN and on Valgrind, as well as their integration into the Open MPI library. In chapter five, the methods are evaluated with several benchmarks (NetPIPE, IMB and NPB) and with real applications (a heat conduction application and the MD package Gromacs). 
It is shown that the instrumentation generated by the tool has no significant overhead (1.2% to 2.5% on latency with NetPIPE) and accordingly no impact on application benchmarks such as NPB or Gromacs. When the application is run under the memory access analysis tools, the execution time naturally increases, by up to 30x; with the presented MemPin, the slowdown is only half as large. The methods prove successful in the sense that unnecessarily communicated data can be found in the heat conduction application and in Gromacs; in the first case, this reduces the communication time of the application by 12%.
  • Item, Open Access
    Analyse und Optimierung der Softwareschichten von wissenschaftlichen Anwendungen für Metacomputing [Analysis and optimization of the software layers of scientific applications for metacomputing]
    (2008) Keller, Rainer; Resch, Michael (Prof. Dr.-Ing.)
    For parallel applications, the Message Passing Interface (MPI) is the programming paradigm of choice on high-performance computers with distributed memory. Using the concept of metacomputing, in turn, a wide variety of computing resources can be coupled with PACX-MPI. This is of interest, on the one hand, because problem sizes are to be solved that could not be run on a single system and, on the other hand, because coupled simulations are computed that are meant to run on particular computer architectures, or because systems with particular properties, such as visualization resources, must be connected with parallel computing resources. This coupling poses a barrier for the distributed applications, since communication with non-local processes is far slower than communication over the machine-internal network. This thesis develops solutions across the software layers, starting from the network layer, through improvements within the middleware used, up to optimization within the application layer. At the lowest software layer, a general network communication library based on the User Datagram Protocol (UDP) is developed for the PACX-MPI middleware. This makes it possible to work around limitations of the Transmission Control Protocol (TCP), above all on networks with high latency and high bandwidth, so-called long fat pipes. The library implemented here is written portably and is efficient thanks to its use of threads. The protocol achieves good bandwidth values in the local area network (LAN) as well as in the wide area network (WAN). For illustration, the protocol is tested over a connection between computers in Stuttgart and Canberra, Australia. 
Within the middleware, the optimization of the collective communication routines is addressed, and the improvement is demonstrated for the function PACX_Alltoall using the IMB benchmark on a metacomputer. For analyzing communication behavior, the extension of a tracing library for PACX-MPI and the implementation of a generic interface for measuring communication characteristics at the MPI layer are described. In addition, a general MPI test suite is presented that helped to find errors both in PACX-MPI and in the Open MPI implementation. At the topmost software layer, optimization opportunities for metacomputing applications are shown. As an example, the analysis of the communication pattern of a bioinformatics application is presented. Furthermore, the implementation of caching and prefetching of frequently communicated data with spatial and temporal locality is presented. Only the caching and prefetching methodology makes it possible to run the application on a metacomputer, and it is exemplary for a class of algorithms with a similar communication pattern.
  • Item, Open Access
    Integrated management framework for dynamic virtual organisations
    (2008) Wesner, Stefan; Resch, Michael (Prof. Dr.-Ing.)
    This thesis describes a Service Level Agreement based model for dynamic virtual organisations and a corresponding management framework that enables service providers to fulfil such SLAs. The proposed framework is realised as a hierarchical model, ranging from low-level management close to the hardware and network primitives necessary to realise the services up to the business relationship management layer. The concept is instantiated for the scenario of a High Performance Computing service provider.
  • Item, Open Access
    Management von verteilten ingenieurwissenschaftlichen Anwendungen in heterogenen Grid-Umgebungen [Management of distributed engineering applications in heterogeneous grid environments]
    (2007) Lindner, Peggy; Resch, Michael (Prof. Dr.-Ing.)
    Grid technologies offer an approach to distributing applications across multiple computers in order to simulate scientific problems that place high demands on computing resources. While this kind of application was mostly used for demonstration purposes in recent years, grid technology is now increasingly a tool in daily use. The heterogeneity of the available grid software environments is the biggest problem users have to deal with when they want to run parallel, distributed applications efficiently in the grid. This thesis presents the concept and implementation of a Grid Configuration Manager (GCM), which is intended to hide the complexity of grid environments and the associated problems from the user. The most important goal of the GCM is to simplify the management of grid environments for end users and developers. To this end, the steps required to run distributed, parallel applications were abstracted. In addition, a concept for integrating different grid software solutions was developed and implemented. The GCM currently supports Globus, UNICORE and ssh-based environments. The GCM is intended to help users mainly during three phases of running applications: defining a grid configuration, starting a grid application, and monitoring it. The GCM also offers special support for distributed applications developed on top of the communication library PACX-MPI. For these, the required configuration files are generated automatically and kept consistent on the participating computers. A mechanism for selecting computing resources based on performance prediction was integrated into the Grid Configuration Manager. 
Starting from a preselection of computers specified by the user, the GCM can use an automatic estimate of an application's performance data to predict the most efficient environment for running the application. The program Dimemas is used for the performance prediction. Dimemas can predict the runtime behavior of an application from tracing data and parameters describing the hardware. The Grid Configuration Manager was tested and used in various scenarios. These showed that using the GCM significantly simplifies the handling of distributed applications and makes it easier to determine the execution environment.