13 Zentrale Universitätseinrichtungen
Permanent URI for this collectionhttps://elib.uni-stuttgart.de/handle/11682/14
Browse
1 results
Search Results
Item Open Access MPI-semantic memory checking tools for parallel applications(2013) Fan, Shiqing; Resch, Michael (Prof. Dr.-Ing.)The Message Passing Interface (MPI) is a language-independent application interface that provides a standard for communication among the processes of programs running on parallel computers, clusters or heterogeneous networks. However, writing correct and portable MPI applications is difficult: inconsistent or incorrect use of parameters may occur; the subtle semantic differences of various MPI calls may be used inconsistently or incorrectly even by expert programmers. The MPI implementations typically implement only minimal sanity checks to achieve the highest possible performance. Although many interactive debuggers have been developed or extended to handle the concurrent processes of MPI applications, there are still numerous classes of bugs which are hard or even impossible to find with a conventional debugger. There are many cases of memory conflicts or errors, for example, overlapping access or segmentation fault, does not provide enough and useful information for programmer to solve the problem. That is even worse for MPI applications, due to the flexibility and high-frequency of using memory parallel in MPI standard, which makes it more difficult to observe the memory problems in the traditional way. Currently, there is no available debugger helpful especially for MPI semantic memory errors, i.e. detecting memory problem or potential errors according to the standard. For this specific c purpose, in this dissertation memory checking tools have been implemented. And the corresponding frameworks in Open MPI for parallel applications based on MPI semantics have been developed, using different existing memory debugging tool interfaces. Developers are able to detect hard to find bugs, such as memory violations, buffer overrun, inconsistent parameters and so on. This memory checking tool provides detailed comprehensible error messages that will be most helpful for MPI developers. Furthermore, the memory checking frameworks may also help improve the performance of MPI based parallel applications by detecting whether the communicated data is used or not. The new memory checking tools may also be used in other projects or debuggers to perform different memory checks. The memory checking tools do not only apply to MPI parallel applications, but may also be used in other kind of applications that require memory checking. The technology allows programmers to handle and implement their own memory checking functionalities in a flexible way, which means they may define what information they want to know about the memory and how the memory in the application should be checked and reported. The world of high performance computing is Linux-dominated and open source based. However Microsoft is becoming also a more important role in this domain, establishing its foothold with Windows HPC Server 2008 R2. In this work, the advantages and disadvantages of these two HPC operating systems will be discussed. To amend programmability and portability, we introduce a version of Open MPI for Windows with several newly developed key components. Correspondingly, an implementation of memory checking tool on Windows will also be introduced. This dissertation has five main chapters: after an introduction of state of the art, the development of the Open MPI for Windows platform is described, including the work of InfiniBand network support. Chapter four presents the methods explored and opportunities for error analysis of memory accesses. Moreover, it also describes the two implemented tools for this work based on the Intel PIN and the Valgrind tool, as well as their integration into the Open MPI library. In chapter five, the methods are based on several benchmarks (NetPIPE, IMB and NPB) and evaluated using real applications (heat conduction application, and the MD package Gromacs). It is shown that the instrumentation generated by the tool has no significant overhead (NetPIPE with 1.2% to 2.5% for the latency) and accordingly no impact on application benchmarks such as NPB or Gromacs. If the application is executed to analyze with the memory access tools, it extends naturally the execution time by up to 30x, and using the presented MemPin is only half the rate of dropdown. The methods prove successful in the sense that unnecessary data communicated can be found in the heat conduction application and in Gromacs, resulting in the first case, the communication time of the application is reduced by 12%.