Universität Stuttgart
Permanent URI for this community: https://elib.uni-stuttgart.de/handle/11682/1
Search Results
Item Open Access
Supporting multi-tenancy in Relational Database Management Systems for OLTP-style software as a service applications (2015)
Schiller, Oliver; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
The consolidation of multiple tenants onto a single relational database management system (RDBMS) instance, commonly referred to as multi-tenancy, has turned out to be beneficial: it improves the provider’s profit margin and allows lower service fees, which in turn attract more tenants. So far, existing solutions create the required multi-tenancy support on top of a traditional RDBMS implementation, i.e., they implement data isolation between tenants, per-tenant customization and further tenant-centric data management features in application logic. This is complex, error-prone and often reimplements functionality the RDBMS already offers. Moreover, this approach forgoes some optimization opportunities in the RDBMS and is a conceptual misstep with respect to Separation of Concerns. For these reasons, an RDBMS that provides support for the development and operation of a multi-tenant software as a service (SaaS) offering is compelling. In this thesis, we contribute to a multi-tenant RDBMS for OLTP-style SaaS applications by extending a traditional disk-oriented RDBMS architecture with multi-tenancy support. For this purpose, we primarily extend an RDBMS by introducing tenants as first-class database objects and establishing tenant contexts to isolate tenants logically. Using these extensions, we address tenant-aware schema management, for which we present a schema inheritance concept that is tailored to the needs of multi-tenant SaaS applications. Thereafter, we evaluate different storage concepts for a tenant’s tuples with respect to their scalability. Next, we contribute an architecture of a multi-tenant RDBMS cluster for OLTP-style SaaS applications.
In doing so, we focus on a partitioning solution that is aligned to tenants and yields independently manageable pieces. To balance load in the proposed cluster architecture, we present a live database migration approach whose design favors low migration overhead and minimal interruption of service.

Item Open Access
Decoding strategies for syntax-based statistical machine translation (2015)
Braune, Fabienne; Maletti, Andreas (Dr.)
Provided with a sentence in an input language, a human translator produces a sentence in the desired target language. The advances in artificial intelligence in the 1950s led to the idea of using machines instead of humans to generate translations. Based on this idea, the field of Machine Translation (MT) was created. The first MT systems aimed to map input text into the target translation through the application of hand-crafted rules. While this approach worked well for specific language pairs in restricted domains, it was hard to extend to new languages and domains because of the huge amount of human effort necessary to create new translation rules. The increase in computational power enabled Statistical Machine Translation (SMT) in the late 1980s, which addressed this problem by learning translation units automatically from large text collections. Statistical machine translation can be divided into several paradigms. Early systems modeled translation between words, while later work extended these to sequences of words called phrases. What word-based and phrase-based SMT have in common is that the translation process takes place sequentially, which is not well suited to translating between languages where words need to be reordered over (potentially) long distances. Such reorderings led to the implementation of SMT systems based on formalisms that allow translating recursively instead of sequentially.
In these systems, called syntax-based systems, the translation units are modeled with formal grammar productions and translation is performed by assembling the productions of these grammars. This thesis contributes to the field of syntax-based SMT in two ways: (i) the applicability of a new grammar formalism is tested by building the first SMT system based on the local Multi Bottom-Up Tree Transducer (l-MBOT); (ii) new ways to integrate linguistic annotations in the translation model (instead of the grammar rules) of syntax-based systems are developed.

Item Open Access
Interactive visual analysis of biomolecular simulations (2015)
Krone, Michael; Ertl, Thomas (Prof. Dr.)
Molecular dynamics simulations can give detailed insights into the properties of biomolecules on an atomistic level. Improvements in simulation codes as well as in the available hardware enable the simulation of ever more complex molecular processes. Molecular simulation is therefore often described as a “computational microscope” that makes it possible to run experiments virtually and thus gain insights into the function of proteins and other biomolecules. Although molecular dynamics simulations have inherent restrictions, their results can in some cases be obtained more reproducibly, reliably, and safely than with wet lab experiments. The scope of application ranges from fundamental questions like the formation of protein conformations or the effect of mutations to complex analyses like synthesis rates of biodiesel in biotechnology or drug binding in medicine. Visualization of the simulation results is an essential part of the interpretation of these virtual experiments. It is, so to speak, the eyepiece of the computational microscope, which makes the data visible. Interactive visualization facilitates making discoveries, since it fosters an exploratory visual analysis of the data.
Molecular models tailored to specific problems illustrate the particular properties of the visualized biomolecules. Examples are abstract representations that show the functional structure of a protein, or molecular surfaces that depict the contact surface between a molecule and a solvent. Available visualization techniques are, however, often not efficient enough to ensure an interactive exploration of large, dynamic data. A more comprehensive analysis that goes beyond this direct visualization of the simulation data is attained through feature extractions, which are executed as part of the visualization. Here, derived features of the simulated biomolecules like potential binding sites for reactants are extracted from the raw data, that is, the positions and elements of the atoms. Similar to the existing visualization techniques, previous analysis methods are in most cases not applicable in real-time and, thus, restricted to static data like single time steps of a simulation. For simulation data, however, processes that extend over a period of time can be of particular interest, for example conformational changes of a protein. Since the available feature extraction methods are not applicable in real-time, the results have to be precomputed for all time steps. Parameter changes imply a costly recalculation for the whole simulation. Hence, an exploratory visual analysis requires new methods that can be applied interactively to dynamic data. In this work, various methods that support the interactive visual analysis of biomolecular processes are introduced and discussed. The presented GPU-accelerated methods are parameterizable in real-time by the user and enable an exploratory analysis on current desktop computers. The basis for a visual exploration is the interactive visualization of complex molecular models for large, dynamic data sets without resorting to precomputed data. 
Consequently, this allows the user to switch between different representations without delay, which could otherwise disrupt and impair the analysis process. Hence, the user can analyze the dynamics of the simulated biomolecules. To facilitate the visual analysis, several real-time rendering methods have been developed and introduced, which, for example, enhance depth perception or provide a clearer, less cluttered depiction using non-photorealistic rendering. Based on these techniques, analysis methods and tools have been developed that extract complex properties of the simulated molecules. An example is the detection of cavities and channels, which play an essential role in the function of proteins. This enables analyzing the accessibility of binding sites for reactants, which can trigger an enzymatic reaction, or the permeability of a channel protein for a certain species of solvent molecules. Since the visual analysis of the simulation data is fully interactive, it not only supports the user in verifying existing hypotheses about the properties of the biomolecules but also allows for unexpected findings. Experts from the field of biochemistry have, for example, been able to find a channel to the binding site of a protein that did not agree with the predicted one. The combination of various analysis methods allows for a comprehensive, consistently interactive, exploratory visual analysis of biomolecular simulations, which gives users detailed insights into the data in real-time and fosters the discovery of new, unanticipated phenomena.

Item Open Access
On the complexity of conjugacy in amalgamated products and HNN extensions (2015)
Weiß, Armin; Diekert, Volker (Prof. Dr. rer. nat. habil.)
This thesis deals with the conjugacy problem in classes of groups which can be written as an HNN extension or an amalgamated product. The conjugacy problem is one of the fundamental problems in algorithmic group theory, which were introduced by Max Dehn in 1911.
It asks whether two group elements, given as words over a fixed set of generators, are conjugate. Thus, it is a generalization of the word problem, which asks whether some input word represents the identity. Both the word problem and the conjugacy problem are undecidable in general. In this thesis, we consider not only decidability, but also the complexity of conjugacy. We consider fundamental groups of finite graphs of groups as defined by Serre, a generalization of both HNN extensions and amalgamated products. Another crucial concept for us is that of strongly generic algorithms, a formalization of algorithms which work for "most" inputs. The following are our main results: The elements of an HNN extension which cannot be conjugated into the base group form a strongly generic set if and only if both inclusions of the associated subgroup into the base group are not surjective. For amalgamated products we prove an analogous result. Following a construction by Stillwell, we derive some undecidability results for the conjugacy problem in HNN extensions with free (abelian) base groups. Next, we show that conjugacy is decidable if all associated subgroups are cyclic or if the base group is abelian and there is only one stable letter. Moreover, in a fundamental group of a graph of groups with free abelian vertex groups, conjugacy is strongly generically in P. In addition, we consider the case where all edge groups are finite: if conjugacy can be decided in time T(N) in the vertex groups, then it can be decided in time O(log N * T(N)) in the fundamental group under some reasonable assumptions on T (here, N is the length of the input). We also derive some basic transfer results for circuit complexity in the same class of groups. Furthermore, we examine the conjugacy problem of generalized Baumslag-Solitar groups.
Our main results are: the conjugacy problem in solvable Baumslag-Solitar groups is TC0-complete, and in arbitrary generalized Baumslag-Solitar groups it can be decided in LOGDCFL. The uniform conjugacy problem for generalized Baumslag-Solitar groups is hard for EXPSPACE. Finally, we deal with the conjugacy problem in the Baumslag group, an HNN extension of the Baumslag-Solitar group BS(1,2). The Baumslag group has a non-elementary Dehn function, and thus, for a long time, it was considered to have a very hard word problem, until Myasnikov, Ushakov, and Won showed that the word problem is, in fact, in P by introducing a new data structure, the so-called power circuits. We follow their approach and show that the conjugacy problem is strongly generically in P. We conjecture that there is no polynomial time algorithm which works for all inputs, because the divisibility problem in power circuits can be reduced to this conjugacy problem. Also, we prove that the comparison problem in power circuits is complete for P under logspace reductions.

Item Open Access
Neue Methoden und Techniken für die Evaluation von Visualisierungen [New methods and techniques for the evaluation of visualizations] (2015)
Raschke, Michael; Ertl, Thomas (Prof. Dr.)
Visualizations surround us as a matter of course in everyday life and at work, where they serve to present abstract information and to make complex relationships understandable. While the development of visualization techniques has so far focused mainly on how to display as much data as possible, in as little time as possible, and at as high a resolution as possible, visualization research has in recent years paid increasing attention to the question of whether a visualization is also useful and easy to read. To answer this question comprehensively, the goal of this thesis was to develop new methods and techniques for studying the perception of visualizations and for evaluating visualization techniques.
To this end, an interdisciplinary approach was chosen that combines the three research fields of eye tracking, knowledge representation, and cognitive science. Eye tracking experiments were used to analyze gaze behavior during work with visualizations. Representing visual knowledge makes it possible to study semantic properties of scan paths. Simulation methods from cognitive science make it possible to predict gaze behavior. In visualization research, eye tracking experiments are used to record the eye movements of participants who perform tasks with visualizations. The subsequent analysis of the eye movements accounts for a considerable share of the time needed to evaluate this kind of experiment. To reduce the effort of analyzing these scan paths and to identify similar eye movement patterns across participants, the parallel scan path visualization technique was developed, which provides a clear depiction of multiple scan paths. With it, strategies for reading visualizations can be identified and compared across multiple participants. The parallel scan path visualization was additionally extended with automatic pattern recognition methods. This so-called visual analytics approach makes it possible to compare scan paths quantitatively and leads to an efficient analysis of very large eye tracking data sets. To model knowledge about visualizations, a knowledge model with three levels was developed. Each level describes, in the form of an ontology, a different level of abstraction of the knowledge about visualizations and the graphical elements they contain. Elements from these ontologies are linked to specific areas of a visualization or to individual graphical elements in visualizations.
This approach makes it possible to analyze not only, as before, which areas of a visualization on a screen were viewed in which order (WHERE space), but also which graphical elements were perceived there (WHAT space) and how they were processed cognitively. It is shown how the parallel scan path visualization technique can, based on this annotation, visualize knowledge processing. In this way, areas of a visualization that may lead to a cognitive bias can also be identified and examined in more detail. To simulate visual search, a simulation based on the cognitive simulation framework ACT-R was developed; it simulates reading processes in visualizations and allows them to be compared with empirically obtained data. In addition, this thesis presents, for the first time, an operator-based model for predicting the completion times of visual tasks. This operator-based diagram viewing model adopts the concept of the Keystroke-Level Model known from human-computer interaction research and extends it to predict the completion times of visual tasks. Besides making the evaluation of eye tracking experiments more efficient, combining the visual analysis of scan paths with ontology-based knowledge models leads to a deeper understanding of how visualizations are read. Semantic characteristics of scan paths can be studied more effectively, and the likelihood of cognitive biases during work with visualizations can be reduced by suitably adapting the visualization concept. Overall, the methods and techniques presented in this thesis can lead to a more user-oriented, iterative development process for visualizations.
In this development process, results from eye tracking analysis or from simulations can be used to examine how different user groups perceive visualizations.

Item Open Access
Distributed stream processing in a global sensor grid for scientific simulations (2015)
Benzing, Andreas; Rothermel, Kurt (Prof. Dr. rer. nat)
With today's large number of sensors available all around the globe, an enormous amount of measurements has become available for integration into applications. Scientific simulations of environmental phenomena in particular can greatly benefit from detailed information about the physical world. The problem in integrating sensor data into simulations is to automate the monitoring of geographical regions for interesting data and the provision of continuous data streams from identified regions. Current simulation setups use hard-coded information about sensors or even manual data transfer using external memory to bring data from sensors to simulations. This solution is very robust, but adding new sensors to a simulation requires manual setup of the sensor interaction and changes to the source code of the simulation, and therefore incurs extremely high cost. Manual transmission allows an operator to drop obvious outliers but prohibits real-time operation due to the long delay between measurement and simulation. For more generic applications that operate on sensor data, these problems have been partially solved by approaches that decouple the sensing from the application, thereby allowing for the automation of the sensing process. However, these solutions focus on small-scale wireless sensor networks rather than the global scale and therefore optimize for the lifetime of these networks instead of providing high-resolution data streams.
In order to provide sensor data for scientific simulations, two tasks are required: i) continuous monitoring of sensors to trigger simulations and ii) high-resolution measurement streams of the simulated area during the simulation. Since a simulation is not aware of the deployed sensors, the sensing interface must work without an explicit specification of individual sensors. Instead, the interface must work only on the geographical region, sensor type, and the resolution used by the simulation. The challenges in these tasks are to efficiently identify relevant sensors from the large number of sources around the globe, to detect when the current measurements are of relevance, and to scale data stream distribution to a potentially large number of simulations. Furthermore, the process must adapt to complex network structures and dynamic network conditions as found in the Internet. The Global Sensor Grid (GSG) presented in this thesis attempts to close this gap by addressing three core problems: First, a distributed aggregation scheme has been developed which allows for the monitoring of geographic areas for sensor data of interest. The reuse of partial aggregates ensures highly efficient operation and relieves the sensor sources of individually providing numerous clients with measurements. Second, the distribution of data streams at different resolutions is achieved by using a network of brokers which preprocess raw measurements to provide the requested data. The load of high-resolution streams is thereby spread across all brokers in the GSG to achieve scalability. Third, the network usage is actively minimized by adapting to the structure of the underlying network.
This optimization enables the reduction of redundant data transfers on physical links and a dynamic modification of the data streams to react to changing load situations.

Item Open Access
Visualization challenges in distributed heterogeneous computing environments (2015)
Panagiotidis, Alexandros; Ertl, Thomas (Prof. Dr.)
Large-scale computing environments are important for many aspects of modern life. They drive scientific research in biology and physics, facilitate industrial rapid prototyping, and provide information relevant to everyday life such as weather forecasts. Their computational power grows steadily to provide faster response times and to satisfy the demand for higher complexity in simulation models as well as more details and higher resolutions in visualizations. For some years now, the prevailing trend for these large systems has been the utilization of additional processors, like graphics processing units. These heterogeneous systems, which employ more than one kind of processor, are becoming increasingly widespread since they provide many benefits, like higher performance or increased energy efficiency. At the same time, they are more challenging and complex to use because the various processing units differ in their architecture and programming model. This heterogeneity is often addressed by abstraction, but existing approaches often entail restrictions or are not universally applicable. As these systems also grow in size and complexity, they become more prone to errors and failures. Therefore, developers and users become more interested in resilience in addition to traditional aspects, like performance and usability. While fault tolerance is well researched in general, it is mostly dismissed in distributed visualization or not adapted to its special requirements. Finally, analysis and tuning of these systems and their software is required to assess their status and to improve their performance.
The available tools and methods to capture and evaluate the necessary information are often isolated from the context or not designed for interactive use cases. These problems are amplified in heterogeneous computing environments, since more data is available and required for the analysis. Additionally, real-time feedback is required in distributed visualization to correlate user interactions to performance characteristics and to decide on the validity and correctness of the data and its visualization. This thesis presents contributions to all of these aspects. Two approaches to abstraction are explored for general purpose computing on graphics processing units and visualization in heterogeneous computing environments. The first approach hides details of different processing units and allows using them in a unified manner. The second approach employs per-pixel linked lists as a generic framework for compositing and simplifying order-independent transparency for distributed visualization. Traditional methods for fault tolerance in high performance computing systems are discussed in the context of distributed visualization. On this basis, strategies for fault-tolerant distributed visualization are derived and organized in a taxonomy. Example implementations of these strategies, their trade-offs, and resulting implications are discussed. For analysis, local graph exploration and tuning of volume visualization are evaluated. Challenges in dense graphs like visual clutter, ambiguity, and inclusion of additional attributes are tackled in node-link diagrams using a lens metaphor as well as supplementary views. An exploratory approach for performance analysis and tuning of parallel volume visualization on a large, high-resolution display is evaluated. This thesis takes a broader look at the issues of distributed visualization on large displays and heterogeneous computing environments for the first time. 
While the presented approaches each solve individual challenges and are successfully employed in this context, their joint utility forms a solid basis for future research in this young field. In its entirety, this thesis presents building blocks for robust distributed visualization on current and future heterogeneous visualization environments.

Item Open Access
Position sharing for location privacy in non-trusted systems (2015)
Skvortsov, Pavel; Rothermel, Kurt (Prof. Dr. rer. nat. Dr. h.c.)
Currently, many location-aware applications are available for mobile users of location-based services. Applications such as Google Now, Trace4You or FourSquare are being widely used in various environments where privacy is a critical issue for users. A general solution for preserving location privacy for a user is to degrade the quality of his or her position information. In this work, we propose an approach that uses spatial obfuscation to secure the users’ position information. When the user’s position is revealed with a certain degree of obfuscation, the first crucial issue is the tradeoff between privacy and precision. This tradeoff problem is caused by limited trust in the location service providers: higher obfuscation increases privacy but leads to lower quality of service. We overcome this problem by introducing the position sharing approach. Our main idea is that position information is distributed amongst multiple providers in the form of separate data pieces called position shares. Our approach allows for the usage of non-trusted providers and flexibly manages the user’s location privacy level based on probabilistic privacy metrics. In this work, we present the multi-provider based position sharing approach, which includes algorithms for the generation of position shares and share fusion algorithms. The second challenge that must be addressed is that the user’s environmental context can significantly decrease the level of obfuscation.
For example, a plane, a boat and a car create different requirements for the obfuscated region. Therefore, it is very important to consider map-awareness in selecting the obfuscated areas. We assume that a static map is known to an adversary, which may help in deriving the user’s true position. We analyze both how map-awareness affects the generation and fusion of position shares and the difference between the map-aware position sharing approach and its open space based version. Our security analysis shows that the proposed position sharing approach provides good security guarantees for both open space and constrained space based models. The third challenge is that multiple location servers and/or their providers may have different trustworthiness from the user’s point of view. In this case, the user would prefer not to reveal an equal level (precision) of position information to every server. We propose a placement optimization approach that ensures that risk is balanced among the location servers according to their individual trust levels. Our evaluation shows significant improvement of privacy guarantees after applying the optimized share distribution, in comparison with the equal share distribution. The fourth related problem is the location update algorithm. A high number of different location servers n (corresponding to n privacy levels) may lead to significant communication overhead. Each update would require n messages from the mobile user to the location servers, which is especially costly at high update rates. Therefore, we propose an optimized location update algorithm to decrease the number of messages sent without reducing the number of privacy levels and the user’s privacy.

Item Open Access
A flexible framework for multi physics and multi domain PDE simulations (2015)
Müthing, Steffen; Bastian, Peter (Prof. Dr.)
Many important problems in physics and engineering like fluid dynamics and continuum mechanics are modeled using partial differential equations.
These problems can typically not be solved directly, but have to be approximated numerically, a challenging process at both the mathematical and the computer science level. In this work, we present a novel set of software components that facilitate the creation of simulation programs for multi domain partial differential equation problems. We identify the implementation challenges related to the coupling of multiple spatial domains and their attached physical problems and develop a mathematical framework of clearly defined building blocks that can be used to compose a multi domain problem by combining single physics building blocks (which are typically already well understood by application scientists) with additional components that describe the interactions between those subproblems. We introduce an open source software implementation of these mathematical concepts on top of the well-established DUNE numerics framework. This implementation consists of two major parts: a mechanism to subdivide any existing DUNE mesh into multiple subdomains, and a set of extensions to the high-level partial differential equation toolbox solver PDELab, which make the components of our mathematical framework available within its solvers. Our overall design enables application-level scientists to reuse existing code blocks from single-physics simulations and combine them to solve new multi domain problems. This new functionality is heavily based on PDELab’s recursive tree representation of product function spaces; we replace the internal ad-hoc implementation of these trees with a new C++ library for statically defined, template-based trees of objects. 
As multi domain problems typically require structured linear algebra solvers that exploit domain decomposition approaches, we develop a mathematical framework for describing the structure of the vectors and matrices generated during the assembly of a partial differential equation problem based on the structure of the underlying function spaces. This framework is implemented in PDELab; it is based on a tree transformation mechanism provided by our tree library. We demonstrate the versatility of our multi domain simulation components and their impact on developer productivity by means of two model examples; our ultimate goal of simplifying the development of real-world applications is shown by a description of the impact of our software on several external research projects. Finally, we measure the performance impact of our extensions on the existing DUNE framework and discuss the mitigation measures we implemented to reduce any existing performance penalties.

Item Open Access
Crawling von Enterprise Topologien zur automatisierten Migration von Anwendungen : eine Cloud-Perspektive [Crawling of enterprise topologies for the automated migration of applications: a cloud perspective] (2015)
Binz, Tobias; Leymann, Frank (Prof. Dr.)
Today, the ability to adapt IT quickly to changing requirements while reducing costs determines an organization's competitiveness. A prerequisite for this is a technically detailed insight into the entire IT landscape, that is, an instance model of all components and their relationships to one another. Since organizations usually do not maintain this kind of documentation, such IT instance models are typically missing, incomplete, or outdated. One reason is that manually identifying components and their relationships is a very time-consuming, error-prone, and therefore costly task. Besides hindering the adaptation of IT in general, this also complicates the migration of applications, which is in high demand due to the trend of outsourcing IT to the cloud.
The vision of this thesis is to provide a technically detailed, complete, and up-to-date insight into IT and to use it to enable the automated migration of applications. To this end, the thesis presents a method for automatically crawling an instance model of an organization's entire IT. For its representation, management, and processing, the Enterprise Topology Graph (ETG) is introduced, a metamodel that represents all applications, the components required to operate them, and the relationships among them. ETGs and their automated crawling provide a comprehensive and complete insight into an organization's IT and thus form a solid foundation for its analysis, adaptation, and optimization. Building on this, a method for migrating applications (AROMA) is developed that makes it possible to benefit from the advantages of advanced IT environments without having to redevelop these applications. After the ETG of the source environment has been crawled, the AROMA method extracts, transforms, evaluates, and adapts the application to be migrated and deploys it in the target environment, for example a cloud. Implementing the AROMA method with the OASIS standard TOSCA contributes to automating the migration and preserves the functionality of the application. The research contributions and prototypes are validated through several case studies and evaluated with respect to automation, correctness, applicability, extensibility, and the improvement of the application's cloud properties and portability.
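
The idea of crawling an instance model into a graph of components and relationships, as described in the last abstract, can be illustrated with a small sketch. The following is a minimal breadth-first crawl under stated assumptions: the names `Node`, `Edge`, `TopologyGraph`, `crawl`, the `discover` callback, and the toy environment are all hypothetical and chosen for illustration only; they are not taken from the thesis, its prototypes, or the TOSCA standard.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Node:
    """A component in the topology (hypothetical model, not the thesis's metamodel)."""
    node_id: str
    node_type: str


@dataclass(frozen=True)
class Edge:
    """A typed, directed relationship between two components."""
    source: str
    relation: str
    target: str


@dataclass
class TopologyGraph:
    """Container for the crawled components and their relationships."""
    nodes: Dict[str, Node] = field(default_factory=dict)
    edges: Set[Edge] = field(default_factory=set)


# A discovery function inspects one node and reports (relation, neighbor) pairs.
# In a real crawler this would query the environment; here it is an assumption.
Discover = Callable[[Node], Iterable[Tuple[str, Node]]]


def crawl(seed: Node, discover: Discover) -> TopologyGraph:
    """Breadth-first expansion from a seed node; each node is inspected once."""
    graph = TopologyGraph()
    graph.nodes[seed.node_id] = seed
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for relation, neighbor in discover(current):
            graph.edges.add(Edge(current.node_id, relation, neighbor.node_id))
            if neighbor.node_id not in graph.nodes:
                graph.nodes[neighbor.node_id] = neighbor
                queue.append(neighbor)
    return graph


# Toy environment standing in for real discovery logic (purely illustrative).
APP = Node("shop", "WebApplication")
SRV = Node("tomcat-1", "ApplicationServer")
DB = Node("mysql-1", "Database")
VM = Node("vm-7", "VirtualMachine")
FAKE_ENV: Dict[str, List[Tuple[str, Node]]] = {
    "shop": [("hostedOn", SRV), ("connectsTo", DB)],
    "tomcat-1": [("hostedOn", VM)],
    "mysql-1": [("hostedOn", VM)],
    "vm-7": [],
}


def fake_discover(node: Node) -> Iterable[Tuple[str, Node]]:
    """Look up the neighbors of a node in the toy environment."""
    return FAKE_ENV[node.node_id]


if __name__ == "__main__":
    graph = crawl(APP, fake_discover)
    for edge in sorted(graph.edges, key=lambda e: (e.source, e.relation)):
        print(edge.source, edge.relation, edge.target)
```

Passing discovery in as a callback keeps the traversal separate from whatever inspects a given component type, which is one plausible way to keep such a crawler extensible; whether the thesis structures its crawler this way is not stated in the abstract.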