Universität Stuttgart

Permanent URI for this community: https://elib.uni-stuttgart.de/handle/11682/1

Search Results

Now showing 1 - 10 of 293
  • Item (Open Access)
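    Integration von Data Mining und Online Analytical Processing : eine Analyse von Datenschemata, Systemarchitekturen und Optimierungsstrategien [Integration of Data Mining and Online Analytical Processing: An Analysis of Data Schemas, System Architectures, and Optimization Strategies]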
    (2003) Schwarz, Holger; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
    The technical means of capturing and permanently storing data are now so mature that large data collections are available, particularly in companies and other organizations. These data collections, often referred to as data warehouses, contain all relevant information about the organizations themselves, the processes running within them, and their interactions with other organizations. In many cases, the targeted analysis of these data is the decisive success factor for an organization. A wide variety of proven approaches are available for analyzing the data in a data warehouse. Two of the most important are Online Analytical Processing (OLAP) and data mining. The two set different priorities and have so far generally been used largely in isolation. This thesis first shows that a comprehensive analysis of the data in a data warehouse can only be achieved through the integrated use of both analysis approaches. Individual questions arising from this need for integration are discussed in detail. One of the questions considered is the appropriate modeling of the data in a data warehouse. The evaluation of common modeling approaches takes particular account of the requirements that result from the integration approach described. As a result, a conceptual data model is presented that structures information in a way that is equally suitable for OLAP and data mining. In the area of logical modeling, those schema types are then identified that suitably support the integration of the analysis approaches. In the next step, the differing system architectures for data mining and OLAP are examined. A comprehensive discussion of them reveals a number of deficiencies.
This ultimately leads to an extended system architecture that eliminates the weaknesses and suitably supports the intended integration. The extended system architecture includes a component for the application-independent optimization of different analysis applications. A third focus of this thesis is the identification of suitable optimization approaches for this component. The approaches are evaluated qualitatively on the one hand. On the other hand, the optimization potential of the individual approaches is also demonstrated on the basis of extensive series of measurements.
  • Item (Open Access)
    A design space for pervasive advertising on public displays
    (2013) Alt, Florian; Schmidt, Albrecht (Prof. Dr.)
    Today, people living in cities see up to 5000 ads per day, and many of them are presented on public displays. More and more of these public displays are networked and equipped with various types of sensors, making them part of a global infrastructure that is currently emerging. Such networked and interactive public displays provide the opportunity to create a benefit for society in the form of immersive experiences and relevant content. In this way, they can overcome the display blindness that has evolved among passersby over the years. We see two main reasons that prevent this vision from coming true: first, public displays are stuck with traditional advertising as the driving business model, making it difficult for novel, interactive applications to enter the scene. Second, no common ground exists for researchers or advertisers that outlines important challenges. The provider view and the audience view both need to be addressed to make open, interactive display networks successful. The main contribution of this thesis is a design space for advertising on public displays that identifies important challenges, mainly from a human-computer interaction perspective. Solutions to these core challenges are presented and evaluated using empirical methods commonly applied in HCI. First, we look at challenges that arise from the shared use of display space. We conducted an observational study of traditional public notice areas that allowed us to identify different stakeholders, to understand their needs and motivations, to unveil current practices used to exercise control over the display, and to understand the interplay between space, stakeholders, and content. We present a set of design implications for open public display networks that we applied when implementing and evaluating a digital public notice area.
Second, we tackle the challenge of making the user interact by taking a closer look at attracting attention, communicating interactivity, and enticing interaction. Attracting attention is crucial for any further action to happen. We present an approach that exploits gaze as a powerful input modality. By adapting content based on gaze, we are able to show a significant increase in attention and an effect on the user's attitude. In order to communicate interactivity, we show that the mirror representation of the user is a powerful interactivity cue. Finally, in order to entice interaction, we show that the user needs to be motivated to interact and to understand how interaction works. Findings from our experiments reveal direct touch and the mobile phone as suitable interaction technologies. In addition, these findings suggest that relevance of content, privacy, and security have a strong influence on user motivation. Third, this thesis makes a set of contributions towards understanding audience behavior, which is particularly important for advertisers in order to choose appropriate content and to select suitable locations for future advertising displays. Our findings provide an in-depth understanding of the honeypot effect as a powerful interactivity cue. Furthermore, we identify a number of interesting effects (e.g., the landing effect) and explain how developers could design for them. We envision the results of this thesis to provide a basis for future research and for practitioners to shape future advertisements on public displays in a positive way.
  • Item (Open Access)
    Scalable computer network emulation using node virtualization and resource monitoring
    (2011) Maier, Steffen Dirk; Rothermel, Kurt (Prof. Dr. rer. nat. Dr. h. c.)
    Ongoing development of computer network technology requires new communication protocols on all layers of the protocol stack to adapt to and to exploit technology specifics. The performance of new protocol implementations has to be evaluated before deployment. Computer network emulation enables the execution of real unmodified protocol implementations within a configurable synthetic environment. Since network properties are reproduced synthetically, emulation supports reproducible measurement results for wired and wireless networks. Meaningful evaluation scenarios typically involve a large number of communicating nodes. Reproducing the network properties of the medium access control layer can be accomplished efficiently on cheap commodity off-the-shelf computers and allows the evaluation of network protocols, transport protocols, and applications. However, meaningful emulation scenario sizes often require more nodes than there are affordable computers. To scale the number of nodes in an emulation scenario beyond the available computers, we discuss approaches to virtualization and operating system partitioning. Focusing on the latter, we argue for virtual protocol stacks, which provide an extremely lightweight node virtualization enabling the execution of multiple instances of the software under evaluation on each physical computer. To connect virtual nodes on the same and on different computers, we design and implement a highly efficient software communication switch. A centralized emulation control component distributes dynamic network property updates, which result from node mobility, for instance. To handle the large number of nodes and the resulting increase in updates, we propose a hierarchical control scheme in which the central component delegates updates to sub-components distributed over the computers of an emulation system. Extensive evaluations show the scalability of our virtualized network emulation system. Virtual nodes executed on the same computer share its limited resources.
Hosting too many virtual nodes on the same computer may lead to resource contention. This can cause unrealistic measurement results and is thus undesirable. Discussing different approaches to handle resource contention, we argue for detection and recovery. We define quality criteria that allow the detection of resource contention. In order to observe those quality criteria during emulation experiments, we propose a highly lightweight monitoring approach. Our monitoring is based on instrumenting an operating system kernel and observing basic resource scheduling events. This enables the detection of even peak resource usage within a split second. Thorough evaluations demonstrate the effectiveness of quality criteria and monitoring as well as the negligible overhead of our monitoring approach.
  • Item (Open Access)
    Visualization and mesoscopic simulation in systems biology
    (2013) Falk, Martin Samuel; Ertl, Thomas (Prof. Dr.)
    A better understanding of the internal mechanisms and interplays within a single cell is key to the understanding of life. The focus of this thesis lies on the mechanism of cellular signal transduction, i.e. relaying a signal from outside the cell by different means of transport toward its target inside the cell. Besides experiments, understanding can also be achieved by numerical simulations of cellular behavior which require theoretical models to be designed and evaluated. This is where systems biology closely relates and depends on recent research results in computer science in order to deal with the modeling, the simulation, and the analysis of the computational results. Since a single cell can consist of billions of atoms, the simulation of intracellular processes requires a simplified, mesoscopic model. The simulation domain has to be three dimensional to consider the spatial, possibly asymmetric, intracellular architecture filled with individual particles representing signaling molecules. In contrast to continuous models defined by systems of partial differential equations, a particle-based model allows tracking individual molecules moving through the cell. The overall process of signal propagation usually requires between minutes and hours to complete, but the movement of molecules and the interactions between them have to be determined in the microsecond range. Hence, the computation of thousands of consecutive time steps is necessary, requiring several hours or even days of computational time for a non-parallel simulation. To speed up the simulation, the parallel hardware of current central processing units (CPUs) and graphics processing units (GPUs) can be employed. Finally, the resulting data has to be analyzed by domain experts and, therefore, has to be represented in meaningful ways. Typical prevalent analysis methods include the aggregation of the data in tables or simple 2D graph plots, sometimes 3D plots for continuous data. 
Despite the fact that techniques for interactive visualization of data in 3D are well known, so far none of these methods has been applied to the biological context of single-cell models, and specialized visualizations fitted to the experts’ needs are missing. Another issue is the hardware available to the domain experts for visualizing the increasing amount of time-dependent data resulting from simulations. It is important that the visualization keeps up with the simulations to ensure that domain experts can still analyze their data sets. To deal with the massive amount of data to come, compute clusters with specialized hardware dedicated to data visualization will be necessary. It is thus important to develop visualization algorithms for this dedicated hardware, which is currently available in the form of GPUs. In this thesis, the computational power of recent many-core architectures (CPUs and GPUs) is harnessed for both the simulation and the visualizations. Novel parallel algorithms are introduced to parallelize the spatio-temporal, mesoscopic particle simulation to fit the architectures of CPU and GPU in a similar way. Besides molecular diffusion, the simulation considers extracellular effects on the signal propagation as well as the import of molecules into the nucleus and a dynamic cytoskeleton. An extensive comparison between different configurations is performed, leading to the conclusion that the usage of GPUs is not always beneficial. For the visual data analysis, novel interactive visualization techniques were developed to visualize the 3D simulation results. Existing glyph-based approaches are combined in a new way, facilitating the visualization of the individual molecules in the interior of the cell as well as their trajectories. A novel implementation of the depth-of-field effect combined with additional depth cues and coloring aids visual perception and reduces visual clutter.
To obtain a continuous signal distribution from the discrete particles, techniques known from volume rendering are employed. The visualization of the underlying atomic structures provides new detailed insights and can be used for educational purposes besides showing the original data. A microscope-like visualization allows for the first time to generate images of synthetic data similar to images obtained in wet lab experiments. The simulation and the visualizations are merged into a prototypical framework, thereby supporting the domain expert during the different stages of model development, i.e. design, parallel simulation, and analysis. Although the proposed methods for both simulation and visualization were developed with the study of single-cell signal transduction processes in mind, they are also applicable to models consisting of several cells and other particle-based scenarios. Examples in this thesis include the diffusion of drugs into a tumor, the detection of protein cavities, and molecular dynamics data from laser ablation simulations, among others.
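The particle-based model described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes free Brownian motion inside a spherical cell, arbitrary units, and a simple rule that rejects any step leaving the sphere. The function names `diffuse` and `simulate` and all parameter values are hypothetical. The only physics used is the standard per-axis displacement with standard deviation sqrt(2 D dt).

```python
import math
import random


def diffuse(particles, diff_coeff, dt, radius, rng):
    """Advance every particle by one Brownian time step inside a sphere."""
    sigma = math.sqrt(2.0 * diff_coeff * dt)  # per-axis step standard deviation
    moved = []
    for x, y, z in particles:
        nx = x + rng.gauss(0.0, sigma)
        ny = y + rng.gauss(0.0, sigma)
        nz = z + rng.gauss(0.0, sigma)
        # Assumption: a step that would leave the cell is simply rejected.
        if nx * nx + ny * ny + nz * nz <= radius * radius:
            moved.append((nx, ny, nz))
        else:
            moved.append((x, y, z))
    return moved


def simulate(n_particles=200, steps=1000, diff_coeff=1.0, dt=1e-3,
             radius=5.0, seed=0):
    """Track individual particles that all start at the cell centre."""
    rng = random.Random(seed)
    particles = [(0.0, 0.0, 0.0)] * n_particles
    for _ in range(steps):
        particles = diffuse(particles, diff_coeff, dt, radius, rng)
    return particles
```

Each time step touches every particle independently, which is the property that makes this kind of simulation a natural candidate for the CPU and GPU parallelization the abstract mentions.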
  • Item (Open Access)
    Increased flexibility and dynamics in distributed applications and processes through resource decoupling
    (2014) Kipp, Alexander; Resch, Michael (Prof. Dr.-Ing.)
    The continuously increasing complexity of products and services requires more and more specialised expertise as well as relevant support by specialised IT tools and services. However, these services require expert knowledge as well, particularly in order to apply and use these services and tools in an efficient and optimal way. To this end, this thesis introduces a new virtualisation approach, allowing both the transparent integration of services in abstract process description languages and the role-based integration of human experts in these processes. The developed concept of this thesis has been realised by:
      - Enhancing the concept of web services with a service virtualisation layer, allowing for the transparent usage, adaptation and orchestration of services
      - Enhancing the developed concept towards a “Dynamic Session Management” environment, enabling the transparent and role-based integration of human experts following the SOA paradigm
      - Developing a collaboration schema, allowing for setting up and steering synchronous collaboration sessions between human experts. This enhancement also considers the respective user context and provides the most suitable IT-based tooling support.
    The developed concept has been applied to scientific and economic application fields with a respective reference realisation.
  • Item (Open Access)
    Investigating dynamics by multilevel phase space discretization
    (2006) Fundinger, Danny Georg; Levi, Paul (Prof. Dr.)
    The subject of the thesis is the numerical investigation of dynamical systems. The aim is to provide approaches for the localization of several topological structures which are of vital importance for the global analysis of dynamical systems, namely, periodic orbits, the chain recurrent set, repellers, attractors and their domains of attraction as well as stable, unstable and connecting manifolds. The techniques introduced do not require any a priori knowledge about a system, and are also not restricted by the stability of the solution. Furthermore, they can generally be applied to a wide range of dynamical systems. Two theoretical concepts are considered to be at the center of the research - symbolic analysis and the RIM method. The underlying basic approach for both of them is multilevel phase space discretization. This means that a part of the phase space, the area of investigation, is subdivided in a finite number of sets. Then, instead of each point of the phase space, only these sets are subject of further analysis. The main target of every method proposed is to find those sets which contain parts of the solution and subdivide them into smaller parts until a desired accuracy is reached. In case of symbolic analysis, a directed graph is constructed which represents the structure of the state space for the investigated dynamical system. This graph is called the symbolic image of the focused system and can be seen as an approximation of the system flow. The theoretical background regarding the symbolic image graph as well as the constructive methods applied on it were already described in a series of works by G. Osipenko. In this work, strategies are introduced for a practical application. This requires the extension of the theoretical concepts and the development of appropriate algorithms and data structures. In practice, it turned out that these aspects are essential cornerstones for the usability of the discussed methods. 
Some sophisticated tunings of the basic methods are also proposed in order to extend the field of practical investigation. Although symbolic analysis can be seen as the main stimulus for this work, the investigation was not limited to it. Indeed, several shortcomings regarding the solution of some problems can be observed if the method is applied in practice. This led to the development of the RIM method. The core intention of the method is to solve the root-finding problem. The standard approach to this task is the application of an iteration scheme based on the Newton method. However, it has been shown that such Newton schemes have several structural disadvantages, which are especially crucial in the fields of investigation relevant to this work. The RIM method proposes an alternative approach that does not require the application of any Newton-like method. Numerical case studies revealed that in several nontrivial scenarios the RIM method provides better results than both symbolic analysis and Newton-based methods. Two applications of the RIM method for the investigation of dynamical systems are provided. One of them is the detection of periodic points. The other is the computation of stable manifolds. The proposed methods contribute not only to the direct investigation and simulation of specific dynamical processes but also to research in the field of dynamical systems theory in general. This is because progress in theory depends to a large extent on the observation and investigation of phenomena. These phenomena can often only be revealed, analyzed, and verified by numerical experiments. The presented numerical case studies give some concrete examples of the application of the methods. The dynamical models are taken from different fields of scientific research, such as geography, biology, meteorology, and physics.
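The multilevel discretization idea from this abstract can be sketched for a one-dimensional map. This is an illustrative toy, not the algorithms or data structures developed in the thesis: it assumes interval cells, approximates each cell image by a handful of sample points, and uses a simple reachability search instead of an efficient strongly-connected-components algorithm. All function names and parameters are hypothetical.

```python
from bisect import bisect_right


def build_symbolic_image(f, cells, samples=10):
    """Directed graph: edge i -> j if a sample point of cell i maps into cell j."""
    lows = [lo for lo, _ in cells]
    graph = {i: set() for i in range(len(cells))}
    for i, (lo, hi) in enumerate(cells):
        for k in range(samples):
            x = lo + (hi - lo) * k / (samples - 1)  # includes both end points
            y = f(x)
            j = bisect_right(lows, y) - 1           # candidate cell for y
            if j >= 0 and cells[j][0] <= y <= cells[j][1]:
                graph[i].add(j)
    return graph


def cells_on_cycles(graph):
    """Return the vertices that can reach themselves (recurrent vertices)."""
    recurrent = []
    for v in graph:
        stack, seen = list(graph[v]), set()
        while stack:
            u = stack.pop()
            if u == v:
                recurrent.append(v)
                break
            if u not in seen:
                seen.add(u)
                stack.extend(graph[u])
    return recurrent


def localize_recurrent_set(f, lo, hi, initial_cells=8, levels=6):
    """Repeatedly keep the recurrent cells and split each of them in two."""
    width = (hi - lo) / initial_cells
    cells = [(lo + i * width, lo + (i + 1) * width) for i in range(initial_cells)]
    for _ in range(levels):
        graph = build_symbolic_image(f, cells)
        refined = []
        for i in cells_on_cycles(graph):
            a, b = cells[i]
            mid = 0.5 * (a + b)
            refined += [(a, mid), (mid, b)]
        cells = refined
    return cells
```

For example, `localize_recurrent_set(lambda x: x * x, -1.3, 1.7)` returns a short list of small intervals clustered around the two fixed points 0 and 1 of the map. Cells that lie on no cycle of the graph are discarded at each level, so the surviving cells form a shrinking neighbourhood of the recurrent behaviour without any a priori knowledge of where it lies.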
  • Item (Open Access)
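    Optimierung datenintensiver Workflows: Konzepte und Realisierung eines heuristischen, regelbasierten Optimierers [Optimization of Data-Intensive Workflows: Concepts and Realization of a Heuristic, Rule-Based Optimizer]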
    (2011) Vrhovnik, Marko; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
    To simplify the modeling of data-intensive workflows that process large volumes of relational data, leading vendors of workflow and database management systems have extended workflow description languages such as BPEL with SQL functionality. As a result, data processing operations such as SQL statements or calls to user-defined procedures no longer have to be encapsulated in web services but can be defined directly at the workflow level. This opens up a new opportunity for query optimization that complements existing optimization approaches in database systems: suboptimally modeled data processing operations in a workflow description can be transformed using restructuring rules so that a workflow or database management system can execute them considerably more efficiently. This doctoral thesis presents concepts for realizing a heuristic, rule-based optimizer for data-intensive workflows. The optimizer applies a rule base, according to a well-defined control strategy, to an internal representation of data-intensive workflows, the so-called process graph model (PGM), in order to optimize the data processing of a data-intensive workflow. PGM allows an efficient and language-independent definition and application of the restructuring rules and thus supports the optimization of data processing operations that may be defined in different description languages. The rule base contains restructuring rules that build on existing and new optimization strategies. In particular, the restructuring rules exploit knowledge about dependencies in a workflow description to optimize the data processing operations embedded in it while preserving the original execution semantics of the data-intensive workflow.
The control strategy determines which restructuring rules are applied in which order to which parts of a workflow description, in order both to exploit the optimization potential of a data-intensive workflow comprehensively and to ensure the correctness of the rule applications. The detailed description of the process graph model, the rule base, and the control strategy is at the center of this thesis. Furthermore, a prototypical implementation of the optimization approach is presented, which underlines its practical applicability. Finally, the effectiveness of the individual restructuring rules is examined using various measurement scenarios. These show that applying the restructuring rules can yield performance improvements of several orders of magnitude.
  • Item (Open Access)
    Decoding strategies for syntax-based statistical machine translation
    (2015) Braune, Fabienne; Maletti, Andreas (Dr.)
    Provided with a sentence in an input language, a human translator produces a sentence in the desired target language. The advances in artificial intelligence in the 1950s led to the idea of using machines instead of humans to generate translations. Based on this idea, the field of Machine Translation (MT) was created. The first MT systems aimed to map input text into the target translation through the application of hand-crafted rules. While this approach worked well for specific language pairs in restricted fields, it was hardly extendable to new languages and domains because of the huge amount of human effort necessary to create new translation rules. The increase in computational power enabled Statistical Machine Translation (SMT) in the late 1980s, which addressed this problem by learning translation units automatically from large text collections. Statistical machine translation can be divided into several paradigms. Early systems modeled translation between words, while later work extended these to sequences of words called phrases. A common point between word-based and phrase-based SMT is that the translation process takes place sequentially, which is not well suited to translating between languages where words need to be reordered over (potentially) long distances. Such reorderings led to the implementation of SMT systems based on formalisms that allow translating recursively instead of sequentially. In these systems, called syntax-based systems, the translation units are modeled with formal grammar productions, and translation is performed by assembling the productions of these grammars. This thesis contributes to the field of syntax-based SMT in two ways: (i) the applicability of a new grammar formalism is tested by building the first SMT system based on the local Multi Bottom-Up Tree Transducer (l-MBOT); (ii) new ways to integrate linguistic annotations in the translation model (instead of the grammar rules) of syntax-based systems are developed.
  • Item (Open Access)
    Efficient programmable deterministic self-test
    (2010) Hakmi, Abdul-Wahid; Wunderlich, Hans-Joachim (Prof. Dr. habil.)
    Integrated circuits (ICs) are used in almost all electronic equipment, ranging from household appliances to space shuttles, and have revolutionized the world of electronics. Continuous reductions in the manufacturing costs as well as the size of this technology have allowed the development of very sophisticated ICs for common use. Post-fabrication testing is necessary for each IC in order to ensure quality and the safety of human life. Improvements in technology as well as economies of scale are continuously reducing fabrication costs. On the other hand, the increasing complexity of circuits is leading to higher test costs. These increasing test costs affect the market price of a chip. A test set is a set of binary patterns that are applied to the circuit inputs to detect potential faults. Only a small number of bits in a test set, called care bits, are specified as 0 or 1, while the other bits, called don't care bits, may assume random values. Test set volume is characterized by the number of patterns as well as the size of each pattern in a test set. The increasing number of gates in nanometer ICs has resulted in an explosive increase in test set volume. This increase in test set volume is the major cause of rapidly growing test costs. An IC is tested either by using automatic test equipment (ATE) or with the help of special hardware added on-chip that performs a self-test. These two approaches as well as their hybrid derivatives offer various trade-offs in test costs, quality, reliability, and test time. In ATE testing, high test set volume leads to the requirement of expensive testers with large storage capacity, while in self-test it results in significant hardware overhead. A test set is highly compressible due to the presence of a large number of don't care bits. Test data compression techniques are used to limit test set volume and hence the test cost involved.
These compressed test sets are applicable to both ATE and self-test methodologies. The compression of a test set depends on its statistical attributes, such as the percentage and the distribution of care bits. The available test compression schemes assume that all test sets have similar statistical attributes, which is not always true. These attributes vary considerably among test sets depending on the circuit structure and the targeted trade-offs. To obtain an optimized reduction in test set volume, test sets with different statistical attributes have to be addressed separately. In this work, we analyze various test sets of industrial circuits and categorize them into three classes based on their statistical attributes. By examining each class separately, three novel compression methods and decompression architectures are proposed. The proposed test compression methods are equally adaptable to ATE testing and self-test. Three low-cost programmable self-test schemes offering various trade-offs in testing are developed by applying these methods. The experimental results obtained with the test sets of large industrial circuits show that the proposed compression methods reduce storage requirements by more than half compared to the most efficient available methods. For the first time in the literature, the total number of bits in a compressed test set is less than the number of care bits in the original test set. Additional advantages of the proposed methods include guaranteed encoding, a significant reduction in decompression time overhead, and programmability of the decompression hardware.
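The two quantities this abstract relies on, test set volume and the percentage of care bits, can be made concrete with a small sketch. It assumes a test set written as equal-length strings over `0`, `1`, and `X` (with `X` marking a don't care bit); this representation and the function name `care_bit_stats` are illustrative assumptions, and the sketch does not implement any of the compression methods proposed in the thesis.

```python
def care_bit_stats(test_set):
    """Summarise a test set given as equal-length strings over '0', '1', 'X'."""
    pattern_length = len(test_set[0])
    if any(len(p) != pattern_length for p in test_set):
        raise ValueError("all patterns must have the same length")
    volume = len(test_set) * pattern_length  # total number of bits
    care_bits = sum(p.count("0") + p.count("1") for p in test_set)
    return {
        "patterns": len(test_set),
        "volume": volume,
        "care_bits": care_bits,
        "care_bit_percentage": 100.0 * care_bits / volume,
    }
```

Calling `care_bit_stats(["1XX0XXXX", "XXXXXX1X", "0XXXXXXX"])` reports a volume of 24 bits of which 4 are care bits. A low care-bit percentage like this is what makes a test set highly compressible.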
  • Item (Open Access)
    A light weighted semi-automatically I/O-tuning solution for engineering applications
    (Stuttgart : Höchstleistungsrechenzentrum, Universität Stuttgart, 2017) Wang, Xuan; Resch, Michael M. (Prof. Dr.-Ing. Dr. h.c. Dr. h.c. Prof. E.h.)
    Today’s engineering applications running on high performance computing (HPC) platforms generate more and more diverse data simultaneously and require large storage systems as well as extremely high data transfer rates to store their data. To achieve high data transfer rates (I/O performance), computer scientists together with HPC manufacturers have developed many innovative solutions. However, transferring the knowledge of these solutions to engineers and scientists has become one of the largest barriers. Since engineers and scientists are experts in their own professional areas, they might not be capable of tuning their applications to the optimal level. Sometimes they might even degrade the I/O performance by mistake. The basic training courses provided by computing centers like HLRS do not seem sufficient to transfer the required know-how. In order to overcome this barrier, I have developed a semi-automatic I/O-tuning solution (SAIO) for engineering applications. SAIO, a lightweight and intelligent framework, is designed to be compatible with as many engineering applications as possible, scalable with large engineering applications, usable for engineers and scientists with little knowledge of parallel I/O, and portable across multiple HPC platforms. Building upon the MPI-IO library allows SAIO to be compatible with MPI-IO based high-level I/O libraries, such as parallel HDF5 and parallel NetCDF, as well as proprietary and open source software, like Ansys Fluent and the WRF Model. In addition, SAIO follows the current MPI standard, which makes it portable across many HPC platforms and scalable. SAIO, which is implemented as a dynamic library and loaded dynamically, does not require recompiling or changing the application's source code. By simply adding several export directives to their job submission scripts, engineers and scientists will be able to run their jobs more efficiently.
Furthermore, an automated SAIO training utility keeps the optimal configurations up to date without any manual effort from the user.