05 Fakultät Informatik, Elektrotechnik und Informationstechnik
Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6
Item Open Access: Visual prediction of quantitative information using social media data (2017), Fatehi Ebrahimzadeh, Hamed

In recent years, the availability of vast amounts of user-generated data via social media has given researchers the opportunity to analyze these data sources and discover meaningful information. However, processing and understanding this immense amount of data is challenging and calls for automated approaches, as well as the involvement of field experts who contribute their domain knowledge and experience to the analysis. So far, existing approaches only enable the detection of indicative information in the data, such as the occurrence of critical incidents or relevant situation reports. Consequently, the next step is to better relate the user-provided information to real-world quantities. In this work, a predictive visual analytics approach is developed that offers semi-automated methods to estimate quantitative information (e.g., the number of people participating in a public event). First, the approach provides interactive visual tools to explore social media data in time and space and to interactively select the features required as input for training and prediction. Next, a suitable model can be trained on these feature sets and applied for prediction. Finally, the approach also allows users to visually explore prediction results and to measure the quality of predictions against ground-truth information obtained from past observations. The result of this work is a generic visual analytics approach that provides expert users with visual tools for continuous interaction between human and machine to produce quantitative predictions from social media data. The prediction results are promising, especially in cases where the location, time, and other information related to public events are considered together with the content of the user-generated data.

Item Open Access: Integration of IoT devices via a blockchain-based decentralized application (2017), Ahmad, Afzaal

Blockchains are shared, immutable ledgers for recording the history of transactions. They foster a new generation of transactional applications that establish trust, accountability, and transparency, and they enable contract partners to secure a deal without involving a trusted third party. Initially, the focus was on the financial industry and digital asset trading, as with Bitcoin, but with the emergence of smart contracts, blockchains have become fully programmable platforms. Many research and commercial organizations have started diving into the blockchain world, bringing new ideas for its application in sectors such as supply chains, health care, and autonomous shopping. This thesis presents an approach to integrate Internet of Things (IoT) devices via a blockchain-based decentralized application built on Ethereum. The application consists of a front-end application, which can be deployed to any web server, and a smart contract, which is deployed on a private blockchain network comprising Peer-to-Peer (P2P) connected IoT devices acting as full Ethereum nodes. The application emulates a digital transport ticketing system in which the asset is a ticket that the user can purchase and pay for with Ether from their Ethereum account on the blockchain. Once the purchase transaction is mined, it is propagated to all peers. The ticket can then be accessed locally without requesting any centralized system, which makes the system easily accessible and safe thanks to the security, data integrity, and decentralization of blockchain-based systems.
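As a rough sketch of the ticket-purchase flow just described, the following Python snippet uses web3.py to call a payable contract function on a private network node. The contract address, the buyTicket() function, and the price are illustrative assumptions, not details taken from the thesis.

```python
from web3 import Web3

# Minimal, hypothetical ABI for a payable buyTicket() function.
ABI = [{"name": "buyTicket", "type": "function",
        "stateMutability": "payable", "inputs": [], "outputs": []}]

# Connect to a node of the private network (e.g. one of the IoT devices).
w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))

# Placeholder address of the deployed ticketing contract.
contract = w3.eth.contract(
    address="0x0000000000000000000000000000000000000000", abi=ABI)

buyer = w3.eth.accounts[0]

# Pay for the ticket with ether; once the transaction is mined it is
# propagated to every peer in the P2P network.
tx_hash = contract.functions.buyTicket().transact(
    {"from": buyer, "value": w3.to_wei(0.01, "ether")})
receipt = w3.eth.wait_for_transaction_receipt(tx_hash)
print("ticket purchased in block", receipt.blockNumber)
```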
Item Open Access: Erweiterung und Evaluation einer lupenbasierten Technik zur Exploration von Textsammlungen (2017), Assenov, Ivan

In recent years, there has been a sharp increase in the amount of text publicly accessible in digital form. The primary causes are widespread access to the Internet, the popularity of e-mail and social networking websites, and collaborative efforts to preserve and share knowledge. These developments have inspired a wide variety of information visualization techniques that focus on large-scale text data and facilitate its exploration and analysis. One popular approach represents individual documents as glyphs on a 2D surface, with pairwise distances corresponding to semantic similarities. The metaphor of a movable lens that summarizes the contents of the texts underneath it has been proposed as a method of interaction targeted at free exploration tasks. The main goal of this master's thesis is to extend the basic technique by adding labels to the visualization that guide users towards regions of interest more quickly, without negatively impacting the lens's usefulness. In addition, an automatic framework is developed that determines the tool's effectiveness under different parameter settings. Finally, the proposed improvements and the overall technique are evaluated by means of a think-aloud user study.

Item Open Access: Crawling hardware for OpenTOSCA (2017), Choudhury, Pushpam

Heterogeneity is the essence of the IoT paradigm: there is heterogeneity in communication and transport protocols, in network infrastructure, and even among the interacting devices themselves. Managing the discovery of different devices in such a paradigm is an extremely complex task. Typical solutions include an abstraction layer, commonly known as the middleware layer, that handles this complexity for the devices and thereby allows them to interact with one another. One major limitation of existing middleware solutions is the lack of an easily configurable approach for handling the tremendous scale of heterogeneous components in the IoT. The objective of this thesis is to develop such a highly configurable discovery middleware. The proposed approach discovers a variety of heterogeneous devices and services using a multi-level plugin layer consisting of independent plugins that interact with each other according to the pipes-and-filters architectural pattern. To allow for dynamic configuration of the middleware, a discovery configuration is developed. The output of the middleware is a list of devices and their capabilities, accessible via a web interface that can interact with a range of different clients. The proposed approach is validated on a scenario in a real-life environment.
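The pipes-and-filters plugin layer described above can be illustrated with a minimal sketch: each filter consumes a stream of discovered devices and yields an enriched stream. The plugin stages, the Device type, and the stubbed probe results below are hypothetical, not the middleware's actual components.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

@dataclass
class Device:
    address: str
    capabilities: List[str] = field(default_factory=list)

# A filter consumes a stream of devices and yields an (enriched) stream.
Filter = Callable[[Iterable[Device]], Iterable[Device]]

def network_scan(_: Iterable[Device]) -> Iterable[Device]:
    # First stage: emit candidate devices found on the network (stubbed).
    yield Device("192.168.0.23")
    yield Device("192.168.0.42")

def probe_capabilities(devices: Iterable[Device]) -> Iterable[Device]:
    # Second stage: enrich each device with its discovered capabilities.
    for d in devices:
        d.capabilities.append("temperature-sensor")  # stubbed probe result
        yield d

def run_pipeline(filters: List[Filter]) -> List[Device]:
    stream: Iterable[Device] = iter(())
    for f in filters:  # each filter is piped into the next
        stream = f(stream)
    return list(stream)

print(run_pipeline([network_scan, probe_capabilities]))
```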
Item Open Access: Interaktive und inkrementelle Visualisierung im Kontext von Big Data (2017), Ast, Birgit

Constantly growing volumes of data open up many new opportunities for data analysts to gain previously unknown knowledge. However, they also confront humans and technology with new challenges. Due to the sheer size of the data sets, analyses become lengthy, inflexible processes. Incremental methods are one approach to counteract this: during the analysis process, intermediate results are generated step by step, gradually converging towards the final result. In incremental visual data analysis, the evolution of these partial results allows early conclusions about the complete data set, so that analysts can react quickly. For a productive incremental analysis, it is important to obtain representative partial results and to correctly assess their significance. Actively involving the analyst in the visualization process is also important. In this thesis, a concept for an interactive web application for incremental visual data analysis is developed. The necessity of the stated requirements is explained, and options for their practical implementation are described. Based on this, a prototype realizing the concept is developed.

Item Open Access: Software Repositories Mining von Issue Tasks und Coupled File Changes (2017), Alakus, Deniz

Version control systems such as Git are a great help in the development of complex software systems, as are the issue trackers maintained alongside the software repositories. Besides the development history of a software project, these repositories contain further data and patterns that are not obvious at first glance. Software repository mining makes it possible to extract them. In this diploma thesis, a tool was developed and evaluated that extracts and displays coupled changes, i.e. files that are frequently changed together, from software repositories and issue tasks.

Item Open Access: Individual characteristics of successful coding challengers (2017), Wyrich, Marvin

Assessing a software engineer's ability to solve algorithmic programming tasks has been an essential part of technical interviews at some of the most successful technology companies for several years now. Despite the adoption of coding challenges among these companies, we do not know what influences the performance of different software engineers in solving such challenges. We conducted an exploratory study with software engineering students to find hypotheses about the individual characteristics that make a good coding challenge solver. Our findings show that the better coding challengers also have better exam grades and more programming experience. Furthermore, conscientious as well as sad software engineers performed worse in our study.

Item Open Access: Progressive sparse coding for in situ volume visualization (2017), Berian, Gratian

Nowadays, High-Performance Computing (HPC) suffers from an ever-growing gap between computational power, I/O bandwidth, and storage capacity. Typical runs of HPC simulations produce terabytes of data every day, which poses a serious problem when it comes to storing and manipulating such large amounts of data. In this thesis, I present a method for compressing time-dependent volume data using an overcomplete dictionary learned from the input data. The proposed method comprises two steps. In the first step, the dictionary is learned from a number of training examples extracted from the volume to be compressed; this process is iterative, and at each step the dictionary is updated to better sparsely represent the training data. The second step expresses each block of the volume as a sparse linear combination of the dictionary atoms trained over that volume. To assess the performance of the proposed method, several aspects were tested: training speed versus sparsifying speed, compression ratio versus reconstruction error, dictionary reusability across multiple time steps, and how a dictionary performs when used on a different volume than the one it was trained on. Finally, the quality of the reconstructed volume is compared to the original volume and to other lossy compression techniques in order to gain a visual understanding of the reconstruction quality.
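For a sense of how such a two-step scheme looks in practice, here is a minimal sketch of dictionary learning followed by sparse coding using scikit-learn. The block size, number of atoms, and sparsity level are made-up parameters, not the ones used in the thesis.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
blocks = rng.random((1000, 64))  # 1000 training blocks (4x4x4 voxels, flattened)

# Step 1: learn an overcomplete dictionary (128 atoms > 64 dimensions),
# configured to produce sparse codes via orthogonal matching pursuit (OMP).
dico = MiniBatchDictionaryLearning(n_components=128,
                                   transform_algorithm="omp",
                                   transform_n_nonzero_coefs=8,
                                   random_state=0)

# Step 2: express each block as a sparse combination of dictionary atoms.
codes = dico.fit(blocks).transform(blocks)

# Reconstruction: sparse codes times dictionary atoms.
reconstructed = codes @ dico.components_
err = np.linalg.norm(blocks - reconstructed) / np.linalg.norm(blocks)
print(f"relative reconstruction error: {err:.3f}")
```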
Item Open Access: Subspace-optimal data mining on spatially adaptive sparse grids (2017), Luz, Maximilian

Continued improvements in technology lead to an ever-growing amount of data generated, for example, by scientific measurements and simulations. Data mining is required to gain useful knowledge from this data, but it can be challenging, especially due to the size and dimensionality of these problems. The use of regular grids for such applications is often limited by the curse of dimensionality, a phrase describing the exponential dependency of a problem's computational complexity on its dimensionality. For many higher-dimensional problems, e.g. with 28 dimensions, regular grids cannot be used to compute results with the desired accuracy in a reasonable amount of time, even if the memory required to store and process them is available. Spatially adaptive sparse grids overcome this problem, as they lessen the influence of the dimensionality on the size of the grid, and they have been successfully applied to many tasks, including regression on large data sets. However, the currently preferred and in practice highly performant streaming algorithm for regression on spatially adaptive sparse grids performs many unnecessary operations in order to effectively utilize modern parallel computer architectures such as graphics processing units (GPUs). In this thesis, we show that a GPU implementation of the subspace-linear algorithm, which is more promising in terms of computational complexity, is able to outperform the currently preferred streaming algorithm in many scenarios, even though it does not utilize modern architectures as well as the streaming algorithm does. Furthermore, we explore the construction of a new algorithm that combines the streaming and subspace-linear algorithms, aiming to process each subgrid of the grid with whichever algorithm is deemed most efficient for its structure. We evaluated both of our algorithms against the highly optimized implementation of the streaming algorithm provided in the SG++ framework and could indeed show speed-ups for both, depending on the experiment.
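As a back-of-the-envelope illustration of the curse of dimensionality that motivates sparse grids, the following sketch contrasts the interior point count of a regular grid, (2^n - 1)^d, with the standard O(2^n * n^(d-1)) asymptotic growth of a sparse grid. The level n = 5 is an arbitrary choice for illustration.

```python
n = 5  # refinement level

for d in (2, 5, 10, 28):          # 28 dimensions, as mentioned in the abstract
    full = (2**n - 1) ** d        # interior points of a regular grid
    sparse = 2**n * n ** (d - 1)  # asymptotic order for a sparse grid
    print(f"d={d:2d}: full grid ~ {full:.3e}, sparse grid ~ O({sparse:.3e})")
```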
Item Open Access: Causal models for decision making via integrative inference (2017), Geiger, Philipp; Toussaint, Marc (Prof. Dr.)

Understanding causes and effects is important in many parts of life, especially when decisions have to be made. The systematic inference of causal models remains a challenge, though. In this thesis, we study (1) "approximative" and "integrative" inference of causal models and (2) causal models as a basis for decision making in complex systems. By "integrative" we mean including and combining settings and knowledge beyond the outcome of perfect randomization or pure observation for causal inference, while "approximative" means that the causal model is only constrained but not uniquely identified. As a basis for the study of topics (1) and (2), which are closely related, we first introduce causal models, discuss the meaning of causation, and embed the notion of causation into a broader context of other fundamental concepts. Then we begin our main investigation with a focus on topic (1): we consider the problem of causal inference from a non-experimental multivariate time series X, that is, we integrate temporal knowledge. We take the following approach: we assume that X, together with some potential hidden common cause ("confounder") Z, forms a first-order vector autoregressive (VAR) process with structural transition matrix A. We then examine under which conditions the most important parts of A are identifiable or approximately identifiable from X alone, in spite of the effects of Z. Essentially, sufficient conditions are (a) non-Gaussian, independent noise, or (b) no influence from X to Z. We present two estimation algorithms tailored towards conditions (a) and (b), respectively, evaluate them on synthetic and real-world data, and discuss how to check the model using X. Still focusing on topic (1) but already including elements of topic (2), we consider the problem of approximate inference of the causal effect of a variable X on a variable Y in i.i.d. settings "between" randomized experiments and observational studies. Our approach is to first derive approximations (upper and lower bounds) on the causal effect, in dependence on bounds on the (hidden) confounding. We then discuss several scenarios in which knowledge or beliefs can be integrated that in fact imply bounds on the confounding; one example concerns decision making in advertising, where knowledge about partial compliance with guidelines can be integrated. Then, concentrating on topic (2), we study decision-making problems that arise in cloud computing, a computing paradigm and business model that involves complex technical and economic systems and interactions. More specifically, we consider the following two problems: debugging and control of computing systems with the help of sandbox experiments, and prediction of the cost of "spot" resources for decision making of cloud clients. We first establish two theoretical results on approximate counterfactuals and approximate integration of causal knowledge, which we then apply to the two problems in toy scenarios.
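To make the VAR setting concrete, the following toy sketch simulates X together with a hidden confounder Z under condition (b) (no influence from X to Z) and fits X alone by ordinary least squares, showing how the naive estimate of the X-to-X block of A is distorted by the hidden Z. All dimensions and coefficients are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5000
A = np.array([[0.5, 0.0, 0.4],   # x1 <- x1, x2, z
              [0.3, 0.4, 0.0],   # x2 <- x1, x2, z
              [0.0, 0.0, 0.8]])  # z  <- z only (condition (b): no X -> Z)

s = np.zeros((T, 3))             # state = (x1, x2, z)
for t in range(1, T):
    s[t] = A @ s[t - 1] + rng.standard_normal(3)

X = s[:, :2]                     # only X is observed; z stays hidden

# Naive OLS fit of X_t on X_{t-1}: estimates the X->X block of A, but is
# biased where the hidden, autocorrelated z feeds into X.
A_hat, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)
print("true X->X block:\n", A[:2, :2])
print("naive estimate:\n", A_hat.T)
```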