05 Fakultät Informatik, Elektrotechnik und Informationstechnik
Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6
Search Results (25 results)
Item Open Access: Efficient exploratory clustering analyses in large-scale exploration processes (2021). Fritz, Manuel; Behringer, Michael; Tschechlov, Dennis; Schwarz, Holger.
Clustering is a fundamental primitive in many applications. To achieve valuable results in exploratory clustering analyses, the parameters of the clustering algorithm must be set appropriately, which is a major pitfall. We observe multiple challenges for large-scale exploration processes. On the one hand, they require specific methods to efficiently explore large parameter search spaces. On the other hand, they often exhibit long runtimes, in particular when large datasets are analyzed using clustering algorithms with super-polynomial runtimes, which must be executed repeatedly within exploratory clustering analyses. We address these challenges as follows: First, we present LOG-Means and show that it provides estimates for the number of clusters in sublinear time with respect to the defined search space, i.e., it provably requires fewer executions of a clustering algorithm than existing methods. Second, we demonstrate how to exploit fundamental characteristics of exploratory clustering analyses to significantly accelerate the (repetitive) execution of clustering algorithms on large datasets. Third, we show how both challenges can be tackled at the same time. To the best of our knowledge, this is the first work to address the above-mentioned challenges simultaneously. In our comprehensive evaluation, we show that our proposed methods significantly outperform state-of-the-art methods, especially supporting novice analysts in exploratory clustering analyses within large-scale exploration processes.

Item Open Access: Enhancing quasi-Newton acceleration for fluid-structure interaction (2022). Davis, Kyle; Schulte, Miriam; Uekermann, Benjamin.
We propose two enhancements of quasi-Newton methods used to accelerate coupling iterations for partitioned fluid-structure interaction.
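The sublinear search-space exploration mentioned in the clustering abstract above can be illustrated with a simplified, hypothetical sketch (this is not the published LOG-Means algorithm): starting from the search-space boundaries, the interval whose endpoints show the steepest clustering-error drop is bisected repeatedly, so only a logarithmic number of candidate values of k is ever evaluated. The error function `clustering_error` is a stand-in for an actual clustering run.

```python
def estimate_k(clustering_error, k_min, k_max):
    """Bisection-style elbow search over the number of clusters k.

    Repeatedly splits the interval whose endpoints exhibit the largest
    error ratio; only O(log(k_max - k_min)) evaluations are needed
    instead of one clustering run per candidate k.
    """
    cache = {}

    def err(k):
        if k not in cache:
            cache[k] = clustering_error(k)
        return cache[k]

    ks = [k_min, k_max]
    while True:
        # adjacent pair of evaluated k's with the steepest error drop
        i = max(range(len(ks) - 1), key=lambda j: err(ks[j]) / err(ks[j + 1]))
        lo, hi = ks[i], ks[i + 1]
        if hi - lo <= 1:
            return hi, len(cache)  # estimated k, number of clustering runs
        ks.insert(i + 1, (lo + hi) // 2)


# Synthetic error curve with a sharp "elbow" at k = 7.
def toy_error(k):
    return (100.0 if k < 7 else 10.0) - 0.1 * k

k_est, runs = estimate_k(toy_error, 1, 20)
```

With this toy error curve, the search locates the elbow at k = 7 after evaluating only 6 of the 20 candidate values.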
Quasi-Newton methods have been established as flexible, yet robust, efficient, and accurate coupling methods for multi-physics simulations in general. The coupling library preCICE provides several variants, of which the so-called IQN-ILS method is the most commonly used. It approximates Newton iterations using input and output differences of the coupled solvers collected in previous iterations and time steps. To make quasi-Newton methods applicable to parallel coupling (where these differences contain data from different physical fields) and to provide a robust approach for reusing information, a combination of information filtering and scaling of the different physical fields is typically required. This leads to good convergence, but increases the cost per iteration. We propose two new approaches, pre-scaling weight monitoring and a new, so-called QR3 filter, to substantially improve runtime without affecting convergence quality. We evaluate both on a variety of fluid-structure interaction examples. Results show drastic speedups for the pure quasi-Newton update steps. In the future, we intend to apply the methods to volume-coupled scenarios as well, where these gains can be decisive for the feasibility of the coupling approach.

Item Open Access: Editorial - special issue on security and privacy in blockchains and the IoT, volume II (2023). Stach, Christoph; Gritti, Clémentine.

Item Open Access: SMARTEN : a sample-based approach towards privacy-friendly data refinement (2022). Stach, Christoph; Behringer, Michael; Bräcker, Julia; Gritti, Clémentine; Mitschang, Bernhard.
Two factors are crucial for the effective operation of modern smart services: First, IoT-enabled technologies have to capture and combine huge amounts of data on data subjects. Then, all these data have to be processed exhaustively by means of big data analytics techniques.
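The interface quasi-Newton idea from the fluid-structure interaction abstract above, using differences of solver inputs and outputs collected in previous iterations to approximate Newton steps, can be illustrated with a minimal acceleration loop for a generic fixed-point problem x = H(x). This is a simplified sketch in the spirit of IQN-ILS, not the preCICE implementation, and the linear test problem `H` is invented for illustration.

```python
import numpy as np

def quasi_newton_fixed_point(H, x0, max_iter=30, tol=1e-9):
    """Accelerate the fixed-point iteration x = H(x) using least-squares
    combinations of previous input/output differences (IQN-ILS-style sketch).
    """
    x = np.asarray(x0, dtype=float)
    x_tilde = H(x)
    r = x_tilde - x                    # residual of the coupling iteration
    V, W = [], []                      # residual / output difference columns
    r_prev, xt_prev = r, x_tilde
    x = x_tilde                        # first step: plain fixed-point update
    for _ in range(max_iter):
        x_tilde = H(x)
        r = x_tilde - x
        if np.linalg.norm(r) < tol:
            break
        V.append(r - r_prev)
        W.append(x_tilde - xt_prev)
        r_prev, xt_prev = r, x_tilde
        # least-squares coefficients alpha with V @ alpha ~= -r
        alpha, *_ = np.linalg.lstsq(np.column_stack(V), -r, rcond=None)
        x = x_tilde + np.column_stack(W) @ alpha
    return x


# Contractive linear test problem: fixed point of H(x) = A x + b.
A = np.array([[0.4, 0.1], [0.1, 0.3]])
b = np.array([1.0, 1.0])
x_star = quasi_newton_fixed_point(lambda x: A @ x + b, np.zeros(2))
```

For this contractive test problem the accelerated iteration recovers the exact fixed point, i.e., the solution of (I - A) x = b, to high accuracy.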
With regard to the latter, thorough data refinement in terms of data cleansing and data transformation is the decisive cornerstone. Studies show that data refinement reaches its full potential only when domain experts are involved in the process. However, this means that these experts need full insight into the data in order to identify and resolve any issues therein, e.g., by correcting or removing inaccurate, incorrect, or irrelevant data records. For sensitive data (e.g., private or confidential data) in particular, this poses a problem, since the data are thereby disclosed to third parties such as domain experts. To address this, we introduce SMARTEN, a sample-based approach towards privacy-friendly data refinement that smartens up big data analytics and smart services. SMARTEN applies a revised data refinement process that fully involves domain experts in data pre-processing without exposing any sensitive data to them or any other third party. To achieve this, domain experts obtain a representative sample of the entire data set that meets all privacy policies and confidentiality guidelines. Based on this sample, domain experts define data cleaning and transformation steps. Subsequently, these steps are converted into executable data refinement rules and applied to the entire data set. Domain experts can request further samples and define further rules until the data quality required for the intended use case is reached. Evaluation results confirm that our approach is effective in terms of both data quality and data privacy.

Item Open Access: Models for Internet of Things environments : a survey (2020). Franco da Silva, Ana Cristina; Hirmer, Pascal.
Today, the Internet of Things (IoT) is an emerging topic in research and industry. Famous examples of IoT applications are smart homes, smart cities, and smart factories.
Through highly interconnected devices equipped with sensors and actuators, context-aware approaches can be developed to enable, e.g., monitoring and self-organization. To achieve context-awareness, a large number of environment models have been developed for the IoT that contain information about the devices of an environment, their attached sensors and actuators, as well as their interconnections. However, these models differ greatly in their content, the format used (for example, ontologies or relational models), and the domain to which they are applied. In this article, we present a comparative survey of models for IoT environments. We describe and compare the selected models based on an in-depth literature study. The result is a comparative overview of existing state-of-the-art IoT environment models.

Item Open Access: Solving high-dimensional dynamic portfolio choice models with hierarchical B-splines on sparse grids (2021). Schober, Peter; Valentin, Julian; Pflüger, Dirk.
Discrete-time dynamic programming for solving dynamic portfolio choice models has three inherent issues: First, the curse of dimensionality prohibits more than a handful of continuous states. Second, in higher dimensions, even regular sparse grid discretizations need too many grid points for sufficiently accurate approximations of the value function. Third, the models usually require continuous control variables, and hence gradient-based optimization with smooth approximations of the value function is necessary to obtain accurate solutions to the optimization problem. For the first time, we enable accurate and fast numerical solutions with gradient-based optimization while still allowing for spatial adaptivity, using hierarchical B-splines on sparse grids.
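The hierarchical basis underlying the sparse-grid abstract above can be illustrated in one dimension with piecewise-linear hat functions instead of B-splines (a deliberately simplified stand-in; the paper's hierarchical B-splines generalize this construction to smoother bases and to sparse grids in higher dimensions):

```python
def hierarchical_surpluses(f, max_level):
    """Hierarchical coefficients for the 1-D piecewise-linear basis on [0, 1].

    The surplus at grid point x = i / 2**l (i odd) is f(x) minus the linear
    interpolant between the point's two hierarchical ancestors.
    """
    alpha = {}
    for level in range(1, max_level + 1):
        h = 1.0 / 2 ** level
        for i in range(1, 2 ** level, 2):  # odd indices only
            x = i * h
            alpha[(level, i)] = f(x) - 0.5 * (f(x - h) + f(x + h))
    return alpha


def evaluate(alpha, f0, f1, x):
    """Evaluate the hierarchical interpolant at x (boundary values f0, f1)."""
    value = f0 * (1.0 - x) + f1 * x  # linear boundary contribution
    for (level, i), coeff in alpha.items():
        hat = max(0.0, 1.0 - abs(2 ** level * x - i))  # hat basis function
        value += coeff * hat
    return value


# Interpolate f(x) = x^2 up to level 3; the interpolant is exact at grid points.
f = lambda x: x * x
alpha = hierarchical_surpluses(f, 3)
y = evaluate(alpha, f(0.0), f(1.0), 0.375)
```

Only points whose surplus is significant need to be kept, which is what makes spatially adaptive refinement natural in this representation.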
When compared to standard linear bases on sparse grids or finite difference approximations of the gradient, our approach saves an order of magnitude in total computational complexity for a representative dynamic portfolio choice model with varying state space dimensionality, stochastic sample space, and choice variables.

Item Open Access: Availability analysis of redundant and replicated cloud services with Bayesian networks (2023). Bibartiu, Otto; Dürr, Frank; Rothermel, Kurt; Ottenwälder, Beate; Grau, Andreas.
Due to the growing complexity of modern data centers, failures are no longer uncommon. Therefore, fault tolerance mechanisms play a vital role in fulfilling availability requirements. Multiple availability models have been proposed to assess compute systems, among which Bayesian network models have gained popularity in industry and research due to their powerful modeling formalism. In particular, this work focuses on assessing the availability of redundant and replicated cloud computing services with Bayesian networks. So far, research on availability has focused on modeling either infrastructure or communication failures in Bayesian networks, but not both simultaneously. This work addresses practical modeling challenges of assessing the availability of large-scale redundant and replicated services with Bayesian networks, including cascading and common-cause failures from the surrounding infrastructure and communication network. To ease the modeling task, this paper introduces a high-level modeling formalism to build such a Bayesian network automatically. Performance evaluations demonstrate the feasibility of the presented Bayesian network approach for assessing the availability of large-scale redundant and replicated services.
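The kind of availability question addressed in the Bayesian-network abstract above can be made concrete with a tiny, hand-constructed example (the numbers and the k-out-of-n structure are invented for illustration, not taken from the paper): three replicas all depend on a shared network, a common-cause failure point, and the service is up if at least two replicas are up. Exact inference in a model this small is simply brute-force enumeration over the node states:

```python
from itertools import product

def k_of_n_availability(p_net, p_replica_up, k, n):
    """P(at least k of n replicas up), where each replica additionally
    requires the shared network (a common-cause dependency) to be up."""
    total = 0.0
    for net_up in (True, False):
        p_net_state = p_net if net_up else 1.0 - p_net
        p_rep = p_replica_up if net_up else 0.0  # replicas need the network
        for states in product((True, False), repeat=n):
            p_joint = p_net_state
            for up in states:
                p_joint *= p_rep if up else 1.0 - p_rep
            if sum(states) >= k:
                total += p_joint
    return total


availability = k_of_n_availability(p_net=0.999, p_replica_up=0.99, k=2, n=3)
```

Conditioning on the shared network first is exactly the factorization a Bayesian network encodes; note that the common-cause node caps the service availability at the network's own availability (here 99.9%), no matter how many replicas are added.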
This model is not only applicable in the domain of cloud computing; it can also be applied to general cases of local and geo-distributed systems.

Item Open Access: Data is the new oil - sort of : a view on why this comparison is misleading and its implications for modern data administration (2023). Stach, Christoph.
Currently, data are often referred to as the oil of the 21st century. This comparison is not only used to express that the resource data are just as important for the fourth industrial revolution as oil was for the technological revolution in the late 19th century. There are also further similarities between these two valuable resources in terms of their handling. Both must first be discovered and extracted from their sources. Then, the raw materials must be cleaned, preprocessed, and stored before they can finally be delivered to consumers. Despite these undeniable similarities, however, there are significant differences between oil and data in all of these processing steps, making data a resource that is considerably more challenging to handle. For instance, data sources, as well as the data themselves, are heterogeneous, which means there is no one-size-fits-all data acquisition solution. Furthermore, data can be distorted by the source or by third parties without being noticed, which affects both quality and usability. Unlike oil, there is also no uniform refinement process for data, as data preparation should be tailored to the subsequent consumers and their intended use cases. With regard to storage, it has to be taken into account that data are not consumed when they are processed or delivered to consumers, which means that the data volume that has to be managed is constantly growing. Finally, data may be subject to special constraints in terms of distribution, which may entail individual delivery plans depending on the customer and their intended purposes.
Overall, it can be concluded that innovative approaches are needed for handling the resource data in ways that address these inherent challenges. In this paper, we therefore study and discuss the characteristics that make data such a challenging resource to handle. To enable appropriate data provisioning, we introduce a holistic research concept from data source to data sink that respects the processing requirements of data producers as well as the quality requirements of data consumers and, moreover, ensures trustworthy data administration.

Item Open Access: Effective or predatory funding? : evaluating the hidden costs of grant applications (2022). Dresler, Martin; Buddeberg, Eva; Endesfelder, Ulrike; Haaker, Jan; Hof, Christian; Kretschmer, Robert; Pflüger, Dirk; Schmidt, Fabian.
Researchers are spending an increasing fraction of their time applying for funding; however, the current funding system has considerable deficiencies in reliably evaluating the merit of research proposals, despite extensive efforts on the part of applicants, grant reviewers, and decision committees. For some funding schemes, the systemic costs of the application process as a whole can even outweigh the granted resources - a phenomenon that could be considered predatory funding. We present five recommendations to remedy this unsatisfactory situation.

Item Open Access: Metrics and algorithms for locally fair and accurate classifications using ensembles (2022). Lässig, Nico; Oppold, Sarah; Herschel, Melanie.
To obtain accurate predictions, model ensembles comprising multiple trained machine learning models are nowadays used. In particular, dynamic model ensembles pick the most accurate model for each query object by applying the model that performed best on similar data. Like single machine learning models, however, dynamic model ensembles may suffer from bias, which can eventually lead to unfair treatment of certain groups of a general population.
To mitigate unfair classification, recent work has proposed fair model ensembles that, instead of focusing (solely) on accuracy, also optimize global fairness. While such ensembles minimize bias globally, imbalances may persist in different regions of the data, e.g., caused by local bias maxima leading to local unfairness. Therefore, we extend our previous work with a framework that bridges the gap between dynamic model ensembles and fair model ensembles. More precisely, we investigate the problem of devising locally fair and accurate dynamic model ensembles, which ultimately optimize for equal opportunity of similar subjects. We propose a general framework to perform this task and present several algorithms implementing the framework components. In this paper, we also present a runtime-efficient adaptation of the framework that keeps the quality of the results at a similar level. Furthermore, we present new fairness metrics as well as details on the necessary data preparation. Our evaluation of the framework implementations and metrics shows that our approach outperforms the state of the art for different types and degrees of bias present in training data in terms of both local and global fairness, while reaching comparable accuracy.
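The core mechanism of a dynamic model ensemble described in the abstract above, picking per query the model that performed best on similar data, can be sketched as follows (a minimal, hypothetical 1-D nearest-neighbor variant; the paper's framework additionally optimizes local fairness, which this sketch omits):

```python
def dynamic_ensemble_predict(models, val_x, val_y, query, k=3):
    """Predict with the model that is most accurate on the k validation
    points closest to the query (1-D features for simplicity)."""
    nearest = sorted(range(len(val_x)), key=lambda i: abs(val_x[i] - query))[:k]
    best_model = max(
        models,
        key=lambda m: sum(m(val_x[i]) == val_y[i] for i in nearest),
    )
    return best_model(query)


# Two deliberately crude constant classifiers and a labeled validation set:
# negative inputs belong to class 0, positive inputs to class 1.
always_zero = lambda x: 0
always_one = lambda x: 1
val_x = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
val_y = [0, 0, 0, 1, 1, 1]

pred_neg = dynamic_ensemble_predict([always_zero, always_one], val_x, val_y, -1.5)
pred_pos = dynamic_ensemble_predict([always_zero, always_one], val_x, val_y, 1.5)
```

Neither constant model is accurate globally, yet the dynamic ensemble classifies both queries correctly because it defers to whichever model dominates in the query's neighborhood.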