Universität Stuttgart
Permanent URI for this community: https://elib.uni-stuttgart.de/handle/11682/1
6 results
Search Results
Item Open Access
Visual analysis of large‐scale protein‐ligand interaction data (2021)
Schatz, Karsten; Franco‐Moreno, Juan José; Schäfer, Marco; Rose, Alexander S.; Ferrario, Valerio; Pleiss, Jürgen; Vázquez, Pere‐Pau; Ertl, Thomas; Krone, Michael
When studying protein‐ligand interactions, many different factors can influence the behaviour of the protein as well as the ligands. Molecular visualisation tools typically concentrate on the movement of single ligand molecules; however, viewing only one molecule can merely provide a hint of the overall behaviour of the system. To tackle this issue, we do not focus on the visualisation of the local actions of individual ligand molecules but on the influence of a protein and their overall movement. Since the simulations required to study these problems can have millions of time steps, our presented system decouples visualisation and data preprocessing: our preprocessing pipeline aggregates the movement of ligand molecules relative to a receptor protein. For data analysis, we present a web‐based visualisation application that combines multiple linked 2D and 3D views that display the previously calculated data. The central view, a novel enhanced sequence diagram that shows the calculated values, is linked to a traditional surface visualisation of the protein. This results in an interactive visualisation that is independent of the size of the underlying data, since the memory footprint of the aggregated data for visualisation is constant and very low, even if the raw input consisted of several terabytes.

Item Open Access
SMARTEN : a sample-based approach towards privacy-friendly data refinement (2022)
Stach, Christoph; Behringer, Michael; Bräcker, Julia; Gritti, Clémentine; Mitschang, Bernhard
Two factors are crucial for the effective operation of modern-day smart services: Initially, IoT-enabled technologies have to capture and combine huge amounts of data on data subjects.
Then, all these data have to be processed exhaustively by means of techniques from the area of big data analytics. With regard to the latter, thorough data refinement in terms of data cleansing and data transformation is the decisive cornerstone. Studies show that data refinement reaches its full potential only by involving domain experts in the process. However, this means that these experts need full insight into the data in order to be able to identify and resolve any issues therein, e.g., by correcting or removing inaccurate, incorrect, or irrelevant data records. In particular for sensitive data (e.g., private data or confidential data), this poses a problem, since these data are thereby disclosed to third parties such as domain experts. To this end, we introduce SMARTEN, a sample-based approach towards privacy-friendly data refinement to smarten up big data analytics and smart services. SMARTEN applies a revised data refinement process that fully involves domain experts in data pre-processing but does not expose any sensitive data to them or any other third party. To achieve this, domain experts obtain a representative sample of the entire data set that meets all privacy policies and confidentiality guidelines. Based on this sample, domain experts define data cleaning and transformation steps. Subsequently, these steps are converted into executable data refinement rules and applied to the entire data set. Domain experts can request further samples and define further rules until the data quality required for the intended use case is reached.
Evaluation results confirm that our approach is effective in terms of both data quality and data privacy.

Item Open Access
EnzymeML : a data exchange format for biocatalysis and enzymology (2021)
Range, Jan; Halupczok, Colin; Lohmann, Jens; Swainston, Neil; Kettner, Carsten; Bergmann, Frank T.; Weidemann, Andreas; Wittig, Ulrike; Schnell, Santiago; Pleiss, Jürgen
EnzymeML is an XML‐based data exchange format that supports the comprehensive documentation of enzymatic data by describing reaction conditions, time courses of substrate and product concentrations, the kinetic model, and the estimated kinetic constants. EnzymeML is based on the Systems Biology Markup Language, which was extended by implementing the STRENDA Guidelines. An EnzymeML document serves as a container to transfer data between experimental platforms, modeling tools, and databases. EnzymeML supports the scientific community by introducing a standardized data exchange format to make enzymatic data findable, accessible, interoperable, and reusable according to the FAIR data principles. An application programming interface in Python supports the integration of software tools for data acquisition, data analysis, and publication. The feasibility of a seamless data flow using EnzymeML is demonstrated by creating an EnzymeML document from a structured spreadsheet or from a STRENDA DB database entry, by kinetic modeling using the modeling platform COPASI, and by uploading to the enzymatic reaction kinetics database SABIO‐RK.

Item Open Access
Protecting sensitive data in the information age : state of the art and future prospects (2022)
Stach, Christoph; Gritti, Clémentine; Bräcker, Julia; Behringer, Michael; Mitschang, Bernhard
The present information age is characterized by an ever-increasing digitalization. Smart devices quantify our entire lives. These collected data provide the foundation for data-driven services called smart services.
They are able to adapt to a given context and thus tailor their functionalities to the user’s needs. It is therefore not surprising that their main resource, namely data, is nowadays a valuable commodity that can also be traded. However, this trend does not only have positive sides, as the gathered data reveal a lot of information about various data subjects. To prevent uncontrolled insights into private or confidential matters, data protection laws restrict the processing of sensitive data. One key factor in this regard is user-friendly privacy mechanisms. In this paper, we therefore assess current state-of-the-art privacy mechanisms. To this end, we initially identify forms of data processing applied by smart services. We then discuss privacy mechanisms suited for these use cases. Our findings reveal that current state-of-the-art privacy mechanisms provide good protection in principle, but there is no compelling one-size-fits-all privacy approach. This leads to further questions regarding the practicality of these mechanisms, which we present in the form of seven thought-provoking propositions.

Item Open Access
MetaConfigurator : a user-friendly tool for editing structured data files (2024)
Neubauer, Felix; Bredl, Paul; Xu, Minye; Patel, Keyuriben; Pleiss, Jürgen; Uekermann, Benjamin
Textual formats such as JSON, XML, and YAML are widely used for structuring data in various domains, from configuration files to research data. However, manually editing data in these formats can be complex and time-consuming. Graphical user interfaces (GUIs) can significantly reduce manual efforts and assist the user in editing the files, but developing a file-format-specific GUI requires substantial development and maintenance efforts. To address this challenge, we introduce MetaConfigurator: an open-source web application that generates its GUI depending on a given schema.
Our approach differs from other schema-to-UI approaches in three key ways: 1) It offers a unified view that combines the benefits of both GUIs and text editors, 2) it enables schema editing within the same tool, and 3) it supports advanced schema features, including conditions and constraints. In this paper, we discuss the design and implementation of MetaConfigurator, backed by insights from a small-scale qualitative user study. The results indicate the effectiveness of our approach in retrieving information from data and schemas and in editing them.

Item Open Access
Research data management in simulation science : infrastructure, tools, and applications (2024)
Flemisch, Bernd; Hermann, Sibylle; Herschel, Melanie; Pflüger, Dirk; Pleiss, Jürgen; Range, Jan; Roy, Sarbani; Takamoto, Makoto; Uekermann, Benjamin
Research Data Management (RDM) has gained significant traction in recent years, being essential to allowing research data to be, e.g., findable, accessible, interoperable, and reusable (FAIR), thereby fostering collaboration or accelerating scientific findings. We present solutions for RDM developed within the DFG-funded Cluster of Excellence EXC2075 Data-Integrated Simulation Science (SimTech). After an introduction to the scientific context and challenges faced by simulation scientists, we outline the general data management infrastructure and present tools that address these challenges. Exemplary domain applications demonstrate the use and benefits of the proposed data management software solutions. These are complemented by additional measures for enablement and dissemination to foster the adoption of these techniques.