Browsing by Author "Mitschang, Bernhard (Prof. Dr.-Ing. habil.)"

Open Access
Adaptive und wandlungsfähige IT-Architektur für Produktionsunternehmen
(2014) Silcher, Stefan; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
The challenges that manufacturing companies face today are continuously increasing. They include, in particular, globalization, growing complexity, and today's prevailing turbulent environment [Jovane 2009]. Globalization forces every company to face competition and the diverse challenges of different markets. The increasing complexity is caused not only by a growing number of product variants; it is also rising continuously at the process level. These problems are aggravated by the turbulent environment, in which internal and external influences act on manufacturing companies and create a continuous need for adaptation [Westkaemper 2007]. Manufacturing companies increasingly address these challenges by means of information technology (IT). However, the multitude of software systems and their often proprietary integration quickly lead to a complex IT landscape whose maintenance effort rises continuously. In addition, both the software applications and their integration are inflexible [Kirchner 2003], so changes and extensions can only be carried out with great effort. The processes implemented in the applications thus become rigid as well and cannot be adapted quickly enough. Moreover, integration solutions are largely confined to a single domain and enable neither enterprise-wide data exchange nor process definitions that span domains and applications. For these reasons, a new IT architecture for manufacturing companies is needed that supports the adaptivity of the applications and their integration as well as of the processes. This thesis describes such an adaptive and changeable IT architecture (ACITA) for manufacturing companies [Silcher 2011]. Its initial application domain is the product life cycle, or Product Lifecycle Management (PLM), but it can be extended to further domains relatively easily. Uniform, standardized web service interfaces are used to integrate the applications. The services are coupled loosely, and thus flexibly, via a customized Enterprise Service Bus (ESB). Processes are supported by flexibly composing services into workflows that can support the business processes. With this approach, each domain is integrated separately via its own customized ESB. This makes it possible to take the technical requirements of the domain into account, which yields a more capable IT environment. The individual domain-specific ESBs are in turn integrated via a further ESB, the so-called PLM-Bus. It makes the IT architecture changeable, because phase-specific ESBs can easily be added or removed. Implementing the ACITA requires a number of different components. The applications to be integrated need service interfaces through which their functionality or data can be accessed. These service interfaces are managed by several service registries whose arrangement corresponds to the hierarchy of the ACITA [Silcher 2013a]. The loose coupling of the services is achieved by content-based routers (CBR) implemented in each ESB. Independence from the proprietary data formats of the integrated applications is ensured by using uniform message exchange formats in each phase. Translation services are needed to transfer messages between different phases. The ACITA was implemented as a prototype in the Lernfabrik aIE (learning factory), which consists of a digital learning island and a physical model factory [Riffelmacher 2007]. For the integration, the two domain-specific integration environments, the Production-planning Service Bus (PPSB) and the Manufacturing Service Bus (MSB), were connected via the PLM-Bus to demonstrate seamless data exchange between the corresponding phases [Silcher 2013a]. The ACITA is evaluated in four application scenarios and compared with other integration solutions for the product life cycle on the basis of six criteria. In a globally positioned company, the ACITA can ensure data exchange between all sites by integrating distributed applications and services. The end-to-end use of software systems combined with flexible process support makes the growing complexity of products and processes easier to manage. The continuous adaptation required by the turbulent environment becomes easier to carry out thanks to the adaptivity and changeability of the IT architecture. Companies are thus well prepared for future challenges.
Open Access
Änderungspropagation für autonome und heterogene Informationssysteme
(2011) Heinkel, Uwe; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
Today, companies must adapt quickly to new situations. The reasons are manifold: customer requirements change, competitors develop new products or strategies, or new laws are passed. A company's ability to adapt is referred to as changeability (Wandlungsfähigkeit). To achieve this changeability, companies must consist of units that are largely autonomous. Autonomy allows decisions to be made quickly because each unit can react independently. In the Collaborative Research Center (Sonderforschungsbereich) 467 "Wandlungsfähige Unternehmensstrukturen für die variantenreiche Serienproduktion", within which this work was carried out, these organizational units were called performance units (Leistungseinheiten). Among other things, performance units need support from information systems that provide and manage information. For the performance units to adapt to new situations, their information systems must also remain as autonomous as possible. Nevertheless, the exchange of data between the information systems must be guaranteed, because some data are used by many different performance units and their information systems. This is particularly evident for customer data, which are often needed in many business areas and information systems. Data that are needed and stored by several information systems often exist redundantly within the company and mostly in heterogeneous form. If redundant data are changed in one information system, an inconsistent state arises because the old data are still stored elsewhere. To prevent this inconsistency, the information systems must be integrated and the redundant data synchronized. Replicated databases have a similar problem: data must be synchronized there as well. In that case, however, the data are mostly homogeneous and the participating DBMSs are not autonomous. Furthermore, replicated databases change their data only through the interface provided at the data layer, whereas in an information system data should be changed at the application layer, because that is where the application logic resides and where important consistency rules often have to be checked. To meet these requirements, this work designed and developed an XML-based data integration system that uses change propagation to synchronize redundant data of business objects. A business object consists of one or more implementation objects; for example, a customer order has an order header and several order items. Changes that have occurred are propagated in a so-called change description, one per business object change, which contains all important data about the change. Particularly important are the two states of the business object, before and after the change, and the change type (create, update, delete) of the business object. Using two states makes it possible to detect the change types of the implementation objects and to determine change deltas within the integration system. Change descriptions are propagated along defined dependencies that lead from one source system to several target systems. To make these dependencies flexible, an XML-based language named XML Propagation Definition Language (XPDL) was developed. In addition, an XPath-based language (Propagation Condition Language, PCL) was designed that allows conditions spanning both states in order to define filters for dependencies. Particularly important properties of a data integration system are preserving the order of changes and detecting change conflicts. Both were implemented in this work. For detecting change conflicts, a state-based method was developed that allows fine-grained detection. XPDL and PCL allow a largely abstract description of change propagations. This makes it possible to support quite different information systems and also to incorporate third-party systems that provide additional data.
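As an illustration of the two-state idea in the abstract above, the following minimal sketch derives change types and field-level deltas for the implementation objects of one business object from its state before and after a change. It is not the thesis's implementation, which is XML-based and uses XPDL and PCL; the dictionary layout and the names `classify_changes` and `field_delta` are assumptions made only for this example.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical representation: implementation-object id -> its field values.
State = Dict[str, Dict[str, Any]]


def classify_changes(before: Optional[State], after: Optional[State]) -> Dict[str, str]:
    """Derive create/update/delete per implementation object from two states."""
    before = before or {}
    after = after or {}
    changes: Dict[str, str] = {}
    for key in sorted(before.keys() | after.keys()):
        if key not in before:
            changes[key] = "create"      # present only after the change
        elif key not in after:
            changes[key] = "delete"      # present only before the change
        elif before[key] != after[key]:
            changes[key] = "update"      # present in both, but values differ
    return changes


def field_delta(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return (old, new) pairs for every field whose value differs."""
    return {
        f: (old.get(f), new.get(f))
        for f in sorted(old.keys() | new.keys())
        if old.get(f) != new.get(f)
    }


# A customer order with an order header and order items, as in the abstract.
before = {"head": {"customer": "A"}, "pos1": {"qty": 1}, "pos2": {"qty": 5}}
after = {"head": {"customer": "A"}, "pos1": {"qty": 2}, "pos3": {"qty": 7}}
changes = classify_changes(before, after)
delta = field_delta(before["pos1"], after["pos1"])
```

Unchanged implementation objects, such as the order header here, do not appear in the result, so only real changes would need to be propagated.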
Open Access
Ansätze für flexible und fehlertolerante modellgetriebene IoT-Anwendungen in dynamischen Umgebungen
(2024) Del Gaudio, Daniel; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
Open Access
Aspekte des Change Management in großen koordinierten Systemverbünden
(2019) Königsberger, Jan; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
This thesis examines various aspects of change projects in large system networks within service-oriented architectures (SOA). The corresponding change activities and processes are subsumed under the term change management and form part of SOA governance. SOA governance defines processes and policies for steering and monitoring an SOA. Change processes must be carried out for every change to a component of an SOA system network, such as a service or its interfaces. It is therefore essential that such processes are clearly documented and kept as lean as possible. Because of the complexity of a system network, support of the change processes by specialized software applications is indispensable for carrying out change projects efficiently. The overarching goal of this thesis is therefore to develop methods and techniques that support the change processes and the governance of service-oriented architectures. The thesis makes several contributions toward this goal. First, it presents ways to simplify the integration of new service consumers into an SOA. For this purpose, the concept of Business Objects plus was developed. It aims to unify frequently used data objects across domain boundaries, which simplifies connecting consumers. A further contribution in this area is the REST-to-SOAP-Middleware Architecture. It makes existing, classical SOAP-based web services available in use cases that rely on the more lightweight REST architectural paradigm. Modern technologies can open up new possibilities in the development of software systems. Specifically, this thesis investigates how semantic technologies can be used in developing an SOA governance information system that is intended to enable the stakeholders of an SOA to carry out their tasks efficiently. Another important topic is software and interface testing as part of a change process. In a system network in particular, a large number of dependencies between systems and services must be taken into account. This thesis provides a method for assessing the risk of changes, which enables targeted and resource-efficient testing. To support test planning and execution, a further contribution is a concept for automatically generating and optimizing test schedules; it takes into account existing dependencies and constraints that would make creating such a schedule manually complex. The methods and concepts developed in this thesis were implemented in a prototype of an SOA governance information system, the SOA Governance Repository, which is also presented.
Open Access
Business Impact Analysis - Konzept und Realisierung einer ganzheitlichen Geschäftsanalyse
(2011) Radeschütz, Sylvia Natalie; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
More and more companies are establishing business processes that integrate different enterprise applications instead of executing the individual business functions separately. To this end, business processes are formalized and modeled as workflows and executed with computer support on a workflow management system. In this way, companies can react to new market situations quickly, ideally ahead of the competition, and adapt their products or business processes, for example, to new customer wishes. Adapting workflows to new requirements in the best possible way and optimizing business processes effectively requires a precise analysis of the execution data. Besides timing measurements of the execution, the analysis also covers, among other things, the data inputs and outputs referenced by the business processes. To keep workflows truly competitive, however, today's analysis approaches lack a decisive aspect: taking into account information from other enterprise applications that were not integrated into the workflow itself but nevertheless contain valuable data for important modeling decisions. Today, such holistic business process optimizations are possible only with great manual effort in integrating and analyzing the information. Carrying out a holistic optimization therefore raises numerous new problems in these areas that must be solved first: the semi-automatic matching of business process data with operational data from other enterprise applications, and their joint analysis using new methods. This thesis presents concepts for realizing solutions for the integration and holistic analysis of these data. This kind of comprehensive business analysis is called Business Impact Analysis (BIA) here. The integration for BIA is the focus of this thesis. The integration method applies match rules to the workflow data and the operational data, following a well-defined control strategy that determines the optimal order of rule applications, in order to combine them at the schema level. For each combination, a similarity value between the combined data is computed. If the similarity value lies above a threshold set by the user, the combination is added to the result set as a new match. The rule base contains semi-automatic rules for combining data that were previously annotated with semantic terms from an ontology. The rule base also includes automatic rules for matching data that are not annotated or only partially annotated. In addition, filter rules in the rule base adjust the computed similarity values according to how well the structures of the respective combination elements agree. The combination and filter rules are based on existing and new matching techniques. The analysis for BIA in this thesis presupposes the creation of an integrated data warehouse. The thesis presents two architecture models for building such a warehouse and gives a detailed assessment of their applicability to BIA. The analysis methods for BIA can be carried out on both warehouse architectures. The thesis develops SQL analysis operators for evaluating the integrated data in OLAP queries in such a way that new insights for business process optimization emerge. It also gives pointers on how the operators can be used in mining methods for BIA. The integration, warehousing, and analysis approaches are realized in a prototypical implementation. Experiments examine the effectiveness of the match rules and explain the gain in the quantity and quality of match results compared with existing methods. Further measurement scenarios show the usability of the warehousing and analysis approaches for BIA. Overall, the thesis shows that applying the integration rules and analysis operators on the integrated data warehouse can create a solid foundation for a profitable, holistic business process optimization.
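The threshold step described in the abstract above (compute a similarity value per combination, keep the combination as a match if the value lies above a user-defined threshold) can be sketched as follows. The similarity measure here, a Jaccard overlap of underscore-separated name tokens, is a stand-in chosen only for illustration; the thesis's actual rule base (semantic annotations, automatic rules, filter rules, control strategy) is not reproduced, and all function and field names are hypothetical.

```python
from typing import List, Set, Tuple


def tokens(name: str) -> Set[str]:
    """Split a schema element name into lowercase tokens."""
    return {t for t in name.lower().split("_") if t}


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of the token sets (illustrative stand-in measure)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def match(workflow_fields: List[str],
          operational_fields: List[str],
          threshold: float) -> List[Tuple[str, str, float]]:
    """Keep every combination whose similarity lies above the threshold."""
    result = []
    for w in workflow_fields:
        for o in operational_fields:
            score = similarity(w, o)
            if score > threshold:
                result.append((w, o, score))
    return result


matches = match(
    ["customer_id", "order_total", "order_id"],       # workflow data (assumed)
    ["Customer_ID", "total_order", "shipping_date"],  # operational data (assumed)
    threshold=0.8,
)
```

With the threshold at 0.8, `order_id` finds no partner because it shares only one of three tokens with its closest candidates; lowering the threshold would trade precision for more candidate matches.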
Open Access
Concepts and methods for the design, configuration and selection of machine learning solutions in manufacturing
(2021) Villanueva Zacarias, Alejandro Gabriel; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
The application of Machine Learning (ML) techniques and methods is common practice in manufacturing companies. They assign teams to the development of ML solutions to support individual use cases. This dissertation uses the term ML solution for the set of software components and learning algorithms that deliver a predictive capability based on available use case data, together with their (hyper)parameters and technical settings. Currently, development teams face four challenges that complicate the development of ML solutions. First, they lack a formal approach to specify ML solutions that can trace the impact of individual solution components on domain-specific requirements. Second, they lack an approach to document the configurations chosen to build an ML solution and thereby ensure the reproducibility of the performance obtained. Third, they lack an approach to recommend and select ML solutions that is intuitive for non-ML experts. Fourth, they lack a comprehensive sequence of steps that ensures both best practices and the consideration of technical and domain-specific aspects during the development process. Overall, the inability to address these challenges leads to longer development times and higher development costs, as well as less suitable ML solutions that are more difficult to understand and to reuse. This dissertation presents concepts to address these challenges. They are Axiomatic Design for Machine Learning (AD4ML), the ML solution profiling framework and AssistML. AD4ML is a concept for the structured and agile specification of ML solutions. AD4ML establishes clear relationships between domain-specific requirements and concrete software components. AD4ML specifications can thus be validated regarding domain expert requirements before implementation. The ML solution profiling framework employs metadata to document important characteristics of data, technical configurations, and parameter values of software components as well as multiple performance metrics. 
These metadata constitute the foundations for the reproducibility of ML solutions. AssistML recommends ML solutions for new use cases. AssistML searches among documented ML solutions for those that better fulfill the performance preferences of the new use case. The selected solutions are then presented to decision-makers in an intuitive way. Each of these concepts was evaluated and implemented. Combined, these concepts offer development teams a technology-agnostic approach to build ML solutions. The use of these concepts brings multiple benefits, namely shorter development times, more efficient development projects, and better-informed decisions about the development and selection of ML solutions.
Open Access
Constraints and triggers to enhance XML-based data integration systems
(2009) Lu, Jing; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
XML is becoming one of the main technological ingredients of the Internet. It is now accepted as the standard for information exchange. XML-based data integration systems, which enable sharing of and cooperation with legacy data sources, are becoming increasingly important data service providers on the web. These services can provide the users with a uniform interface to a multitude of data sources such as relational databases, XML files, text files, delimited files, Excel files, etc. Users can thus focus on what they want, rather than think about how to obtain the answers. Therefore, users do not have to carry out tedious tasks such as finding the relevant data sources, interacting with each data source in isolation using the local interface and combining data from multiple data sources. Users always expect better query performance and data consistency from data integration systems. This work proposes an approach to support constraints and triggers in the XML-based data integration system in order to optimize queries and to enforce data consistency. Constraints and triggers have long been recognized to be useful in semantic query optimization and data consistency enforcement in relational databases. This work first gives an approach to use constraints from the heterogeneous data sources to semantically optimize queries submitted to the XML-based data integration system. Different constraints from the data sources are first integrated into a uniform constraint model. Then the constraints in the uniform constraint model are stored in the constraint repository. Traditional semantic query optimization techniques for relational databases are analyzed, and three of them are reused and applied by the semantic query optimizer for the XML-based data integration system: detection of empty results, join elimination and predicate elimination. Performance is analyzed according to the data source type and the data volume. 
The semantic query optimizer works best when the data sources are non-relational, the data volume is huge and the execution cost is expected to be high. In order to make the XML-based data integration system fully equipped with data manipulation capabilities, programming frameworks which support update at the integration level are being developed. This work discusses how to realize update in the XML-based data integration system under the Service Data Objects programming framework. When the user is permitted to submit updates, it is necessary to guarantee data integrity and enforce active business logics in the data integration system. This work presents an approach by which active rules including integrity constraints are enforced by XQuery triggers. An XQuery trigger model in conformance to XQuery update model proposed by W3C is defined. How to define active rules and integrity constraints by XQuery triggers is discussed. Triggers and constraints are stored in the trigger repository. The architecture supporting XQuery trigger service in the XML-based data integration system is proposed. Important components including event detection, trigger scheduling, condition evaluation, action firing and trigger termination are discussed. The whole XQuery trigger service architecture above a data integration system is implemented in BEA AquaLogic DataService Platform under the Service Data Objects programming framework. Experiments show active rules and integrity constraints are enforced easily, efficiently and conveniently at the global level. Constraints and triggers play an important role in XML-based data integration systems. Using constraints and triggers in the XML-based data integration system we can efficiently improve query performance and enforce data consistency.
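One of the three semantic query optimization techniques named in the abstract above, detection of empty results, can be illustrated with a small sketch: if a query predicate contradicts a known constraint of the data source, the result is provably empty and the source need not be queried at all. The sketch below assumes simple closed value ranges per field; it does not reproduce the thesis's uniform constraint model or its XQuery-level optimizer, and the names `Range` and `is_provably_empty` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Range:
    """Closed interval [low, high] of values for one field."""
    low: float
    high: float

    def intersects(self, other: "Range") -> bool:
        return self.low <= other.high and other.low <= self.high


def is_provably_empty(constraints: Dict[str, Range],
                      predicates: Dict[str, Range]) -> bool:
    """True if some predicate range cannot overlap the constrained range.

    Fields without a known constraint never prove emptiness.
    """
    for field, wanted in predicates.items():
        allowed = constraints.get(field)
        if allowed is not None and not allowed.intersects(wanted):
            return True
    return False


# Assumed source constraint: 0 <= age <= 150 (e.g. from a CHECK constraint).
constraints = {"age": Range(0, 150)}
skip_source = is_provably_empty(constraints, {"age": Range(200, float("inf"))})
must_query = not is_provably_empty(constraints, {"age": Range(18, 30)})
```

The saving is largest when the skipped source is expensive to query, which is consistent with the abstract's observation that the optimizer pays off most for non-relational sources and large data volumes.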
Open Access
Data provisioning in simulation workflows
(2017) Reimann, Peter; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
Computer-based simulations are becoming more and more important, e.g., to imitate real-world experiments such as crash tests, which would otherwise be too expensive or not feasible at all. Here, simulation workflows may be used to control the interaction with simulation tools performing the necessary numerical calculations. The input data needed by these tools often come from diverse data sources that manage their data in a multiplicity of proprietary formats. Hence, simulation workflows additionally have to carry out many complex data provisioning tasks. These tasks filter and transform heterogeneous input data in such a way that the underlying simulation tools can properly ingest them. Furthermore, some simulations use different tools that need to exchange data with each other. Here, even more complex data transformations are needed to cope with the differences in data formats and data granularity expected by the tools involved. Nowadays, scientists conducting simulations typically have to design their simulation workflows on their own. So, they have to implement many low-level data transformations that realize the data provisioning for and the data exchange between simulation tools. In doing so, they spend time on workflow design, which keeps them from concentrating on their core issue, i.e., the simulation itself. This thesis introduces several novel concepts and methods that significantly ease the design of the complex data provisioning in simulation workflows. Firstly, it addresses the issue that most existing workflow systems offer multiple and diverse data provisioning techniques. So, scientists are frequently overwhelmed when selecting techniques that are appropriate for their workflows. This thesis discusses how to conquer the multiplicity and diversity of available techniques by their systematic classification. 
The resulting classes of techniques are then compared with each other considering relevant functional and non-functional requirements for data provisioning in simulation workflows. The major outcome of this classification and comparison is a set of guidelines that assist scientists in choosing proper data provisioning techniques. Another problem with existing workflow systems is that they often do not support all kinds of data resources or data management operations required by concrete computer-based simulations. So, this thesis proposes extensions of conventional workflow languages that offer a generic solution to data provisioning in arbitrary simulation workflows. These extensions allow for specifying any data management operation that may be described via the query or command languages of involved data resources, e.g., arbitrary SQL statements or shell commands. The proposed extensions of workflow languages still do not remove the burden from scientists to specify many complex data management operations using low-level query and command languages. Hence, this thesis introduces a novel pattern-based approach that even further enhances the abstraction support for simulation workflow design. Instead of specifying many workflow tasks, scientists only need to select a small number of abstract patterns to describe the high-level simulation process they have in mind. Furthermore, scientists are familiar with the parameters to be specified for the patterns, because these parameters correspond to terms or concepts that are related to their domain-specific simulation methodology. A rule-based transformation approach offers flexible means to finally map high-level patterns onto executable simulation workflows. Another major contribution is a pattern hierarchy arranging different kinds of patterns according to clearly distinguished abstraction levels. 
This facilitates a holistic separation of concerns and provides a systematic framework to incorporate different kinds of persons and their various skills into workflow design, e.g., not only scientists, but also data engineers. Altogether, the pattern-based approach conquers the data complexity associated with simulation workflows, which allows scientists to concentrate on their core issue again, namely on the simulation itself. The last contribution is a complementary optimization method to increase the performance of local data processing in simulation workflows. This method introduces various techniques that partition relevant local data processing tasks between the components of a workflow system in a smart way. Thereby, such tasks are either assigned to the workflow execution engine or to a tightly integrated local database system. Corresponding experiments revealed that, even for a moderate data size of about 0.5 MB, this method is able to reduce workflow duration by nearly a factor of 9.
Open Access
Deep Business Optimization : concepts and architecture for an analytical business process optimization platform
(2015) Niedermann, Florian; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
Businesses find themselves today in a highly demanding world: The proliferation of Information Technology combined with highly globalized markets and increasingly distributed value creation has created an environment of unprecedented volatility and complexity. To be able to compete in such an environment, businesses need to be able to rapidly adapt both their service and product offerings and continuously improve their internal efficiency. For many businesses, this implies the need to be able to continuously refine and optimize their business processes. The ability to efficiently and effectively optimize business processes has hence become a critical success factor for many businesses and industries. This challenge was already well recognized in the 1990s under the umbrella term of Business Process Reengineering (BPR). To overcome the then-dominant split into functional silos, BPR advised business executives to engage in large-scale, revolutionary process changes - usually taking a clean-sheet approach to process design that designed the target state regardless of the status quo. While this approach has been successful in many situations, it has also proven to be highly risky and associated with significant implementation cost. Further, a reengineering project can take considerably longer to implement than today's business cycles would allow. Hence, both business and research have over the past years shifted their focus towards more gradual, evolutionary process optimization methodologies. Compared to revolutionary, clean-sheet process optimization, tool support is much more important in evolutionary optimization. As the status quo is taken as the starting point, the success of the optimization is contingent on understanding it as well as possible and hence depends on three optimization capabilities: First, the optimization needs to take into account as much data about the process and its context as possible. 
Second, the analyst conducting the optimization needs to thoroughly analyse the data and discover core, often non-obvious, insights that are relevant to the optimization goals. Third, the analyst needs to translate these insights into concrete optimizations of the process that are ideally based on best practices in the respective application domain. In practice, businesses often struggle to excel in these three capabilities, which is at least partially attributable to the (lack of) tool support for evolutionary optimization: Current tools typically offer no or insufficient data integration capabilities, possess only limited analysis support and leave it up to the subjective abilities and judgement of the analyst to spot and properly apply optimizations. As a result, optimization is often both inefficient, i.e., takes longer and is more costly than necessary, and ineffective, i.e., does not yield the full potential with regard to the optimization goals. To address this challenge, this thesis presents the deep Business Optimization Platform (dBOP) that combines data integration and advanced analysis capabilities with formalized optimization best practices (so-called patterns) to enhance both the efficiency and effectiveness of Business Process Optimization (BPO): The deep Data Integration (dDI) layer of the dBOP integrates flow-oriented process execution data with subject-oriented operational data sources. While the data integration layer greatly builds on existing schema and data integration techniques, it utilizes its own set of matching rules that take advantage of the specific properties of process data (such as the propagation of matches through assignment between different variables). The deep Business Process Analytics (dBPA) layer builds on the integrated data layer and generates optimization-relevant insights through the computation of key metrics and the application of data mining techniques. 
The dBPA layer of the dBOP makes the application of data mining both powerful and accessible to novice users by tying data mining techniques to certain optimization use cases, effectively reducing required user inputs to a minimum. The results of the dBPA layer are stored in the so-called Process Insight Repository (PIR), a process repository that augments the process model with optimization-relevant insights. In this way, optimization results can be shared between analysts and accessed in different contexts. Finally, the deep Business Process Optimization (dBPO) layer combines the insights contained in the PIR with formalized optimization best practices and a comprehensive execution strategy to present the analyst with concrete optimization proposals, including a preview of the expected effects. For those proposals that the analyst confirms, the dBOP automatically and correctly rewrites the process. In addition to introducing the main concepts of the dBOP, the thesis provides a rigorous evaluation of its capabilities through a qualitative case study, its prototypical implementation and, finally, an empirical experiment involving 24 graduate students applying the dBOP to a set of different optimization tasks. The empirical experiment in particular highlights that the dBOP is a viable approach to increasing the efficiency and effectiveness of evolutionary BPO. Finally, the thesis discusses ongoing extensions and future work on the dBOP approach - both within the scope of classical business processes and in the application to other domains, such as manufacturing.
Open Access
Deletion of content in large cloud storage systems
(2017) Waizenegger, Tim; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
This thesis discusses the practical implications and challenges of providing secure deletion of data in cloud storage systems. Secure deletion is a desirable functionality to some users, but a requirement to others. The term secure deletion describes the practice of deleting data in such a way that it cannot be reconstructed later, even by forensic means. This work discusses the practice of secure deletion as well as existing methods that are used today. When moving from traditional on-site data storage to cloud services, these existing methods are no longer applicable. For this reason, it presents the concept of cryptographic deletion and points out the challenge behind implementing it in a practical way. A discussion of related work in the areas of data encryption and cryptographic deletion shows that a research gap exists in applying cryptographic deletion in an efficient, practical way to cloud storage systems. The main contribution of this thesis, the Key-Cascade method, solves this issue by providing an efficient data structure for managing large numbers of encryption keys. Secure deletion is practiced today by individuals and organizations who need to protect the confidentiality of data after it has been deleted. It is mostly achieved by means of physical destruction or overwriting in local hard disks or large storage systems. However, these traditional methods of overwriting data or destroying media are not suited to large, distributed, and shared cloud storage systems. The known concept of cryptographic deletion describes storing encrypted data in an untrusted storage system, while keeping the key in a trusted location. Given that the encryption is effective, secure deletion of the data can now be achieved by securely deleting the key. Whether encryption is an acceptable protection mechanism must be decided either by the legislature or by the customers themselves. 
This depends on whether cryptographic deletion is done to satisfy legal requirements or customer requirements. The main challenge in implementing cryptographic deletion lies in the granularity of the delete operation. Storage encryption providers today either require deleting the master key, which deletes all stored data, or require expensive copy and re-encryption operations. In the literature, a few constructions can be found that provide an optimized key management. The contributions of this thesis, found in the Key-Cascade method, expand on those findings and describe data structures and operations for implementing efficient cryptographic deletion in a cloud object store. This thesis discusses the conceptual aspects of the Key-Cascade method as well as its mathematical properties. In order to enable production use of a Key-Cascade implementation, it presents multiple extensions to the concept. These extensions improve the performance and usability and also enable frictionless integration into existing applications. With SDOS, the Secure Delete Object Store, a working implementation of the concepts and extensions is given. Its design as an API proxy is unique among the existing cryptographic deletion systems and allows integration into existing applications, without the need to modify them. The results of performance evaluations, conducted with SDOS, show that cryptographic deletion is feasible in practice. With MCM, the Micro Content Management system, this thesis also presents a larger demonstrator system for SDOS. MCM provides insight into how SDOS can be integrated into and deployed as part of a cloud data management application.
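The general idea of cryptographic deletion described in this abstract can be illustrated with a minimal Python sketch. It is not the Key-Cascade method or SDOS: the class name `CryptoDeleteStore`, its methods, and the hash-based XOR keystream are all hypothetical stand-ins chosen so the example runs with the standard library only. The toy cipher is for illustration and must not be used to protect real data.

```python
import hashlib
import secrets


def _keystream(key: bytes, length: int) -> bytes:
    """Derive a toy keystream from SHA-256 in counter mode (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


class CryptoDeleteStore:
    """Hypothetical store: ciphertext lives in untrusted storage, keys in a trusted place."""

    def __init__(self) -> None:
        self.untrusted_blobs: dict[str, bytes] = {}  # e.g. a cloud object store
        self.trusted_keys: dict[str, bytes] = {}     # e.g. a local key store

    def put(self, name: str, data: bytes) -> None:
        key = secrets.token_bytes(32)  # one fresh key per object (assumption)
        self.trusted_keys[name] = key
        self.untrusted_blobs[name] = xor_cipher(key, data)

    def get(self, name: str) -> bytes:
        key = self.trusted_keys[name]  # raises KeyError once the key is gone
        return xor_cipher(key, self.untrusted_blobs[name])

    def secure_delete(self, name: str) -> None:
        # Only the key is removed; the ciphertext may remain in the cloud.
        # A real system would also have to erase the key securely,
        # which dropping a Python reference does not guarantee.
        del self.trusted_keys[name]
```

With one key per object, the trusted key store grows with the number of objects. The abstract names exactly this kind of key management at fine deletion granularity as the main challenge that the Key-Cascade method addresses.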
Open Access
The Enterprise Data Marketplace : a platform for democratizing company data
(2024) Eichler, Rebecca; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
In the era of big data, multitudes of data are generated and collected in companies. This data contains a potential value that can be leveraged to gain new insights, e.g., for enhancing business models or reengineering industrial products. Extracting data value requires that this data is available for use. Yet, studies show that significant amounts of the data remain unused in companies. In this regard, data democratization initiatives, with the goal of empowering company employees to find, understand, access, use, and share data, are gaining in importance. Towards this end, data marketplaces are studied as metadata-based platforms to facilitate the exchange and provisioning of data and data-related services. However, data marketplaces are mainly investigated for the exchange of data and services between organizations or private individuals, i.e., as external data marketplaces. Little research focuses on the use of data marketplaces in the company-internal context, i.e., as an Enterprise Data Marketplace (EDMP). Topics of how the EDMP differs from an external marketplace, the scope of its offerings and functionality, challenges that arise in the company-internal context, or how the EDMP can be embedded in and leverage the existent company system and tool landscape have not been investigated in detail thus far. In this thesis, the Enterprise Data Marketplace is examined as a platform for democratizing data in companies, and in this context, the above-listed gaps are addressed. To this end, four research goals (RGs) are put forward: (RG1) the identification of the processes and challenges company employees face in finding, understanding, accessing, and sharing data in the enterprise without an EDMP; (RG2) the identification of the distinctive aspects of an EDMP; (RG3) establishing an architectural foundation for building an EDMP; and (RG4) the goal of leveraging existent metadata in the company tool and system landscape in the EDMP. 
The research goals are covered through nine research contributions. These comprise the current data provider and data consumer journeys and challenges, an EDMP type distinction based on an EDMP definition, as well as a presentation of its distinctive characteristics, requirements, and challenges. An enterprise integration and platform architecture, together with an approach for leveraging existent metadata, provides the foundation for building an EDMP. The feasibility of the concepts put forward in this thesis is demonstrated through an EDMP prototype and an evaluation based on an experiment and qualitative assessment. The evaluation shows that the EDMP is well-suited for the effective realization of data democratization within companies and that it not only addresses several of the current issues data providers and consumers face but also increases the efficiency and reduces the complexity of accessing data. This thesis therefore introduces the EDMP as a platform for democratizing company data and lays the foundation for establishing the EDMP within a company.
Open Access
Flexible processing of streamed context data in a distributed environment
(2014) Cipriani, Nazario; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
Nowadays, stream-based data processing occurs in many context-aware application scenarios, such as in context-aware facility management applications or in location-aware visualization applications. In order to process stream-based data in an application-independent manner, Data Stream Processing Systems (DSPSs) emerged. They typically translate a declarative query to an operator graph, place the operators on stream processing nodes and execute the operators to process the streamed data. Context-aware stream processing applications often have different requirements although they rely on the same processing principle, i.e. data stream processing. These requirements exist because context-aware stream processing applications differ in functional and operational behavior as well as in their processing requirements. These facts are challenging on their own. As a key enabler for the efficient processing of streamed data, the DSPS must be able to integrate this specific functionality seamlessly. Since the processing of data streams is usually time-critical, custom functionality should be integrated seamlessly into the processing task of a DSPS to prevent the formation of isolated solutions and to support the exploitation of synergies. Depending on the domain of interest, data processing often depends on highly domain-specific functionalities, e.g. for the application of a location-aware visualization pipeline displaying a three-dimensional map of its surroundings. The application runs on a mobile device and consists of many interconnected operations that form a network of operators called stream processing graph (SP graph). First, the friends’ locations must be collected and connected to their public profile. However, to enable the application to run smoothly, the presence of a Graphics Processing Unit (GPU) is mandatory for some parts of data processing. 
To solve that challenge, we have developed concepts for a flexible DSPS that allows the integration of specific functionality to enable a seamless integration of applications into the DSPS. To this end, an architecture is proposed. A DSPS based on this architecture can be extended by integrating additional operators responsible for data processing and services realizing additional interaction patterns with context-aware applications. However, this specific functionality is often subject to deployment and run time constraints. Therefore, an SP graph model has been developed which reflects these constraints by allowing the graph to be annotated with constraints, e.g. to restrict the execution of operators to certain processing nodes or to specify that an operator requires a GPU. The data involved in the processing steps is often subject to restrictions w.r.t. the way it is accessed and processed. Users participating in the process might not want to expose their current location to potentially unknown parties, restricting e.g. data access to known ones only. Therefore, in addition to the flexible integration of specialized operators, security aspects must also be considered, limiting the access to data as well as the granularity at which data is made available. We have developed a security framework that defines three different types of security policies: Access Control (AC) policies controlling data access, Process Control (PC) policies influencing how data is processed, and Granularity Control (GC) policies defining the Level of Detail (LOD) at which the data is made available. The security policies are interpreted as constraints, which are supported by augmenting the SP graph with the relevant security policies. The operator placement in a DSPS is very important, as it deeply influences SP graph execution. Every stream-based application requires a different placement of SP graphs according to its specific objectives, e.g. 
bandwidth should not fall below 500 MBit/s and is more important than latency. This fact constrains operator placement. As objectives might conflict with each other, operator placement is subject to trade-offs. Knowing the bandwidth requirements of a certain application, an application developer can clearly identify the specific Quality of Service (QoS) requirements for the correct distribution of the SP graph. These requirements are a good indicator for the DSPS to decide how to distribute the SP graph to meet the application requirements. Two applications within the same DSPS might have different requirements. For example, if interactivity matters, a stream-based game application might first and foremost need latency to be minimized in order to remain fast and responsive. We have developed a multi-target operator placement (M-TOP) algorithm which allows the DSPS to find a suitable deployment, i.e. a distribution of the operators in an SP graph which satisfies a set of predefined QoS requirements. The M-TOP approach considers operator-specific deployment constraints as well as QoS targets.
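To make the notion of constrained operator placement more concrete, the following Python sketch searches exhaustively for a placement that respects per-operator deployment constraints (allowed nodes, GPU requirement) and a minimum bandwidth on every edge of the SP graph. It is a brute-force illustration under stated assumptions, not the M-TOP algorithm: the function names, the dictionary layout, and the example nodes and bandwidth figures are all hypothetical.

```python
from itertools import product
from math import inf


def link_bandwidth(bandwidth, node_a, node_b):
    """Bandwidth in MBit/s between two nodes; co-located operators are unconstrained."""
    if node_a == node_b:
        return inf
    return bandwidth.get(frozenset((node_a, node_b)), 0)


def find_placement(operators, edges, nodes, bandwidth, min_bandwidth):
    """Return the first operator-to-node mapping satisfying all constraints, else None.

    operators: {name: {"needs_gpu": bool, "allowed": set of node names or None}}
    edges:     [(upstream_operator, downstream_operator), ...]
    nodes:     {name: {"gpu": bool}}
    bandwidth: {frozenset({node_a, node_b}): MBit/s}
    """
    names = list(operators)
    # Deployment constraints: filter the candidate nodes per operator first.
    candidates = []
    for name in names:
        spec = operators[name]
        candidates.append([
            node for node, caps in nodes.items()
            if (not spec["needs_gpu"] or caps["gpu"])
            and (spec["allowed"] is None or node in spec["allowed"])
        ])
    # QoS target: every edge of the SP graph must meet the minimum bandwidth.
    for choice in product(*candidates):
        placement = dict(zip(names, choice))
        if all(
            link_bandwidth(bandwidth, placement[a], placement[b]) >= min_bandwidth
            for a, b in edges
        ):
            return placement
    return None


# Hypothetical example: a sensor source pinned to a phone, a renderer needing a GPU.
operators = {
    "source": {"needs_gpu": False, "allowed": {"phone"}},
    "render": {"needs_gpu": True, "allowed": None},
}
nodes = {"phone": {"gpu": False}, "server": {"gpu": True}, "laptop": {"gpu": True}}
bandwidth = {frozenset(("phone", "server")): 100, frozenset(("phone", "laptop")): 600}
placement = find_placement(operators, [("source", "render")], nodes, bandwidth, 500)
```

In this example the renderer lands on the laptop, because the phone-to-server link is below the 500 MBit/s target. Exhaustive search grows exponentially with the number of operators, so it only serves to show what a feasible deployment is; handling several possibly conflicting QoS targets requires a dedicated algorithm such as the one the thesis proposes.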
Open Access
Integration management : a virtualization architecture for adapter technologies
(2015) Wagner, Ralf; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
Integration management (IM) provides a means of systematically dealing with integration technologies. It abstracts from integration technologies so that software development is shielded from integration tasks. The achieved integration independence significantly alleviates maintenance and evolution of IT environments and reduces the overall complexity and costs of IT landscapes.
Open Access
Integration von Data Mining und Online Analytical Processing : eine Analyse von Datenschemata, Systemarchitekturen und Optimierungsstrategien
(2003) Schwarz, Holger; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
The technical means of capturing and permanently storing data are now so mature that large data collections are available, particularly in companies and other organizations. These data collections, often referred to as data warehouses, contain all relevant information about the organizations themselves, the processes running within them, and their interaction with other organizations. In many cases, the targeted analysis of these data collections is the decisive success factor for organizations. A wide variety of proven approaches is available for analyzing the data in a data warehouse. Two of the most important are Online Analytical Processing (OLAP) and data mining. The two set different priorities and have so far generally been used largely in isolation. This thesis first shows that a comprehensive analysis of the data in a data warehouse can only be achieved through the integrated use of both analysis approaches. Individual questions arising from this need for integration are discussed in detail. Among the questions considered is the appropriate modeling of data in a data warehouse. The assessment of common modeling approaches takes particular account of the requirements that result from the described integration approach. As a result, a conceptual data model is presented that structures information in a way that is equally suitable for OLAP and data mining. In the area of logical modeling, those schema types are then identified that suitably support the integration of the analysis approaches. In the next step, the thesis addresses the system architectures, which differ for data mining and OLAP. A comprehensive discussion of these architectures reveals a number of deficits. 
This ultimately leads to an extended system architecture that eliminates the weaknesses and suitably supports the intended integration. The extended system architecture includes a component for the application-independent optimization of different analysis applications. A third focus of this thesis is the identification of suitable optimization approaches for this component. On the one hand, the approaches are assessed qualitatively. On the other hand, the optimization potential of the individual approaches is demonstrated on the basis of extensive series of measurements.
Open Access
Issues on distributed caching of spatial data
(2017) Lübbe, Carlos; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
The amount of digital information about places has grown rapidly to date. With the spread of mobile, internet-capable devices, this information can now be accessed at any time and from anywhere. In the course of this development, numerous location-based applications and services have become popular. Digital shopping assistants, tourist information services, and geosocial applications are among the most popular examples. Rising user numbers and rapidly growing data volumes pose serious challenges for providers of location-related information. The data provisioning process must be designed efficiently to enable cost-efficient operation. In addition, it should be possible to assign resources flexibly enough to compensate for load imbalances between system components. Furthermore, data providers must be able to scale processing capacity with rising and falling query load. In this thesis, we present a distributed cache for location-based data. In the distributed cache, replicas of the most frequently used data are kept in volatile memory by multiple independent servers. With our approach, the challenges for providers of location-related information can be addressed as follows: First, a caching strategy designed specifically for the access patterns of location-based applications increases overall efficiency, since a considerable share of the cached results of previous queries can be reused. In addition, our load-balancing methods, developed specifically for the geo context, compensate for dynamic load imbalances. Finally, our distributed protocols for adding and removing servers enable providers of location-related information to adapt processing capacity to rising or falling query load. 
In this document, we first examine the requirements of data provisioning in the context of location-based applications. We then discuss possible design patterns and derive an architecture for a distributed cache. In the course of this work, several concrete implementation variants were developed, which we present and compare with one another in this document. Our evaluation shows not only the feasibility in principle but also the effectiveness of our caching approach for achieving scalability and availability in the context of providing location-based data.
Open Access
Konzepte und Realisierung einer kontextbasierten Intranet-Suchmaschine
(2007) Mangold, Christoph M.; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
Search engines are an important tool for finding documents - not only on the World Wide Web, but equally in company intranets. Conventional document search engines evaluate only the content, i.e. the text of the documents, to answer search queries. The approach of this thesis is based on including not only the text but also the context of the documents in the evaluation. To this end, the context information of the documents is extracted from the company's databases. Context-based search is not to be seen as an alternative to conventional text-based search, but as an extension of it. As is common with many search engines, the user specifies the information need not as an expression in a formal language but as a keyword query. To determine the document contexts and as an abstraction of company databases, a graph-based model is introduced, the ContextGraph. The nodes of the ContextGraph represent database data on the one hand and the documents captured by the system on the other. The edges of the ContextGraph model foreign-key relationships and relationships between tuples and attribute values in the database. Each edge is weighted with a measure of the content-related or semantic distance between the two nodes it connects. The ContextGraph forms the basis for computing the context of documents, which is determined by an incremental shortest-path search in the ContextGraph. When processing search queries and ranking the results, not only the text but also the context of documents, i.e. the terms contained in the context of the documents, is taken into account. To make this possible, scoring measures are designed for the context-based relevance of documents with respect to search terms, for the context-based importance of documents, and for the context-based similarity of documents. 
These scoring measures are implemented as an extension of the tf.idf measure, which is well established in the search engine field, for determining term weights in the vector space model. To test the approach in practice, an architecture is designed and, building on it, a prototypical system for context-based search is implemented. To achieve scalability, the search engine follows the index-based approach. At indexing time, the data is collected and stored in data structures, so-called indexes, that support efficient processing of search queries at query time. The implemented system is analyzed using two scenarios. In each, alternative implementations of context-based search are compared with an implementation of purely text-based search. Particular attention is paid to the scalability of the system and to a parameter for setting the context size considered by the system. On the one hand, the measurement results quantify the additional effort required by considering the context compared with text search. On the other hand, the quality of the search results is analyzed. The evaluation of the measurements shows a moderate overhead caused by considering the context, which - depending on the implementation of the index structures - manifests itself more in the effort for processing search queries or more in the effort for building the index. In both analyzed scenarios, however, considering context information yields a clear improvement in the quality of the search results.
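The idea of deriving a document's context from a weighted graph can be sketched in Python as a shortest-path search with a distance bound. This is only an illustration of the principle: the function `context_of`, the adjacency-list layout, the node names, and the use of a fixed `max_distance` as the context-size parameter are assumptions, and the sketch uses plain Dijkstra search rather than the incremental procedure developed in the thesis.

```python
import heapq


def context_of(graph, start, max_distance):
    """Return {node: distance} for all nodes within max_distance of start.

    graph: {node: [(neighbor, edge_weight), ...]} with non-negative weights,
           where a weight stands for the semantic distance between two nodes.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = dist + weight
            # Expand only while the path stays inside the context bound.
            if candidate <= max_distance and candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    del best[start]  # the document itself is not part of its own context
    return best


# Hypothetical graph: one document linked to database tuples and attribute values.
graph = {
    "doc1": [("tuple_a", 1.0)],
    "tuple_a": [("doc1", 1.0), ("value_x", 0.5), ("tuple_b", 2.0)],
    "tuple_b": [("tuple_a", 2.0), ("value_y", 0.5)],
}
context = context_of(graph, "doc1", max_distance=2.0)
```

Here the context of `doc1` contains `tuple_a` and `value_x`, while `tuple_b` lies at distance 3.0 and is left out. Raising `max_distance` enlarges the context at the cost of more search effort, which mirrors the trade-off the abstract reports for its context-size parameter.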
Open Access
Konzepte und Techniken der Datenversorgung für komponentenbasierte Informationssysteme
(1999) Sellentin, Jürgen; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
Computer-based information systems are an indispensable tool in many industries today. Without them, the complexity of workflows and the associated volume of data could hardly be managed. This is especially true for the development of new products, in which a very large amount of data from previous work and underlying guidelines must first be taken into account. At the same time, a large amount of new data is created during development, which later serves as the basis for production. We therefore consider computer-aided design environments as a representative example of data-intensive information systems in which large amounts of data are both read and created or written. Using this scenario, we discuss and illustrate the individual aspects and problems. Beyond the pure discussion of data provisioning strategies, we also evaluate selected methods using a prototype. The basis for this is the newly developed binding of the SDAI (Standard Data Access Interface) of STEP to the Java language (ISO 10303-27). This binding was substantially shaped as part of this thesis and enables simultaneous access to different data sources via different data provisioning strategies. With our prototypes, we compare two different CORBA-based solutions with a JDBC-based approach. The data sources and their access interfaces are integrated into the SDAI interface as so-called Data Modules. It turns out that CORBA can, under certain circumstances, be used to realize efficient data provisioning, but that the underlying model does not correspond to the actual basic idea of CORBA. In particular, only a few of the standardized CORBA components (so-called Services and Facilities) can be used.
Open Access
Korrektheit und deren Durchsetzung im Umfeld langdauernder Abläufe
(2001) Schwenkreis, Friedemann; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
In this dissertation, a model is developed for transferring transactional properties, as known from the database field, to long-running processes. In particular, the thesis focuses on the problem of concurrent access to the same data objects. For the formal description of long-lived processes, the presented approach assumes that a process can be described using the so-called ConTract model. For these processes, a transformation to a formal notation, or rather a formal execution machine, is developed, and a notion of history is defined that resembles the transactional notion of history. Building on this, a correctness notion is developed that makes it possible to decide on the correctness of processes. The thesis then deals with translating the correctness notion into a runtime system for ensuring correctness. It specifically addresses the aspects that arise when embedding the mechanisms into the prototypical runtime system of ConTracts (APRICOTS). In particular, the aspects of distribution and fault tolerance are discussed here. The thesis concludes by identifying open areas of work as well as limitations that the ConTract model still has. It becomes clear that transactional guarantees cannot readily be generalized and that new paths may have to be taken to adequately support further fields of application.
Open Access
Metadata management in virtual product development to enable cross-organizational data analytics
(2024) Ziegler, Julian; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
Due to advancing digitalization, companies are increasingly adopting computer-aided technologies. Especially in product development, computer-aided technologies enable a gradual shift from physical to virtual prototypes. This shift towards virtual product development includes design, simulation, testing, and optimization of products, and reduces costs and time needed for these tasks. Companies with strong activities in the field of virtual product development generate large amounts of heterogeneous data and wish to mine these data for knowledge. In this context, metadata is a key enabler for data discovery, data exploration, and data analyses but is often neglected. The diversity in the structure and formats of virtual product development data makes it difficult for domain experts to analyze them. Domain experts struggle with this task because such engineering data are not sufficiently described with metadata. Moreover, data in companies are often isolated in data silos and difficult for domain experts to explore. This calls for an adequate data and metadata management that is able to cope with the significant data heterogeneity in virtual product development, and that empowers domain experts to discover and access data for further analyses. This thesis identifies previously unsolved challenges for a data and metadata management that is tailored to virtual product development and makes three contributions. The first contribution is a metadata model that provides a connected view on all data, metadata, and work activities of virtual product development projects. A prototypical implementation of this metadata model is already being applied to a real-world use case of an industry partner. Based on this foundation, the second contribution uses this metadata model to enable feature engineering with domain experts as part of data analysis projects. 
Going further, data analyses can directly use the metadata structure to provide added value without having to access the large amounts of product data. To this end, the third contribution utilizes the metadata structure itself to enable a novel approach to process discovery for product development projects. Thus, process structures in development projects can be analyzed with little effort, e.g., to identify good or inefficient processes in development projects.
Open Access
A model-based approach for data processing in IoT environments
(2020) Franco da Silva, Ana Cristina; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)
Recent advances in several areas, including sensor technologies, networking, and data processing, have enabled the Internet of Things (IoT) vision to become more and more a reality every day. As a consequence of these advances, the IoT of today allows the development of sophisticated applications for IoT environments, such as smart cities, smart homes, or smart factories. Due to continuous sensor measurements and frequent data exchange among so-called IoT objects, the data generated within an IoT environment take the form of data streams. With this increasing amount of data to be continuously processed, several challenges arise in processing IoT data efficiently. One challenge is how IoT data processing can be realized so that meaningful information can be derived without affecting the reactiveness of IoT applications. Another is how different functional, non-functional, and user-defined requirements of IoT applications can be satisfied by the IoT data processing. In this PhD thesis, a new holistic approach for processing data stream-based applications within IoT environments is presented. Its focus lies on efficient placement of operators of data stream applications onto heterogeneous, distributed, dynamic IoT environments. In contrast to state-of-the-art operator placement, this approach takes into consideration additional requirements introduced by the peculiar characteristics of the Internet of Things. Furthermore, non-functional and user-defined requirements are also taken into consideration. This PhD thesis is supported by different informational models and operator placement techniques, so that the entire life cycle of IoT environments and data stream-based applications can be easily managed. IoT environments and their processing capabilities are described by IoT environment models (IoTEM). Likewise, the business logic of IoT applications and their requirements are defined by data stream processing models (DSPM). 
Based on these informational models, several algorithms determine feasible placements of processing operators onto IoT objects of IoT environments, so that the aforementioned requirements and capabilities are matched. In this approach, one of the main goals is to process IoT data as near to data sources as possible, so that cloud infrastructures are employed only in cases where IoT environments do not offer sufficient processing resources for the IoT application. The execution of data processing on both IoT environments and cloud infrastructures is commonly known as fog computing. Through the approach of this PhD thesis, data processing of IoT applications can be tailored to particular use cases, supporting the specific requirements of the domains, and furthermore, of IoT application users. Once feasible placements are determined, processing operators are then deployed onto corresponding IoT objects using standards, such as TOSCA, and the IoT application is considered up and running. Finally, the IoT environment is continuously monitored in order to recognize and react to disturbances affecting the data processing of deployed IoT applications. The approach of this PhD thesis is supported by the Multi-purpose Binding and Provisioning Platform (MBP), an open-source IoT platform, which has been developed as a proof-of-concept of the contributions of this PhD thesis.