05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6

Search Results

Now showing 1 - 10 of 15

Open Access
The perfect time span : on the present perfect in German, Swedish and English
(2006) Rothstein, Björn Michael; Kamp, Hans (Prof. Dr. h.c. PhD)
This study proposes a discourse-based approach to the present perfect in German, Swedish and English. It is argued that the present perfect is best analysed by applying an ExtendedNow-approach. This approach introduces a perfect time span in which the event time expressed by the present perfect is contained. The present perfects in these languages differ with respect to the boundaries of the perfect time span. In English, the right boundary is identical to the point of speech; in Swedish, it can be either at or after the moment of speech; and in German, it can also be before the moment of speech. The left boundary is unspecified. The right boundary is set by context.
Open Access
Fehlerbehandlung in Mensch-Maschine-Dialogen [Error handling in human-machine dialogues]
(2007) Gieselmann, Petra; Rohrer, Christian (Prof.)
Ever since computers have existed, people have wished to talk to them as they would to another person. One of the most famous examples is surely Eliza, a computer program that simulates a psychologist with whom the user can hold a therapy session. Science fiction films also repeatedly feature such talking machines, for example HAL in "2001: A Space Odyssey" or the computer aboard the starship Enterprise. The first dialogue systems thus date back to the beginnings of artificial intelligence in the 1950s. Nevertheless, until a few years ago these dialogue systems struggled with so many problems that they were hardly suitable for practical use. Only recently have steady improvements in speech recognition and language understanding, together with the advent of ever faster and more powerful computers, made it possible to build such systems for real-world use. A number of problems remain unsolved, however. They stem on the one hand from the complexity of natural language and on the other from the immense store of interconnected world knowledge and contextual relations that humans have at their disposal. One of the greatest challenges so far is to design such a dialogue system for real use under everyday conditions. Systems still lack the error robustness needed to react appropriately in situations where the system has misunderstood something and problems arise. This thesis is about exactly such errors in dialogue: how they can be avoided and, where they could not be avoided beforehand, how they can be repaired during the ongoing dialogue. The subject of this thesis is a data-driven analysis of the errors that occur in human-robot communication, with the aim of avoiding them in advance wherever possible.
An error classification is established, and methods for avoiding the various error classes are developed and evaluated. In addition, generic error-recovery methods are implemented for the cases that could not be avoided beforehand, likewise with the help of data-driven analyses. This is intended to make it possible to deploy dialogue systems beyond the laboratory environment in real situations. The approach is discussed and evaluated using the example of a household robot. The thesis is divided into four parts. The first part covers the state of research in the relevant areas. Various approaches to human-machine dialogue systems are examined, followed by an account of human information processing in dialogue. This includes error dialogues in human-human conversation, which serve here as a model for human-robot dialogues. The second part covers the user tests and data collections that were carried out and the classification of errors in dialogue, which form the basis for the subsequent work on error avoidance and recovery. It begins with a detailed analysis of errors that can occur in human-robot interaction. To this end, several successive user studies and data collections are conducted in which the robot assists a person in the household and performs simple tasks, in order to obtain a large amount of data that is as realistic as possible and did not arise only under laboratory conditions. The third part presents various methods for error avoidance and recovery. For error avoidance, additional knowledge sources are integrated into the dialogue manager.
In addition, mechanisms for anaphora resolution, context modelling, ellipsis resolution, multimodal fusion and the handling of complex, compound utterances are developed and evaluated. For error recovery, various strategies for effective clarification questions are investigated. Metacommunication as it occurs in the user tests is analysed in order to ensure more effective communication. Furthermore, a mechanism is developed that allows the robot to recognise problematic situations and resolve them itself through metacommunication. In the fourth part, the methods developed are evaluated in a final user test. The aim is to test the system with all the error-handling mechanisms developed and to compare it with the baseline system. Particular attention is paid to the transferability of the mechanisms to other domains and systems. This is followed by the conclusion of the thesis as a whole and a discussion of future work on possible extensions of the system.
Open Access
Disambiguation and reambiguation
(2009) Hamm, Fritz; Kamp, Hans; Solstad, Torgrim; Roßdeutscher, Antje (ed.)
The papers in this volume were developed as part of two projects, "The Role of Lexical Information in the Context of Word-formation, Sentence and Discourse" and "Representation of Ambiguities and their Resolution in Context". In the former, a theory of "-ung"-nominalisation in German has been developed. The two papers presented in this volume focus on the second part of the joint enterprise of the two projects, namely on disambiguation of "-ung"-nouns in context. Hamm and Kamp study a prototypical example, "die Absperrung der Botschaft" "the cordoning-off of the embassy", which is three-way ambiguous. This DP can denote a material object (the fence used for cordoning-off), an event (the process of cordoning-off) or a result state (the embassy being cordoned off). Formally, this three-way ambiguity is represented by an underspecified DRS. The paper contributes a partial answer to the general question of which contextual factors are responsible for the (partial) disambiguation of this DP in discourse. The disambiguation process is described on the level of DRT. Building on the results in the first paper, the second paper by Hamm and Solstad focuses on problems that arise in anaphora resolution of pronouns with ambiguous nouns like "die Absperrung der Botschaft" as antecedent. What happens if the selection restriction of the verb in the antecedent sentence and that of the consequent sentence are incompatible? This situation is exemplified in (1): (1) Die Absperrung der Botschaft wurde vorgestern von Demonstranten behindert. Wegen anhaltender Unruhen wird SIE auch heute aufrecht erhalten. "The cordoning-off of the embassy was hampered by protesters the day before yesterday. Due to continuing unrest, it [the state of being cordoned off] is sustained today as well." "Behindern" "to hamper" filters out both the entity reading and the result state reading of "Absperrung", but the verb "aufrecht erhalten" "to sustain" requires the result state as its argument.
Thus, in order for the anaphoric pronoun "sie" to be resolved successfully, the first sentence should provide a result state, which, however, is not available if the result state reading has been erased. Hamm and Solstad show that the required result state can be reconstructed, even under the assumption that "behindern" erases the result state reading of the first sentence in (1). This is achieved in a process of "reambiguation". Reambiguation involves a non-monotonic inference process. The question arises what triggers this process and what its restrictions are. Hamm and Solstad provide formally precise answers to these questions. Again, a combination of UDRT and the event calculus provides the framework in which these puzzles can be solved.
Open Access
Semantic and pragmatic aspects of some particular uses of contrast marking
(2006) Soffner, Martin; Bäuerle, Rainer (apl. Prof. Dr.)
The aim of the thesis is an analysis of the relational meaning of contrast, as realised through the English co-ordinating conjunction "but". Since the analysis is to be based on well-defined contexts, first the essential properties of some selected utterance contexts of "but" have to be described. We then look for interrelations between "but" and those properties. In a stricter sense, this thesis is a case study of the use of "but" in answer situations: the utterance contexts of "but" under consideration are defined by a question. The definition of these dialogical contexts is a central concern. A question semantics that seems adequate for this purpose is one which determines questions by way of their possible answers (Groenendijk & Stokhof 1984). In order for an utterance to count as an answer, it has to meet some constraints that are due to the question. In particular, a question context provides a question domain that the answerer has to take into account. The impact of such a domain on the interpretation of utterances is an important aspect. What role might contrast play in a question context in which the exhaustivity condition must be fulfilled? We then turn to a notion of context extended by a superordinate problem (issue). Issues are a speaker's motive for asking a question. There are some descriptive concepts that model this motive for utterances, e.g. Conversational Topic (van Kuppevelt 1996), Question under Discussion (Roberts 1996), and Decision Problem (van Rooy 2003b). A simple model of issue is introduced that involves another kind of domain which differs from the question domain. This kind of domain can be distinguished and described by means of a semantics of counterfactuals (Kratzer 1981a). The notion of perspective is introduced; it is capable of subsuming both kinds of domains, consisting of individuals or of propositions focused by the speaker.
The use of "but" also interrelates with the issue-related domain: Contrast can be explained by a shift of perspective; the domain currently focused by the speaker changes in a characteristic way. In a felicitous discourse, the answer can be expected to be aligned with the issue behind the question. If "but" indicates a change of the question domain, then this should also reflect a change of the decision of the issue. So contrast can be said to interrelate with the perspective chosen by the speaker.
Open Access
Brain, meaning, and computation
(2007) Klein, Michael; Kamp, Hans (Prof. Dr. h.c. PhD)
This thesis deals with the question of how the human brain acquires, represents, and processes the meaning of natural language expressions. A computational neural theory of meaning is introduced with the goal of overcoming the strong prevalence of empirical results over theoretical understanding that is currently present in the neuroscience of language. In this context, the brain is regarded as a goal-directed system, which acquires language and meaning as one means for achieving its goals. To accomplish complex learning tasks, such as acquiring a language, the brain uses subsystems, which differ especially with respect to their learning strategies, but interact so as to achieve the global goals of the system.
Open Access
Extracting information for biology
(2006) Saric, Jasmin; Reyle, Uwe (Apl. Prof. Dr.)
High-throughput methods like the large-scale sequencing of the human genome dramatically increase our knowledge of genetics and related biological processes. As a consequence, these results accelerate the pace of research and development in the field of biomedicine. The overall goal of these research efforts is to obtain new findings about diseases in order to improve human health. However, these advances are responsible for an increase in complexity and a need for understanding when applying biomedical research and data. Meanwhile, there is strong agreement within life-science-related academic laboratories and industry that addressing the complexity of biological data and knowledge entails intense interdisciplinary efforts. A major requirement for interdisciplinary research within the life sciences is to correlate the data that is derived from text with data from experiments in biomedical laboratories (and with patient records). The main contribution of this work is to describe how natural language processing (NLP) methods and systems can fulfill this requirement by categorising, structuring and exploiting the massive amount of textual data available and by integrating the results with data derived from biomedical experiments. The present work is thematically divided into three parts. The first part is about text mining in the life sciences and is subdivided into two subsections. The first subsection presents an introduction to effective natural language processing techniques for identifying and retrieving information from large text collections. Furthermore, it presents the characteristic features of biomedical terminology, which comprise synonymic, homonymic, orthographical, paragrammatical as well as other types of variance. This illustrates that the crucial difference between everyday language and the language used within biomedical scientific literature is mainly based on the difference in the terminology used.
This subsection concludes with a description of basic criteria that an information extraction system has to meet. The implementation of such an information extraction system is described in the second subsection. This subsection documents a pilot study that was carried out in close collaboration with both the SDBV (Scientific DataBases and Visualisation) group of EML Research gGmbH and Peer Bork's group at the EMBL (European Molecular Biology Laboratory), both located in Heidelberg. The system implemented is used for the extraction of information on gene expression relations from biomedical scientific publications. The second part focuses on the transfer of a computational linguistic tool (TIGERSearch), which was originally developed for the querying of hierarchical structures, to querying knowledge on protein domains from a protein database. It is demonstrated that TIGERSearch offers the possibility to make implicit knowledge about protein domains explicit by transforming the database entries to TIGERSearch-XML. In addition, TIGERSearch makes this implicit knowledge graphically visible. In fact, TIGERSearch was initially developed for the querying and transparent representation of syntactically annotated corpora, so-called treebanks. This part also points out the problem that mapping the wide range of natural language annotations to precisely defined concepts presupposed by the search engine requires an ontological modelling of the domain. The third part addresses the problem of ontological modelling in a more general and more comprehensive way. It consists of two chapters. The first chapter introduces basic notions of ontologies as well as an overview of guidelines to be considered when building an ontology. In addition, some examples of implemented (both general and biomedical) ontologies are presented. The second chapter presents an axiomatisation of a sub-domain of molecular biology (i.e. gene expression) that comprises the domain of proteins and their domains.
The thesis demonstrates a highly interdisciplinary approach to text mining in the life sciences. Methods and knowledge from the fields of natural language processing, bioinformatics and biology have been successfully combined with knowledge from cell biology and applied to the problem of extracting knowledge from unstructured or partially structured data.
Open Access
Semantics of projective locative expressions : an empirical evaluation of geometrical conditions
(2009) Hying, Christian; Kamp, Hans (Prof. Dr. h.c., Ph.D.)
This thesis presents a method for evaluating semantic theories of projective locative expressions such as "X is above Y" and "X to the right of Y". The method is implemented for semantic theories that represent meaning of projective locative expressions in terms of geometrical constraints in two-dimensional space. A set of semantic theories is defined according to proposals from the literature. These theories predict precise geometrical constraints for projective locative expressions. Furthermore, a formalism is proposed which is used to combine these theories in order to generate new semantic theories that are capable of handling vagueness of projective locative expressions. The empirical basis of the evaluation is a set of expressions that subjects of a "map task" experiment (Anderson et al., 1991) have used to describe spatial relations in two-dimensional space. Each expression refers to a specific map of which two-dimensional geometrical representations are derived. The semantic theories are tested with these data by checking whether the geometrical constraints predicted for an expression are satisfied by the corresponding geometrical representation. The evaluations show good results for most theories which have been proposed in the literature. The results are systematically improved by the corresponding theories that handle vagueness.
Open Access
Textvorverarbeitung zur deutschen Version des Festival Text-to-Speech Synthese Systems [Text preprocessing for the German version of the Festival text-to-speech synthesis system]
(1997) Breitenbücher, Mark
As part of a student research project (Studienarbeit), a text preprocessing component was to be developed for the multilingual speech synthesis system FESTIVAL. Among other things, procedures for recognising and expanding various number formats, abbreviations and special characters were implemented. In addition, consideration was given to connecting a morphosyntactic component, which is to be carried out in a further student research project or diploma thesis. Contents: Introduction, What is text-to-speech synthesis?, Why text-to-speech synthesis?, Structure of a TTS system, Text preprocessing, Linguistic processing, Synthesis, The Festival Speech Synthesis System, Using Festival, Utterances in Festival, The modules in Festival, Initialisation, Tokenisation, Token POS, Token-to-Word rules, POS tagging, Phrasing, Lexicon lookup, Intonation 1, Duration, Intonation 2, Synthesis, The tools of Festival, Extension of the CART trees, Extension of the Festival regex tool, Festival-Regex: string-matches, The extension: pattern-matches, German text preprocessing in Festival, Splitting of compounds, Expansion of numbers, Fractions, Ratios, Telephone numbers, Compounds, Years, Dates, Times of day, Amounts of money, Decimal fractions, Decimal numbers
Open Access
Neural correlates of fricative contrasts across language boundaries
(2006) Lipski, Silvia C.; Dogil, Grzegorz (Prof. Dr.)
The phonological system, the way that speech sounds contrast and combine in a language to create lexical contrast, determines, to a large extent, how they are perceived. The aim of this study was to determine whether native language experience affected the processing of voiceless fricatives in the auditory cortex. Three fricative contrasts were tested: one that is used phonemically in both Polish and German, one that is phonemic only in Polish, and an allophonic German fricative contrast, which was used to test whether the contrastive function of native, phonetically distinct fricatives affects early auditory processing. Speech perception is a very fast process and is, for the most part, not consciously accessible. Investigations of the underlying neural mechanisms of speech perception therefore require methods that can record auditory responses to speech with high temporal precision, such as electroencephalography (EEG) and magnetoencephalography (MEG). The mismatch negativity (MMN) component of the auditory event-related potential (ERP) and its magnetic counterpart (MMNm) signal auditory discrimination and auditory sensory memory. Recent studies provided evidence that the MMN reflects memory of native language speech sound categories. A reduction of the MMN amplitude could be expected for the German listeners' responses to the Polish contrast as compared to the responses of the native listeners. If the phonological function of a speech sound is incorporated in its memory representation, an amplitude difference between the responses to a phoneme contrast and an allophone contrast was expected. If a phoneme representation, which can be understood as an abstract label, arbitrary to the phonetic qualities of the sounds, is accessed during early auditory processing, phonetically distinct sounds would be mapped to this unique representation, not to two distinct phonological representations.
As a consequence, the allophone contrast may evoke a lower amplitude of the MMNm than phoneme sounds.
Open Access
A graph model for words and their meanings
(2006) Dorow, Beate; Heid, Ulrich (PD, Dr. phil)
In this thesis we take a graph-theoretic approach to the automatic acquisition of word meanings. We represent the nouns in a text in the form of a semantic graph consisting of words (the nodes) and relationships between them (the links). Links in the graph are based on the co-occurrence of nouns in lists. We find that valuable information about the meaning of words and their interactions can be extracted from the resulting semantic structure.
