05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6

Search Results

Now showing 1 - 10 of 55
  • Item (Open Access)
    Strukturierte Modellierung von Affekt in Text
    (2020) Klinger, Roman; Padó, Sebastian (Prof. Dr.)
    Emotions, moods, and opinions are affective states that cannot be directly observed by one person in another and can therefore be regarded as “private”. To nevertheless infer these individual feelings and views, we are accustomed in everyday communication to interpreting facial expressions, body posture, prosody, and the content of speech. The research field of affective computing, and the more specialized fields of emotion analysis and sentiment analysis, develop computational models that make such estimates automatically. This habilitation thesis falls within the area of affective computing and contributes to the study and modeling of sentiment and emotion in textual descriptions, covering, among other domains, literature, social media, and product reviews. To find appropriate models for the respective phenomena, we proceed in each case by using or creating a corpus as a basis, thereby already committing to hypotheses about the formulation of the model. These hypotheses can then be examined in several ways: first, through an analysis of inter-annotator agreement; second, through an adjudication of the annotations followed by computational modeling; and third, through a qualitative analysis of problematic cases. We first treat sentiment and emotion as classification problems. For some research questions, however, this is not sufficient, so we propose structured models that additionally extract aspects and causes of the respective feeling or opinion. In the case of emotion, we also extract mentions of the experiencer. In a further step, the methods are extended so that they can be applied to languages that do not have sufficient annotated resources. The contributions of this habilitation thesis are thus various resources, whose creation also required underlying conceptual work. We contribute German and English corpora for aspect-based sentiment analysis, emotion classification, and structured emotion analysis. Furthermore, we propose models for the automatic recognition and representation of sentiment, emotion, and related concepts. These either show better results than previous methods or model phenomena for the first time; the latter holds in particular for methods made possible by corpora we created ourselves. Across the various approaches, concepts are recurrently modeled jointly, be it at the representation level or at the inference level. Methods that make decisions in context consistently achieve better results in our work than those that consider the phenomena separately. This holds both for the use of artificial neural networks and for the use of probabilistic graphical models.
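To make the contrast between plain classification and the structured models described above concrete, here is a minimal Python sketch of the two output formats; the field names and label set are illustrative assumptions, not the thesis's actual annotation schema.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative label set; Ekman's basic emotions are one common choice.
EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]

@dataclass
class EmotionClassification:
    """Plain classification: one emotion label per text."""
    text: str
    emotion: str  # one label from EMOTIONS

@dataclass
class StructuredEmotionAnalysis:
    """Structured output: besides the emotion label, spans for the
    experiencer (who feels it) and the cause (why) are extracted."""
    text: str
    emotion: str
    experiencer: Optional[str] = None  # e.g. a mention of the feeler
    cause: Optional[str] = None        # e.g. the triggering event

example = StructuredEmotionAnalysis(
    text="I was furious because the zipper broke after one day.",
    emotion="anger",
    experiencer="I",
    cause="the zipper broke after one day",
)
print(example)
```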
  • Item (Open Access)
    Task-oriented specialization techniques for entity retrieval
    (2020) Glaser, Andrea; Kuhn, Jonas (Prof. Dr.)
    Finding information on the internet has become very important, and online encyclopedias or websites specialized in certain topics offer users a great amount of information. Search engines support users when trying to find information. However, the vast amount of information makes it difficult to separate relevant from irrelevant facts for a specific information need. In this thesis we explore two areas of natural language processing in the context of retrieving information about entities: named entity disambiguation and sentiment analysis. The goal of this thesis is to use methods from these areas to develop task-oriented specialization techniques for entity retrieval. Named entity disambiguation is concerned with linking referring expressions (e.g., proper names) in text to their corresponding real-world or fictional entity. Identifying the correct entity is an important factor in finding information on the internet, as many proper names are ambiguous and need to be disambiguated to find relevant information. To that end, we introduce the notion of r-context, a new type of structurally informed context. This r-context consists only of sentences that are relevant to the entity, in order to capture all important context clues while avoiding noise. We then show the usefulness of this r-context by performing a systematic study on a pseudo-ambiguity dataset. Identifying lesser-known named entities is a challenge in named entity disambiguation because usually there is not much data available from which a machine learning algorithm can learn. We propose an approach that uses an aggregate of textual data about other entities which share certain properties with the target entity, and learns information from it by topic modelling; this information is then used to disambiguate the lesser-known target entity. We use a dataset that is created automatically by exploiting the link structure of Wikipedia, and show that our approach is helpful for disambiguating entities without training material and with little surrounding context. Retrieving the relevant entities and information can produce many search results. Thus, it is important to present the information to a user effectively. We regard this step as going beyond entity retrieval and employ sentiment analysis, which is used to analyze opinions expressed in text, in the context of effectively displaying information about product reviews to a user. We present a system that extracts a supporting sentence: a single sentence that captures both the sentiment of the author and a supporting fact. This supporting sentence can be used to provide users with an easy way to assess information in order to make informed choices quickly. We evaluate our approach using the crowdsourcing service Amazon Mechanical Turk.
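A minimal sketch of the r-context idea described above: keep only the sentences relevant to the target entity. The string-matching relevance test and the alias list are simplifying assumptions; the thesis derives relevance from document structure rather than surface matching.

```python
import re

def build_r_context(document: str, entity_aliases: list[str]) -> list[str]:
    """Keep only sentences deemed relevant to the target entity.

    This toy relevance test is simple string matching; the actual
    r-context uses structurally informed criteria instead.
    """
    # Naive sentence split on end punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", document)
    pattern = re.compile("|".join(re.escape(a) for a in entity_aliases), re.I)
    return [s for s in sentences if pattern.search(s)]

doc = ("Paris is the capital of France. The city hosted the 1900 Olympics. "
       "Paris Hilton is an American media personality. She was born in 1981.")
print(build_r_context(doc, ["Paris Hilton", "She"]))
```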
  • Item (Open Access)
    Analysis of political positioning from politicians’ tweets
    (2023) Maurer, Maximilian Martin
    Social media platforms such as Twitter have become important communication channels for politicians to interact with the electorate and communicate their stances on policy issues. In contrast to party manifestos, which lay out curated compromise positions, the full range of positions within the ideological bounds of a party can be found on social media. This raises the question of how well the ideological positions of parties on social media align with their respective manifestos. To assess this alignment, we correlate positions automatically retrieved from tweets with manifesto-based positions for the German federal elections of 2017 and 2021. Additionally, we assess whether the change in positions over time is aligned between social media and manifestos. We retrieve ideological positions by aggregating distances between parties from sentence representations of their members' tweets, drawn from a corpus containing >2M individual tweets by 421 German politicians. We leverage domain-specific information by training a sentence embedding model such that representations of tweets with co-occurring hashtags are closer to each other than those without, following the assumption that hashtags approximate policy-related topics. Our experiments compare this political social media domain-specific model with other political-domain and general-domain sentence embedding models. We find high, significant correlations between the Twitter-retrieved positions and manifesto positions, especially for our domain-specific fine-tuned model. Moreover, for this model, we find overlaps in how the positions change over time. These results indicate that the ideological positions of parties on Twitter correspond to a large extent to the ideological positions laid out in the manifestos.
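The position-retrieval step described above can be illustrated with a short sketch: aggregate tweet embeddings per party, compute pairwise distances, and correlate them with manifesto-based distances. The embeddings and manifesto values below are random placeholders, and aggregation by simple averaging into one vector per party is an assumption for illustration.

```python
import numpy as np
from scipy.stats import pearsonr

# Placeholder per-party mean tweet embeddings; in the thesis these come
# from sentence representations of >2M politician tweets.
rng = np.random.default_rng(0)
parties = ["CDU/CSU", "SPD", "Gruene", "FDP", "Linke", "AfD"]
party_embeddings = {p: rng.normal(size=384) for p in parties}

def pairwise_distances(embs: dict) -> dict:
    """Cosine distance between every pair of party embeddings."""
    out = {}
    names = list(embs)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            va, vb = embs[a], embs[b]
            cos = va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb))
            out[(a, b)] = 1.0 - cos
    return out

twitter_dists = pairwise_distances(party_embeddings)
# Manifesto-based distances for the same pairs (placeholder values here).
manifesto_dists = {pair: rng.uniform(0, 2) for pair in twitter_dists}

pairs = list(twitter_dists)
r, p = pearsonr([twitter_dists[k] for k in pairs],
                [manifesto_dists[k] for k in pairs])
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```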
  • Item (Open Access)
    Evaluating methods of improving the distribution of data across users in a corpus of tweets
    (2023) Milovanovic, Milan
    Corpora created from social network data often serve as the data source for tasks in natural language processing. Compared to other, more standardized corpora, social media corpora have idiosyncratic properties because they consist of user-generated content: for example, an unbalanced distribution of comments across users, generally lower linguistic quality, and an inherently unstructured and noisy nature. Using a Twitter-generated corpus, I investigate to what extent the unbalanced distribution of the data influences two downstream tasks that rely on word embeddings. Word embeddings are a ubiquitous and frequently used concept in the field of natural language processing: they are a common means of obtaining semantic information about words and their usage by representing the words in an abstract word vector space. The basic idea is that semantically similar words have similar vectors in the mapped vector space. These vectors then serve as input for standard downstream tasks such as word similarity and semantic change detection. One of the most common models in current research is word2vec, and more specifically its Skip-gram architecture, which attempts to predict the surrounding words based on the current word. The data on which this architecture is trained greatly influences the resulting word vectors. In this work, however, no significant improvement over a fully preprocessed corpus could be found when filtering methods that are widely used in the literature, often without specific motivation, are applied to select a subset of the data according to defined criteria, for either word similarity or semantic change detection. However, comparable results could be achieved with some filters, even though the resulting models were trained on significantly fewer tokens as input.
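For reference, a minimal sketch of training the Skip-gram architecture discussed above with gensim; the toy tweets and all hyperparameters are illustrative, not the thesis's actual setup.

```python
from gensim.models import Word2Vec

# Toy tokenised tweets; in the thesis the corpus is Twitter data,
# optionally reduced by user-level filtering before training.
tweets = [
    ["climate", "change", "is", "real"],
    ["new", "phone", "battery", "lasts", "forever"],
    ["climate", "policy", "debate", "tonight"],
]

# sg=1 selects the Skip-gram architecture: the model predicts the
# surrounding words from the current word.
model = Word2Vec(sentences=tweets, vector_size=100, window=5,
                 min_count=1, sg=1, epochs=10, seed=42)

print(model.wv.most_similar("climate", topn=2))
```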
  • Item (Open Access)
    Modeling the evaluative nature of German personal name compounds
    (2023) Deeg, Tana
    German personal name compounds such as Villen-Spahn ('villa-Spahn'), Gold-Rosi ('gold-Rosi'), and Folter-Bush ('torture-Bush') are a rather infrequent phenomenon in the German language. They have the structure of determinative compounds and serve as nicknames for usually well-known persons. According to Belosevic (2022), personal name compounds are mostly evaluative, i.e. they evaluate the person behind the name in a positive or negative way, and further research on evaluation across different groups of compounds (politics, show business, sports) has been proposed. This work investigates the evaluative nature of 413 German personal name compounds that mostly have the structure of a noun as modifier and a last name as head. The 131 corresponding full names are considered as well; e.g., Jens Spahn corresponds to Villen-Spahn. The context data for compounds and names was collected from Twitter and the Leipzig Corpora Collection. The valence values of these context words, based on the valence database of Köper and Schulte im Walde (2016), are used to investigate the evaluative nature of compounds in comparison to their names. Furthermore, the relation to and function of the modifier is examined. The valence values are then used to verify whether there are noticeable differences between the groups of compounds. Afterwards, a linear regression is implemented to predict a 'delta' value: the difference between name valence and compound valence. Several predictor variables such as name valence, compound valence, modifier valence, age, gender, political party, and nationality are used. The results reveal that compounds are both positively and negatively evaluative in comparison to their full names while highlighting the reason why they were created. Compound valence and modifier valence are only partially correlated, as modifiers may be involved rather accidentally or be interpreted ironically. Lastly, noticeable differences between the groups can be observed, with politicians being the most negatively evaluated group in terms of valence values. Conducting the linear regression with different combinations of predictor variables shows that compound valence is a highly significant predictor; other variables such as modifier valence, age, or political party also yield models that predict the delta value very well.
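A brief sketch of the regression step described above, using scikit-learn; the predictor coding and the simulated data are assumptions for illustration only and do not reproduce the thesis's variables.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical feature matrix: one row per personal name compound.
# Columns: name valence, modifier valence, age, gender (0/1).
rng = np.random.default_rng(1)
X = np.column_stack([
    rng.uniform(3, 7, 50),     # name valence
    rng.uniform(1, 9, 50),     # modifier valence
    rng.integers(30, 75, 50),  # age
    rng.integers(0, 2, 50),    # gender
])
# Target: delta = name valence - compound valence (simulated here).
y = X[:, 0] - (0.5 * X[:, 1] + rng.normal(0, 0.5, 50))

reg = LinearRegression().fit(X, y)
print("R^2 on training data:", round(reg.score(X, y), 3))
print("Coefficients:", reg.coef_.round(3))
```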
  • Item (Open Access)
    Cross-lingual citations in English papers: a large-scale analysis of prevalence, usage, and impact
    (2021) Saier, Tarek; Färber, Michael; Tsereteli, Tornike
    Citation information in scholarly data is an important source of insight into the reception of publications and the scholarly discourse. Outcomes of citation analyses and the applicability of citation-based machine learning approaches heavily depend on the completeness of such data. One particular shortcoming of scholarly data nowadays is that non-English publications are often not included in data sets, or that language metadata is not available. Because of this, citations between publications of differing languages (cross-lingual citations) have only been studied to a very limited degree. In this paper, we present an analysis of cross-lingual citations based on over one million English papers, spanning three scientific disciplines and a time span of three decades. Our investigation covers differences between cited languages and disciplines, trends over time, and the usage characteristics as well as impact of cross-lingual citations. Among our findings are an increasing rate of citations to publications written in Chinese, citations being primarily to local non-English languages, and consistency in citation intent between cross- and monolingual citations. To facilitate further research, we make our collected data and source code publicly available.
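As an illustration of the kind of analysis described, here is a small sketch computing a per-year cross-lingual citation rate from citation records with language metadata; the records below are made up, and the paper's actual pipeline is considerably more involved.

```python
from collections import Counter

# Hypothetical records: (year of citing paper, language of cited paper).
# In the paper, language metadata is derived for the references of over
# one million English papers.
citations = [
    (1995, "en"), (1995, "de"), (2005, "en"), (2005, "zh"),
    (2015, "zh"), (2015, "en"), (2015, "ru"), (2020, "zh"),
]

per_year = Counter(year for year, _ in citations)
cross_per_year = Counter(year for year, lang in citations if lang != "en")

for year in sorted(per_year):
    rate = cross_per_year[year] / per_year[year]
    print(f"{year}: {rate:.0%} cross-lingual")
```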
  • Item (Open Access)
    Distributional measures of semantic abstraction
    (2022) Schulte im Walde, Sabine; Frassinelli, Diego
    This article provides an in-depth study of distributional measures for distinguishing between degrees of semantic abstraction. Abstraction is considered a “central construct in cognitive science” (Barsalou, 2003) and a “process of information reduction that allows for efficient storage and retrieval of central knowledge” (Burgoon et al., 2013). Relying on the distributional hypothesis, computational studies have successfully exploited measures of contextual co-occurrence and neighbourhood density to distinguish between conceptual semantic categorisations. So far, these studies have modeled semantic abstraction across lexical-semantic tasks such as ambiguity, diachronic meaning change, abstractness vs. concreteness, and hypernymy. Yet the distributional approaches target different conceptual types of semantic relatedness, and, to our knowledge, not much attention has been paid to applying, comparing, or analysing the computational abstraction measures across conceptual tasks. The current article suggests a novel perspective that exploits variants of distributional measures to investigate semantic abstraction in English in terms of the abstract-concrete dichotomy (e.g., glory-banana) and in terms of the generality-specificity distinction (e.g., animal-fish), in order to compare the strengths and weaknesses of the measures regarding categorisations of abstraction, and to determine and investigate conceptual differences. In a series of experiments we identify reliable distributional measures for both instantiations of lexical-semantic abstraction and reach a precision higher than 0.7, but the measures clearly differ between the abstract-concrete and the generality-specificity distinctions, and between nouns and verbs. Overall, we identify two groups of measures: (i) frequency and word entropy, when distinguishing between more and less abstract words in terms of the generality-specificity distinction, and (ii) neighbourhood density variants (especially target-context diversity), when distinguishing between more and less abstract words in terms of the abstract-concrete dichotomy. We conclude that more general words are used more often and are less surprising than more specific words, and that abstract words establish themselves empirically in semantically more diverse contexts than concrete words. Finally, our experiments once more point out that distributional models of conceptual categorisations need to take word classes and ambiguity into account: results for nouns vs. verbs differ in many respects, and ambiguity hinders fine-tuning empirical observations.
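Two of the measures highlighted above, word entropy and a neighbourhood-density-style diversity score, can be sketched from raw co-occurrence counts as follows. The window size, the diversity proxy (distinct contexts per co-occurrence token), and the toy corpus are assumptions, not the article's exact definitions.

```python
import math
from collections import Counter

def context_counts(corpus, target, window=2):
    """Count co-occurring words within a symmetric window around the target."""
    counts = Counter()
    for tokens in corpus:
        for i, tok in enumerate(tokens):
            if tok == target:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                counts.update(tokens[lo:i] + tokens[i + 1:hi])
    return counts

def word_entropy(counts):
    """H(w) = -sum_c p(c|w) * log2 p(c|w) over the context distribution."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def context_diversity(counts):
    """Simplified diversity proxy: distinct contexts per co-occurrence token."""
    return len(counts) / sum(counts.values())

corpus = [
    "the animal ran across the field".split(),
    "a small animal slept in the barn".split(),
    "the fish swam in the pond".split(),
    "a fish was caught in the net".split(),
]
for w in ["animal", "fish"]:  # general vs. specific target words
    c = context_counts(corpus, w)
    print(w, round(word_entropy(c), 2), round(context_diversity(c), 2))
```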
  • Item (Open Access)
    Plug-and-play domain adaptation for neural machine translation
    (2023) Kadiķis, Emīls
    Neural machine translation has emerged as a powerful tool, yet its performance heavily relies on training data. In a fast-changing world, dealing with out-of-domain data remains a challenge, prompting the need for adaptable translation systems. While fine-tuning is a proven and effective adaptation method, it is not always feasible due to data availability, memory, and computational constraints. This thesis introduces a dynamic plug-and-play method, inspired by controllable text generation, to enhance machine translation across various domains without fine-tuning. This method, called Plug-and-Play Neural Machine Translation (PPNMT), uses a monolingual domain-specific bag-of-words to push the hidden state of the decoder through backpropagation, making the output more in-domain. The method is tested on two types of domains: on the one hand formality and gender (where the source language does not mark these distinctions but the target language does), and on the other hand fine-grained technical domains (which are based on the topic inherent in the text on both the source and target sides). The method performs reasonably well for adapting the translation to different formality levels and, to a lesser extent, grammatical genders, even with a very simple bag-of-words. However, it struggles with adapting the model to technical domains, and a fine-tuning baseline outperforms the proposed method in all but very low few-shot settings across all tested domains. Despite that, the method shows some interesting behaviour, adapting to formality at a level that goes beyond merely using formal pronouns.
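The core plug-and-play step, pushing a decoder hidden state via backpropagation so that bag-of-words tokens gain probability, can be sketched in a few lines of PyTorch. Everything here (the random projection standing in for a trained decoder, the BoW indices, the step size) is a toy assumption, not the thesis's implementation.

```python
import torch

torch.manual_seed(0)
vocab_size, hidden_dim = 100, 16

# Stand-ins for a trained decoder's output projection and one hidden state.
W_out = torch.randn(vocab_size, hidden_dim)
h = torch.randn(hidden_dim, requires_grad=True)

# Indices of a (hypothetical) domain-specific bag-of-words, e.g. formal terms.
bow_ids = torch.tensor([3, 17, 42])

# Push the hidden state so the next-token distribution favours BoW tokens:
# the loss is the negative log of the probability mass on the bag-of-words.
for _ in range(5):
    probs = torch.softmax(W_out @ h, dim=-1)
    loss = -torch.log(probs[bow_ids].sum())
    loss.backward()
    with torch.no_grad():
        h -= 0.1 * h.grad  # small gradient step on the hidden state itself
        h.grad.zero_()

mass = torch.softmax(W_out @ h, dim=-1)[bow_ids].sum().item()
print("BoW probability mass after perturbation:", round(mass, 3))
```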
  • Item (Open Access)
    Automatic classification of abstractness in English rigid nouns
    (2023) Saponaro, Alberto
    The main difference between (i) mass-count languages (such as English) and (ii) classifier languages (such as Chinese) is that the former encode information about nouns' countability in their grammar, while the latter employ a system of classifiers to distinguish between individuals and substances. While the mass-count distinction is a characteristic of mass-count languages, the substance-individual denotation seems to be a concept universally available to all humans. Another concept that appears to be universally accessible and linked to the countability status of English nouns is the notion of abstractness: mass nouns usually refer to abstract objects, which is confirmed by the distribution of abstractness in the dataset. The objective of this thesis is to provide a model for the classification of rigid nouns (count-only or mass-only) that is capable of generalizing across degrees of abstractness. Additionally, it tests whether a model trained with the same set of features is capable of rating the abstractness of those nouns. To accomplish these tasks, several sets of features are identified based on syntactic and semantic properties of nouns that describe the mass-count distinction. The results indicate that the first model, M1, a mass-count classifier that predicts the countability class of a rigid noun, provides reliable predictions and can generalize across the degree of abstractness of the targets. The second model, M2, an abstractness rating predictor that assigns an abstractness rating from 1 to 5 to a rigid noun, is incapable of providing reliable ratings and cannot generalize across the countability status of the targets. A third model, M3, an abstract-concrete (binary) classifier that predicts the abstractness class of a rigid noun, provides reliable predictions and can generalize across the countability status of the targets. Given that these results concern rigid nouns only, further research could examine the abstractness of elastic nouns. However, this requires an annotation resource that rates the abstractness of noun senses.
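A minimal sketch of a feature-based mass-count classifier in the spirit of model M1; the feature names and values are invented for illustration and do not reproduce the thesis's feature sets.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical syntactic/semantic features per rigid noun: whether it
# pluralises, whether it combines with "much"/"many", and an abstractness
# rating (e.g. on a 1-5 scale).
nouns = [
    ({"takes_plural": 1, "with_much": 0, "with_many": 1, "abstractness": 2.1}, "count"),
    ({"takes_plural": 1, "with_much": 0, "with_many": 1, "abstractness": 1.5}, "count"),
    ({"takes_plural": 0, "with_much": 1, "with_many": 0, "abstractness": 4.3}, "mass"),
    ({"takes_plural": 0, "with_much": 1, "with_many": 0, "abstractness": 3.8}, "mass"),
]
X = [features for features, _ in nouns]
y = [label for _, label in nouns]

clf = make_pipeline(DictVectorizer(sparse=False), LogisticRegression())
clf.fit(X, y)

new_noun = {"takes_plural": 0, "with_much": 1, "with_many": 0, "abstractness": 4.0}
print(clf.predict([new_noun])[0])
```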
  • Item (Open Access)
    Gender bias in dependency parsing
    (2023) Go, Paul Stanley
    Recent high-profile advances in natural language processing (NLP) have spurred interest in identifying and rectifying socially harmful problems common in NLP systems, such as gender bias. Unfortunately, many works that attempt to tackle the issue of gender bias suffer from methodological deficiencies such as the assumption of a binary and immutable concept of gender. We scrutinize one such work, which found gender bias in dependency parsing, and evaluate whether its claims have merit. Our results were inconsistent with the gender bias findings of that paper, and further investigation through error analysis and treebank analysis revealed methodological flaws that artificially introduced differences between the female and male data sets. Mistakes made during preprocessing compromised the outcome; therefore, the results do not prove the existence of gender bias in dependency parsing. Based on our findings, we suggest a different methodology for identifying and alleviating syntactic bias that is more inclusive for everyone, no matter their gender.