Browsing by Author "Bräuninger, Maximilian"

Open Access
Characteristics of Neighbourhood Vector Spaces for Abstract and Concrete Words
(2019) Bräuninger, Maximilian
The concreteness and abstractness of words are important concepts in psycholinguistics and computational linguistics. Concrete words can be directly experienced using at least one of the five human senses. Abstract words cannot be experienced directly but have to be described using other words. According to the Context Availability Theory, evoking the meaning of a word requires creating an appropriate context. The theory suggests differences between the concepts of concrete and abstract words. Computational studies have found evidence supporting the Context Availability Theory, as concrete words seem to appear in a small, specific set of contexts, whereas abstract words appear in broader, more general contexts. Further pursuing this hypothesis, several different neighbourhood similarity measures are applied to five different vector space models representing the neighbourhood of concrete and abstract target words. This work presents an in-depth discussion and analysis of the characteristics of both concrete and abstract target words regarding their context dimensions as well as other neighbours. Furthermore, two different dimensionality reduction techniques are applied to the original vector space in order to produce low-dimensional representations of the concrete and abstract neighbourhoods.
Open Access
Improving SMT-based synonym extraction across word classes by distributional reranking of synonyms and hypernyms
(2017) Bräuninger, Maximilian
Automatic synonym extraction is a promising field of research. For example, it can be useful in the creation of thesauri, as well as in the creation and evaluation of machine translation systems. This thesis extracts synonym candidates using "statistical machine translation" (SMT) methods combined with multilingual parallel corpora. This is done by creating "word alignments" within the parallel corpus. Using these alignments, in a first step the German target words, consisting of nouns, verbs and adjectives, are translated into English pivots. Using the same techniques, these pivots are then re-translated into German words. These translations are regarded as synonym candidates and are ranked according to their "synonym probability". In a second step, two different distributional semantics measures are introduced in order to re-rank the synonym candidates. The first measure tries to identify the semantic relation between the words, especially hypernymy, and ranks hypernyms lower in the candidate list. The second measure relies on the semantic similarity of the words, ranking semantically equivalent words higher in the list. In a last step, the results are compared with regard to word class as well as re-ranking strategy using a gold standard.
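The first step described in this abstract (German target word to English pivots and back to German) can be illustrated with a minimal sketch. It is not the thesis's implementation: the translation tables, their probabilities, and the function name `synonym_candidates` are hypothetical, and the score used here (the sum over pivots of the product of the forward and backward translation probabilities) is only one common way to define a pivot-based score. The abstract does not state how its "synonym probability" is computed.

```python
from collections import defaultdict

# Hypothetical toy translation probabilities. A real system would estimate
# these from word alignments over a multilingual parallel corpus.
DE_TO_EN = {
    "Haus": {"house": 0.7, "building": 0.3},
}
EN_TO_DE = {
    "house": {"Haus": 0.6, "Heim": 0.4},
    "building": {"Gebäude": 0.8, "Haus": 0.2},
}


def synonym_candidates(target, de_to_en, en_to_de):
    """Rank German synonym candidates for `target` via English pivots.

    Assumed score: sum over pivots of p(pivot | target) * p(candidate | pivot).
    """
    scores = defaultdict(float)
    for pivot, p_pivot in de_to_en.get(target, {}).items():
        for candidate, p_back in en_to_de.get(pivot, {}).items():
            if candidate != target:  # the target is not its own synonym
                scores[candidate] += p_pivot * p_back
    # Highest-scoring candidates first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = synonym_candidates("Haus", DE_TO_EN, EN_TO_DE)
for word, score in ranked:
    print(f"{word}\t{score:.2f}")
```

In this toy data "Heim" outranks "Gebäude". The second step of the thesis, distributional re-ranking, would then reorder such a list, for instance by moving hypernyms further down.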