05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Permanent URI for this collectionhttps://elib.uni-stuttgart.de/handle/11682/6

Browse

Search Results

Now showing 1 - 4 of 4
  • Thumbnail Image
    ItemOpen Access
    Between welcome culture and border fence : a dataset on the European refugee crisis in German newspaper reports
    (2023) Blokker, Nico; Blessing, André; Dayanik, Erenay; Kuhn, Jonas; Padó, Sebastian; Lapesa, Gabriella
    Newspaper reports provide a rich source of information on the unfolding of public debates, which can serve as basis for inquiry in political science. Such debates are often triggered by critical events, which attract public attention and incite the reactions of political actors: crisis sparks the debate. However, due to the challenges of reliable annotation and modeling, few large-scale datasets with high-quality annotation are available. This paper introduces DebateNet2.0 , which traces the political discourse on the 2015 European refugee crisis in the German quality newspaper taz . The core units of our annotation are political claims (requests for specific actions to be taken) and the actors who advance them (politicians, parties, etc.). Our contribution is twofold. First, we document and release DebateNet2.0 along with its companion R package, mardyR . Second, we outline and apply a Discourse Network Analysis (DNA) to DebateNet2.0 , comparing two crucial moments of the policy debate on the “refugee crisis”: the migration flux through the Mediterranean in April/May and the one along the Balkan route in September/October. We guide the reader through the methods involved in constructing a discourse network from a newspaper, demonstrating that there is not one single discourse network for the German migration debate, but multiple ones, depending on the research question through the associated choices regarding political actors, policy fields and time spans.
  • Thumbnail Image
    ItemOpen Access
    Determinants of grader agreement : an analysis of multiple short answer corpora
    (2021) Padó, Ulrike; Padó, Sebastian
    The ’short answer’ question format is a widely used tool in educational assessment, in which students write one to three sentences in response to an open question. The answers are subsequently rated by expert graders. The agreement between these graders is crucial for reliable analysis, both in terms of educational strategies and in terms of developing automatic models for short answer grading (SAG), an active research topic in NLP. This makes it important to understand the properties that influence grader agreement (such as question difficulty, answer length, and answer correctness). However, the twin challenges towards such an understanding are the wide range of SAG corpora in use (which differ along a number of dimensions) and the hierarchical structure of potentially relevant properties (which can be located at the corpus, answer, or question levels). This article uses generalized mixed effects models to analyze the effect of various such properties on grader agreement in six major SAG corpora for two main assessment tasks (language and content assessment). Overall, we find broad agreement among corpora, with a number of properties behaving similarly across corpora (e.g., shorter answers and correct answers are easier to grade). Some properties show more corpus-specific behavior (e.g., the question difficulty level), and some corpora are more in line with general tendencies than others. In sum, we obtain a nuanced picture of how the major short answer grading corpora are similar and dissimilar from which we derive suggestions for corpus development and analysis.
  • Thumbnail Image
    ItemOpen Access
    Analysis of political debates through newspaper reports : methods and outcomes
    (2020) Lapesa, Gabriella; Blessing, Andre; Blokker, Nico; Dayanik, Erenay; Haunss, Sebastian; Kuhn, Jonas; Padó, Sebastian
    Discourse network analysis is an aspiring development in political science which analyzes political debates in terms of bipartite actor/claim networks. It aims at understanding the structure and temporal dynamics of major political debates as instances of politicized democratic decision making. We discuss how such networks can be constructed on the basis of large collections of unstructured text, namely newspaper reports. We sketch a hybrid methodology of manual analysis by domain experts complemented by machine learning and exemplify it on the case study of the German public debate on immigration in the year 2015. The first half of our article sketches the conceptual building blocks of discourse network analysis and demonstrates its application. The second half discusses the potential of the application of NLP methods to support the creation of discourse network datasets.