Authors: Schulte im Walde, Sabine
Schmid, Helmut
Wagner, Wiebke
Hying, Christian
Scheible, Christian
Title: A clustering approach to automatic verb classification incorporating selectional preferences: model, implementation, and user manual
Issue Date: 2010
Type: Working paper
Series/Report no.: SinSpeC - Working Papers of the SFB 732 "Incremental Specification in Context";7
Abstract: This report presents two variations of an innovative, complex approach to semantic verb classes that relies on selectional preferences as verb properties. The underlying linguistic assumption for this verb class model is that verbs which agree on their selectional preferences belong to a common semantic class. The model is implemented as a soft-clustering approach, in order to capture the polysemy of the verbs. The training procedure uses the Expectation-Maximisation (EM) algorithm (Baum, 1972) to iteratively improve the probabilistic parameters of the model, and applies the Minimum Description Length (MDL) principle (Rissanen, 1978) to induce WordNet-based selectional preferences for arguments within subcategorisation frames. One variation of the MDL principle replicates a standard MDL approach by Li and Abe (1998), the other variation presents an improved pruning strategy that outperforms the standard implementation considerably. Our model is potentially useful for lexical induction (e.g., verb senses, subcategorisation and selectional preferences, collocations, and verb alternations), and for NLP applications in sparse data situations. We demonstrate the usefulness of the model by a standard evaluation (pseudo-word disambiguation), and three applications (selectional preference induction, verb sense disambiguation, and semi-supervised sense labelling).
Appears in Collections: 12 Sonderforschungs- und Transferbereiche

Files in This Item:
File: sfb732_d4_cluster_report.pdf
Size: 4.28 MB
Format: Adobe PDF

Items in OPUS are protected by copyright, with all rights reserved, unless otherwise indicated.