Lexical acquisition

Spring 2007 | Instructor: Katrin Erk | Tuesday and Thursday, 2pm-3:30pm | PAR 201

What are the selectional preferences (typical arguments) of a word? What idioms does it participate in? What other words are synonymous, or similar in meaning? Information like this is important in parsing, in semantics construction, and in drawing inferences from semantic representation. But it is very labor-intensive to specify a wide-coverage lexical knowledge base by hand.

A new approach that has emerged over the past few years is to derive such resources from corpora: Semantic information ranging from synonymy or hyponymy to rather complex verb relations can be learned with a surprising degree of success even from unannotated corpora.

These techniques also allow us to test linguistic theories: can semantic similarity be characterized solely in terms of syntactic properties, for example verb alternations? How flexible is the interpretation of metonymic expressions really?

The methods used to learn lexical information from corpora typically yield not categorical information, but graded judgments: two words may be judged more or less semantically similar, or an interpretation for a metonymic expression more or less appropriate. This raises the question of whether phenomena such as word sense similarity or metonymic interpretation are best described categorically or in a graded fashion. What models are most appropriate for describing them in principle?

In this graduate seminar, we will discuss the most important recent approaches to learning semantic knowledge from corpora, study statistical methods used to test lexical semantic theories, and discuss the categorical or graded nature of lexical semantic phenomena.

A solid understanding of statistical methods will be helpful, but is not strictly necessary for students taking this class. We will discuss both technical and theoretical papers, and students will have the opportunity to focus more on one "side" of the readings or the other.