Research.
Broadly, I work in computational linguistics, lexical semantics, statistical models, automatic wide-coverage semantic analysis, lexical acquisition, and corpus linguistics.
Here are a few topics I am studying:
- Vagueness in word meaning
I am looking at models for representing word sense that support graded membership, based on concept membership models from psychology. I am especially interested in models that support lexical compositionality, i.e. the compositional computation of the meaning of an adjective/noun pair or a predicate/argument pair from the meanings of the individual words.
- Lexical acquisition: selectional preferences
I have been working on a simple method for inducing selectional preferences from corpus data. On the one hand, I use them as features that may improve semantic role labeling; on the other, I ask how well they model human plausibility ratings.
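As a rough illustration of what inducing selectional preferences from corpus data can look like, here is a minimal sketch. It is not the method referred to above: it only estimates add-alpha smoothed relative frequencies of head words in a (predicate, role) slot. The function names, the triple format, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def induce_preferences(triples):
    """Count how often each head word fills each (predicate, role) slot."""
    counts = defaultdict(Counter)
    for predicate, role, head in triples:
        counts[(predicate, role)][head] += 1
    return counts


def plausibility(counts, predicate, role, head, vocab_size, alpha=1.0):
    """Add-alpha smoothed estimate of P(head | predicate, role)."""
    slot = counts.get((predicate, role), Counter())
    total = sum(slot.values())
    return (slot[head] + alpha) / (total + alpha * vocab_size)


# Toy (predicate, role, head) triples, invented for illustration only.
triples = [
    ("eat", "obj", "apple"),
    ("eat", "obj", "bread"),
    ("eat", "obj", "apple"),
    ("drink", "obj", "water"),
]

counts = induce_preferences(triples)
vocab = {head for _, _, head in triples}

# A seen filler should score higher than an unseen one for the same slot.
score_apple = plausibility(counts, "eat", "obj", "apple", len(vocab))
score_water = plausibility(counts, "eat", "obj", "water", len(vocab))
```

Scores of this kind could in principle be used as a feature for a role labeler or compared against human plausibility ratings, but a pure frequency model cannot generalize to unseen fillers beyond the smoothing constant.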
- Automatic lexical semantic analysis
I am one of the main developers of the Shalmaneser system, which automatically assigns word sense and semantic role labels (both in the FrameNet paradigm) to free text. In that context, I've been looking at how to identify occurrences of unknown senses, and at error analysis for semantic role labeling systems.
Currently, we're working on extending Shalmaneser with a web interface, a GUI, and a workbench for the declarative definition of features for semantic role labeling.
- Computational approaches aiding language documentation
The EARL project asks how we can use active learning and cross-lingual projection to reduce the effort involved in documenting languages.
- Annotation
In the SALSA project, we annotated a German newspaper corpus with frame-semantic information, touching on questions of appropriate annotation formats, annotation tools, and querying corpora with several related levels of linguistic annotation. We also compared annotation paradigms between SALSA, FrameNet and PropBank.
Lately, I've been involved in the development of an annotation format for interlinearized glossed texts.