Computational Linguistics II

Fall 2007 | Instructor: Katrin Erk | Tuesday and Thursday, 12:30pm-2pm | PAR 308

This course is a sequel to Computational Linguistics I. It addresses advanced techniques and applications of natural language processing (NLP). The course has three main aims: familiarity with tools and techniques for handling text corpora, knowledge of the characteristics of some of the available corpora, and a secure grasp of the fundamentals of statistical natural language processing. Specific objectives include:

  1. understanding of probability and information theory as they have been applied to computational linguistics.
  2. experience of working with corpora.
  3. knowledge of some applications of statistical NLP, including n-gram language models, part-of-speech taggers, and probabilistic parsers.

For practical exercises, we will mainly use existing toolkits such as the Natural Language Toolkit (NLTK), which will let us build non-trivial system components in a relatively short time.
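To give a flavor of the statistical techniques listed in the objectives, here is a minimal sketch of a bigram language model estimated by relative frequency. It is only an illustration, not course material: the toy corpus, the function names, and the `<s>` and `</s>` boundary markers are assumptions made for this example, and it uses plain Python rather than any particular toolkit.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count bigrams and their left contexts over tokenized sentences.

    Each sentence is padded with hypothetical boundary markers <s> and </s>.
    """
    context_counts = Counter()
    bigram_counts = Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, curr in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, curr)] += 1
    return context_counts, bigram_counts


def bigram_probability(prev, curr, context_counts, bigram_counts):
    """Maximum-likelihood estimate of P(curr | prev), with no smoothing."""
    if context_counts[prev] == 0:
        return 0.0  # unseen context: no estimate available without smoothing
    return bigram_counts[(prev, curr)] / context_counts[prev]


# Toy corpus invented for illustration only.
toy_corpus = [
    ["the", "dog", "barks"],
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
]

context_counts, bigram_counts = train_bigram_model(toy_corpus)
p_dog_given_the = bigram_probability("the", "dog", context_counts, bigram_counts)
print(f"P(dog | the) = {p_dog_given_the:.3f}")
```

In this toy corpus "the" occurs three times and is followed by "dog" twice, so the estimate is 2/3. Unseen bigrams get probability zero here; a practical model would need smoothing to handle them.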