Computational Linguistics 2

Fall 2006 | Instructor: Jason Baldridge | Tuesday and Thursday, 2pm-3:30pm | PAR 308

This course is a sequel to Computational Linguistics 1. It addresses advanced techniques and applications of natural language processing (NLP). The course has three main aims: familiarity with tools and techniques for handling text corpora, knowledge of the characteristics of some of the available corpora, and a secure grasp of the fundamentals of statistical natural language processing. Specific objectives include:

  • understanding of probability and information theory as they have been applied to computational linguistics.
  • experience of working with corpora.
  • knowledge of some applications of statistical NLP, including n-gram language models, part-of-speech taggers, and probabilistic parsers.

For practical exercises, we will mainly use existing toolkits such as NLTK, which will let us build non-trivial system components in a relatively short time.