Schedule

Schedule as of September 24, 2007. Subject to change.

Assignments are due by class time (12:30pm) on their due date.

Week Dates Topic Slides Readings Assignments
1 Aug 30 Course overview; assignments, projects;
Empirical NLP
M&S, ch. 1, pp 3-19 (read this after class)
2 Sep 4 Statistical methods in Linguistics
Probability Theory
Steve Abney. Statistical Methods and Linguistics. 1996.
M&S, ch. 2, pp 40-45
  Sep 6 Probability Theory (continued) M&S, ch. 2, pp 45-60
3 Sep 11 Information Theory M&S, ch. 2, pp 60-80
  Sep 13 Working with Corpora Slides: annotation formats
Slides: searching corpora
M&S, ch. 1, pp 19-35
Brew and Moens, Chapter 3

Read after class: John Sinclair on making corpora.

4 Sep 18 Searching corpora
Classification: applications
Experimental design: Training, testing, evaluation methodology, lab books
M&S, pp. 229-239, pp. 267-271, pp. 575-578.
  Sep 20 Naive Bayes Classification, Add-one smoothing Slides: Classification and experimental design.
Reading on Naive Bayes:
Mitchell (handout), pp. 177-184

Reading on classification and experimental design:
Joakim Nivre. On Statistical Methods in Natural Language Processing. 2002. (Skim. No need to go through the examples in detail.)
Chris Callison-Burch and Miles Osborne. Statistical Natural Language Processing. 2003.

5 Sep 25 Naive Bayes Classification, Add-one smoothing Due: Homework 1
  Sep 27 Memory-based learning Mitchell: Machine Learning, pp. 230-236 (handout)
6 Oct 2 Discussion of project ideas
Language modeling: n-grams
Note: Discussion of project ideas today!
  Oct 4 Smoothing n-grams Slides: language models
Jurafsky and Martin ch. 4 (from the second edition!)
M&S, ch. 6, pp 191-224
7 Oct 9 Guest lecture Jason Baldridge:
Hidden Markov Models
Jurafsky and Martin, 2nd Ed., Chapter 6, Sections 6.1-6.5 (pp. 1-21)
Homework due date CHANGED!
  Oct 11 Guest lecture Jason Baldridge:
The forward-backward algorithm
Due: Homework 2
8 Oct 16 Guest lecture Jason Baldridge:
MEMMs; Building POS taggers
Jurafsky and Martin, 2nd Ed., Chapter 6, Section 6.8 (pp. 36-40)
  Oct 18 Probabilistic parsing M&S, ch. 11 Due: Project proposal
9 Oct 23 Probabilistic parsing M&S, ch. 12
  Oct 25 Probabilistic parsing: lexicalized PCFGs Jurafsky & Martin, 2nd edition, ch. 14, sections 14.5 and 14.6
10 Oct 30 Clustering: k-means and hierarchical M&S ch. 14 pp. 495-509, 512-518
Sabine Schulte im Walde's thesis, pp. 193-201 (evaluation measures)

Papers mentioned in class:
Schulte im Walde and Brew on clustering verbs into semantic classes: ACL 02, and EACL 03
Joanis and Stevenson on the same topic: EACL 03 and CoNLL 03
Marina Meila on comparing clusterings: technical report

  Nov 1 Clustering: EM M&S ch. 14 pp. 518-524
Rooth et al, ACL 1999
11 Nov 6 Clustering: graph-based van Dongen: A cluster algorithm for graphs, Technical Report INS-R0010, National Research Institute for Mathematics and Computer Science in the Netherlands, Amsterdam.
See also: C. Biemann, Chinese Whispers, Proceedings of the HLT-NAACL-06 Workshop on Textgraphs-06
Due: Homework 3
  Nov 8 Word sense disambiguation M&S chapter 7, pp. 241-252, 257-260
12 Nov 13 Semantic role labeling Some relevant papers:
Palmer et al, 2005: The proposition bank: An annotated corpus of semantic roles.
Baker, Fillmore and Lowe, 1998: The Berkeley FrameNet project
Ellsworth et al, 2004: PropBank, SALSA, and FrameNet: How design determines product.
Gildea and Jurafsky 2002: Automatic Labeling Of Semantic Roles
Xue and Palmer, 2004: Calibrating Features for Semantic Role Labeling
Pradhan, Ward and Martin 2007: Towards Robust Semantic Role Labeling
  Nov 15 Using the WordNet hierarchy: selectional preferences Due: Project progress report
13 Nov 20 Vector space semantics
  Nov 22 Thanksgiving
14 Nov 27 Machine translation
  Nov 29 12:00 Talk: Ken McRae
Seay 4.242
Note earlier starting time!
15 Dec 4 Project presentations 12:30 Kunal Khatua
12:45 Jeff Rego
1:00 Sudipta Chatterjee
1:15 Andrew Harp
  Dec 6 Project presentations 12:30 David Chen
12:45 Harivardan Jayaraman
1:00 Joey Frazee
1:15 Travis Brown
1:30 Trevor Fountain
Due: Homework 4
  Dec 7 Due: Project report