Schedule
Schedule as of September 24, 2007. Subject to change.
Assignments are due by class time (12:30pm) on their due date.
| Week | Dates | Topic | Slides | Readings | Assignments |
| 1 | Aug 30 | Course overview; assignments, projects; Empirical NLP | | M&S, ch. 1, pp. 3-19 (read this after class) | |
| 2 | Sep 4 | Statistical methods in Linguistics; Probability Theory | | Steve Abney, Statistical Methods and Linguistics, 1996; M&S, ch. 2, pp. 40-45 | |
| | Sep 6 | Probability Theory (continued) | | M&S, ch. 2, pp. 45-60 | |
| 3 | Sep 11 | Information Theory | | M&S, ch. 2, pp. 60-80 | |
| | Sep 13 | Working with Corpora | Slides: annotation formats; Slides: searching corpora | M&S, ch. 1, pp. 19-35; Brew and Moens, Chapter 3; Read after class: John Sinclair on making corpora | |
| 4 | Sep 18 | Searching corpora; Classification: applications; Experimental design: training, testing, evaluation methodology, lab books | | M&S, pp. 229-239, pp. 267-271, pp. 575-578 | |
| | Sep 20 | Naive Bayes Classification, Add-one smoothing | Slides: Classification and experimental design | Reading on Naive Bayes: Mitchell (handout), pp. 177-184; Reading on classification and experimental design: | |
| 5 | Sep 25 | Naive Bayes Classification, Add-one smoothing | | | Due: Homework 1 |
| | Sep 27 | Memory-based learning | | Mitchell: Machine Learning, pp. 230-236 (handout) | |
| 6 | Oct 2 | Discussion of project ideas; Language modeling: n-grams | | | Note: Discussion of project ideas today! |
| | Oct 4 | Smoothing n-grams | Slides: language models | Jurafsky and Martin, ch. 4 (from the second edition!); M&S, ch. 6, pp. 191-224 | |
| 7 | Oct 9 | Guest lecture Jason Baldridge: Hidden Markov Models | | Jurafsky and Martin, 2nd Ed., Chapter 6, Sections 6.1-6.5 (pp. 1-21) | Homework due date CHANGED! |
| | Oct 11 | Guest lecture Jason Baldridge: The forward-backward algorithm | | | Due: Homework 2 |
| 8 | Oct 16 | Guest lecture Jason Baldridge: MEMMs; Building POS taggers | | Jurafsky and Martin, 2nd Ed., Chapter 6, Section 6.8 (pp. 36-40) | |
| | Oct 18 | Probabilistic parsing | | M&S, ch. 11 | Due: Project proposal |
| 9 | Oct 23 | Probabilistic parsing | | M&S, ch. 12 | |
| | Oct 25 | Probabilistic parsing: lexicalized PCFGs | | Jurafsky & Martin, 2nd edition, ch. 14, sections 14.5 and 14.6 | |
| 10 | Oct 30 | Clustering: k-means and hierarchical | | M&S, ch. 14, pp. 495-509, 512-518; Sabine Schulte im Walde's thesis, pp. 193-201 (evaluation measures); Papers mentioned in class: | |
| | Nov 1 | Clustering: EM | | M&S, ch. 14, pp. 518-524; Rooth et al, ACL 1999 | |
| 11 | Nov 6 | Clustering: graph-based | | van Dongen: A cluster algorithm for graphs, Technical Report INS-R0010, National Research Institute for Mathematics and Computer Science in the Netherlands, Amsterdam. See also: C. Biemann, Chinese Whispers, Proceedings of the HLT-NAACL-06 Workshop on Textgraphs-06 | Due: Homework 3 |
| | Nov 8 | Word sense disambiguation | | M&S, ch. 7, pp. 241-252, 257-260 | |
| 12 | Nov 13 | Semantic role labeling | | Some relevant papers: Palmer et al, 2005: The proposition bank: An annotated corpus of semantic roles; Baker, Fillmore and Lowe, 1998: The Berkeley FrameNet project; Ellsworth et al, 2004: PropBank, SALSA, and FrameNet: How design determines product; Gildea and Jurafsky, 2002: Automatic Labeling Of Semantic Roles; Xue and Palmer, 2004: Calibrating Features for Semantic Role Labeling; Pradhan, Ward and Martin, 2007: Towards Robust Semantic Role Labeling | |
| | Nov 15 | Using the WordNet hierarchy: selectional preferences | | | Due: Project progress report |
| 13 | Nov 20 | Vector space semantics | | | |
| | Nov 22 | Thanksgiving | | | |
| 14 | Nov 27 | Machine translation | | | |
| | Nov 29, 12:00 | Talk: Ken McRae, Seay 4.242. Note earlier starting time! | | | |
| 15 | Dec 4 | Project presentations | 12:30 Kunal Khatua; 12:45 Jeff Rego; 1:00 Sudipta Chatterjee; 1:15 Andrew Harp | | |
| | Dec 6 | Project presentations | 12:30 David Chen; 12:45 Harivardan Jayaraman; 1:00 Joey Frazee; 1:15 Travis Brown; 1:30 Trevor Fountain | | Due: Homework 4 |
| | Dec 7 | | | | Due: Project report |
