CSE 562/662 - Natural Language Processing

Instructor: Brian Roark

Class time: M 5:30 PM - 8:30 PM    Jan. 8 - Mar. 19, 2007

Class location: Jack Murdoch Building - Room 561

Office hours: Th 10-12, Central 115, or by appointment

Required textbook: None. We will be using selected chapters from:
Jurafsky and Martin Speech and Language Processing, 2nd edition
(not yet available in print, links to updated chapters on-line below)

Skip to overview of lectures.

Goals

The goal of this course is to give a broad but detailed introduction to the key algorithms and modeling techniques used for Natural Language Processing (NLP) today. With a few exceptions, NLP involves taking a sequence of words as input (e.g. a sentence) and returning some annotation(s) for that string. Well-known examples of this include part-of-speech tagging and syntactic parsing. Many other common tasks, e.g. shallow parsing or named-entity recognition, can be easily recast as tagging tasks; hence certain basic techniques can be widely applied within NLP. Applications such as automatic speech recogntion, machine translation, information extraction, and question answering all make use of NLP techniques. By the end of this course, you should understand how to approach common natural language problems arising in these and other applications.

Prerequisites

CSE 560/660 - Artificial Intelligence is a prerequisite for this course. There is no official programming language for this course, but there will be a fair amount of programming required to complete assignments, hence facility with some programming language is assumed.

Grading

10% of your grade will depend on in-class discussion, 50% on the homeworks and 20% each on the midterm and final.

What we'll cover and an approximate schedule

Roughly speaking, half of the course will be devoted to finite-state methods, and half to context-free methods (or beyond). Algorithms for annotating linguistic structure will always be presented with statistical variants, which provide the basis for disambiguation.

I have listed tentative reading assignments from the Jurafsky and Martin (J&M) textbook that are associated with each section. Note that the new 2nd edition will not be available until summer, 2007, but updated chapters are posted on-line. Links are listed below. The lecture topics should be taken as a rough guide, as the schedule will almost certainly change over the term.

Date     Topic Reading AssignmentFAQs
Jan 8(a) Introduction to NLP; Applications using NLP; Chomsky hierarchy Handouts;
J&M Ch. 4
HW 1FAQ1
(b) Weighted finite-state automata and transducers; n-grams and smoothing
Jan 12 Optional tutorial on regular expression processing,
1pm in Murdoch 561
     
Jan 22(a) Finite-state Morphology; Phonology and Pronunciation Handouts;
J&M Ch. 3;
J&M Ch. 10
HW 2FAQ2
(b) N-grams and smoothing (cont.); grammar and lexicon composition
Jan 29 (a) POS-tagging and Class-based language modeling Handouts;
C&G, 1998
MP&R, 2001
J&M Ch. 5
HW 3FAQ3
(b) Expectation Maximization (EM) for smoothing and word-classes
Jan 31 Optional tutorial on parsing/tagging data structures, location/time TBA      
Feb 5(a) Machine learning for tagging MC, 2002
S&P, 2003
J&M Ch. 11
HW 4FAQ4
(b) Context-free grammars (CFGs); treebanks; probabilistic CFGs
Feb 12 Midterm Exam Handouts;
J&M Ch. 12
HW 5FAQ5
(a) Context-free parsing (introduction); CYK
(b) Context-free parsing (continued); Earley, Shift-reduce, Top-down
Left-corner; grammar transformations
Feb 26(a) Context-sensitive grammars; unification Handouts   
(b) Tree-adjoining grammar; Categorial grammar
Mar 5(a) Statistical parsing approaches (1) Handouts HW 6FAQ6
(b) Statistical parsing approaches (2); and finite-state parsing
Mar 12(a) Lexical semantics and word statistics J&M Ch. 19 HW 7FAQ7
(b) Word sense disambiguation; Anaphora resolution; NP coreference
Mar 19 Final Exam      


C&G, 1998   Stan Chen and Joshua Goodman. An Empirical Study of Smoothing Techniques for Language Modeling. Harvard University Technical Report TR-10-98, 1998.
MC, 2002   Michael Collins. Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms. In Proceedings of Empirical Methods in Natural Language Processing (EMNLP), 2002.
MP&R, 2001  Mehryar Mohri, Fernando Pereira and Michael Riley. Weighted Finite-State Transducers in Speech Recognition. Computer Speech & Language, 16(1), pp. 69-88, 2001.
S&P, 2003   Fei Sha and Fernando Pereira. Shallow parsing with conditional random fields. In Proceedings of HLT-NAACL 2003, pages 213-220.