This course aims to provide theoretical foundations and practical
experience in distributed algorithms. The techniques covered in this
course have wide application. Examples will be drawn from speech and
language processing, machine learning, optimization, and graph
theory. The course will be a combination of:
Introductory lectures
Reading discussions: Students will take turns presenting papers;
each student will be responsible for up to two papers.
Assignments
In-class discussion of assignment solutions by students.
Course project: There will be no final exam. Instead, the course
requires a final project of interest to the student, chosen in
consultation with the instructor. The project requires a written
report and a final presentation. In many cases, the data, software
toolkit, and key components for the project will be made
available.
Prerequisites: A graduate-level course on machine learning or
probability and statistics. Students should be comfortable coding in
at least one programming language.
Session 2 (4/5): Inverted Index, Language for Machine
Translation, Multicore MapReduce (Zak, Richard, Masoud & Aaron)
Review of Google File System
Review of MapReduce Framework
Large Language Models in Machine
Translation, Thorsten Brants et al., Proc. Joint Conference
on Empirical Methods in Natural Language Processing and
Computational Natural Language Learning (EMNLP), pp. 858-867,
2007.
Ensemble Nystrom Method,
S. Kumar et al., Proc. Neural Information Processing Systems
(NIPS), 2010, Winner of Best Student Paper at the New York Academy
of Sciences 2009 Symposium on ML.