Week 1 – 31 August
- Intro to Computational Linguistics
- Intro to course logistics
- Stream – collaborative decision
- Text Analysis for Digital Humanities (see Resources for Digital Humanities)
Week 2 – 7 September
- Discussion of the following articles:
- Text Analysis Continued
- Discussion of results with tools
- Introduction to Morphology (PDF of slides)
- Introduction to Finite State Transducers
Week 3 – 14 September
- Southern Brazilian Portuguese Pronunciation
- How Natural Language Processing is Changing – a video
- Entropy and Entropy Worksheet on Rotokas.
- Discussion of the debate between Peter Norvig, Director of Research at Google, and Noam Chomsky.
- Discussion of the following article:
- To do: Download and install Linguistica, then test how well it learns the morphology of an English corpus that you find and of a corpus in another language.
- Computational Morphology continued
Week 4 – 21 September
- Discussion of Morphology long-term HW – Finite State Transducers and Morphology (link to write-up)
- Discussion of Joint Entropy of Rotokas and other languages (please compute this before class)
- Southern Brazilian Portuguese FST (15 min)
- English Plurals FST (15 min)
- Unsupervised Morphology (20 min)
- Demo – Naive Bayes Classifier (zip file of classifier)
- Geographic Classification of Arabic.
- HW for next week (can do with a partner):
- Try the classifier on your texts.
- Try Linguistica on a foreign-language corpus. Analyze the results.
- Skim the Parts of Speech and Basic Syntax chapters of the book mentioned in Week 5.
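For the joint entropy exercise listed under Week 4, a minimal sketch in Python may help as a starting point. It assumes that "joint entropy" here means the entropy of adjacent symbol pairs in a text; the function names and the toy string are illustrative only and do not come from the worksheet. Substitute your own Rotokas (or other language) text for the sample.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def joint_entropy(symbols):
    """Entropy in bits over adjacent symbol pairs (assumed reading of H(X, Y))."""
    pairs = Counter(zip(symbols, symbols[1:]))
    return entropy(pairs)


# Toy input (hypothetical); replace with the symbols of your corpus.
sample = "abab"
print("single-symbol entropy:", entropy(Counter(sample)))
print("joint entropy of adjacent pairs:", joint_entropy(sample))
```

Comparing the joint entropy with twice the single-symbol entropy gives a rough sense of how much adjacent symbols constrain each other.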
Week 5 – 28 September
- What we learned from 5 million books – TED Talk
- Wrap up and discussion
- Entropy – interested in hearing reports and impressions from people
- Southern Brazilian Portuguese
- English Plurals
- Linguistica – discussion
- Introduction to Lexc – part of the Xerox Toolkit (PDF of the relevant chapter in Finite State Morphology; 51 MB download)
- Linguists in class:
- Parts of Speech (any intro to linguistics book, or through section 5.3 of this chapter; user: compling, password: a 1337 version of "chomsky" with only 1 character changed) – Mary
- Basic Syntax (any intro to linguistics book, or chapter 12 of this) – Shannon
- Phrase Structure Grammar and parsing (chapter 13 of this)
Week 6 – 5 October
Week 7 – 12 October
- Great Classifier ShootOut
- Parsing
Week 8 – 19 October
- Game plan for rest of semester
- Introduction to the Final Project (pdf of slides)
- Great Classifier ShootOut
- BUBS parser lab
- Statistical Machine Translation I: Word Alignment Models (Kevin Knight’s A Statistical MT Tutorial Workbook, RTF)
- Language Model Optional Project (description)
Week 9 – 26 October
Week 10 – 2 November
- Project update reports
- Discussion of Manning Video
- Information Extraction presentation (chapter 22, intro through section 22.1 Named Entity Extraction, of this draft; same password as above). Student presentation, 10 minutes: Gray & Eric!
- Information Extraction presentation (chapter 22, section 22.2 of this draft): Joe F and Jacob B!
- Introduction to Information Extraction Technology (old paper but still relevant): Amy Olson & Samantha Whay
- 1/2 hour to coordinate project
- Odds and ends re: Statistical MT
- Evaluating MT systems
- Hands-on MT lab (going through the basic tutorial, sec. 2.1, of the Moses Statistical Machine Translation System User Manual)
Week 11 – 9 November
Week 12 – 16 November
- Sprint 2 demo (5–10 minutes per team)
- Computers versus Common Sense (The Cyc Project) Doug Lenat
- Semantic Role Labeling (if people need it, this can become a student presentation; let me know)
Week 13 – 23 November
Week 14 – 30 November
- Cyc Discussion
- OpenCyc
- OpenCog Project (talk by Ben Goertzel)
- Formal Computational Semantics
- Soft AI: what we have now: Powerset Demo Video
- Hard AI: Embodied Conversational Agents – Justine Cassell’s research
- Presentation: The MIT START question answering system. Draw on several papers at http://groups.csail.mit.edu/infolab/publications/ and give a demo. (2 presenters)
Week 15 – 7 December
- Final Sprint Demo – 20 minute presentations