Class Schedule

Week 1 – 31 August

  • Intro to Computational Linguistics
  • Intro to course logistics
  • Stream – collaborative decision
  • Text Analysis for Digital Humanities (see Resources for Digital Humanities)

Week 2 – 7 September

Week 3 – 14 September

Week 4 – 21 September

  • Discussion of Morphology long-term HW – Finite State Transducers and Morphology (link to write-up)
  • Discussion of Joint Entropy of Rotokas and other languages (please compute this before class)
  • Southern Brazilian Portuguese FST (15 min)
  • English Plurals FST (15 min)
  • Unsupervised Morphology (20 min)
  • Demo – Naive Bayes Classifier (zip file of classifier)
  • Geographic Classification of Arabic
  • HW for next week (can do with a partner):
    • try the classifier on your texts.
    • try Linguistica on a Foreign Language Corpus. Analyze results.
    • skim Parts of Speech and Basic Syntax chapters of book mentioned in Week 5.

Week 5 – 28 September

  • What we learned from 5 million books – TED Talk
  • Wrap up and discussion
    • Entropy – interested in hearing reports and impressions from people
    • Southern Brazilian Portuguese
    • English Plurals
  • Linguistica – discussion
  • Introduction to Lexc – part of the Xerox Toolkit (PDF of the relevant chapter in Finite State Morphology, 51 MB download)
  • Linguists in class:
    • Parts of Speech (any intro to Linguistics book, or through 5.3 of this chapter; user: compling, pw: a 1337 version of chomsky, only 1 character changed) – Mary
    • Basic Syntax (any intro to Linguistics book or chapter 12 of this) – Shannon
    • Phrase Structure Grammar and parsing (chapter 13 of this)

Week 6 – 5 October

Week 7 – 12 October

Week 8 – 19 October

  • Game plan for rest of semester
  • Introduction to the Final Project (pdf of slides)
  • Great Classifier ShootOut
  • BUBs parser lab
  • Statistical Machine Translation I: Word Alignment Models (Kevin Knight’s A Statistical MT Tutorial Workbook, rtf)
  • Language Model Optional Project (description)

Week 9 – 26 October

Week 10 – 2 November

  • Project update reports
  • Discussion of Manning Video
  • Information Extraction presentation (chapter 22 intro through 22.1, Named Entity Extraction, of this draft; same pw as above). Student presenters, 10 minutes: Gray & Eric!
  • Information Extraction presentation (chapter 22, section 22.2 of this draft): Joe F and Jacob B!
  • Introduction to Information Extraction Technology (old paper but still relevant): Amy Olson & Samantha Whay
  • 1/2 hr. to coordinate project
  • Odds and ends re. Statistical MT
  • Evaluating MT systems
  • Hands-on MT lab (going through the basic tutorial, sec. 2.1, of the Moses Statistical Machine Translation System User Manual)

Week 11 – 9 November

Week 12 – 16 November

  • Sprint 2 demo (5–10 minutes per team)
  • Computers versus Common Sense (The Cyc Project) Doug Lenat
  • Semantic Role Labeling (if people need it, I can make this a student presentation; let me know)

Week 13 – 23 November

  • Thanksgiving Break

Week 14 – 30 November

  • Cyc Discussion
  • OpenCyc
  • Open Cog Project (talk by Ben Goertzel)
  • Formal Computational Semantics
  • Soft AI: what we have now: Powerset Demo Video
  • Hard AI: Embodied Conversational Agents – Justine Cassell’s research
  • Presentation: the MIT START question answering system. Draw the presentation from several papers at http://groups.csail.mit.edu/infolab/publications/ and give a demo (2 presenters).

Week 15 – 7 December

  • Final Sprint Demo – 20 minute presentations