Following Kevin Knight's tradition, understand and implement the following:
- finite-state machines (weighted FSAs and FSTs)
- syntactic structures (weighted context-free grammars and parsing algorithms)
- machine learning methods (maximum likelihood and expectation-maximization)
- modern quantitative techniques in NLP that use large corpora and statistical learning
- various dynamic programming algorithms (Viterbi, CKY, Forward-Backward, and Inside-Outside)
- Japanese as a running example to demonstrate linguistic diversity, to illustrate transliteration and translation, and to build intuition for the Viterbi and EM algorithms
- For the linguistic background of Japanese, please see this video.
- The finite-state toolkit used is USC ISI's CARMEL.
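
As a taste of the dynamic programming listed above, here is a minimal Viterbi sketch in plain Python. It is an illustration only, not CARMEL and not course-provided code: the function name `viterbi`, the two tags `N` and `V`, and every probability in the toy model are hypothetical values chosen so the arithmetic is easy to check by hand.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_path, best_log_prob) for an HMM using Viterbi decoding."""

    def log(p):
        # Work in log space; a zero probability becomes -infinity.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: best log-probability of any state path ending in s at time t.
    best = [{s: log(start_p.get(s, 0)) + log(emit_p[s].get(observations[0], 0))
             for s in states}]
    # back[t][s]: the predecessor state on that best path.
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor r that maximizes score(r) + log P(s | r).
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r].get(s, 0))) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s].get(observations[t], 0))
            back[t][s] = prev

    # Follow back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical toy model (all numbers are made up for illustration).
states = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
emit_p = {"N": {"fish": 0.6, "sleep": 0.4}, "V": {"fish": 0.3, "sleep": 0.7}}

path, logp = viterbi(["fish", "sleep"], states, start_p, trans_p, emit_p)
print(path, math.exp(logp))
```

With these made-up numbers the best path is `N` then `V`, with probability 0.8 × 0.6 × 0.7 × 0.7 = 0.2352. Working in log space avoids underflow on longer sequences; replacing `max` with a sum over predecessors would give the forward pass of Forward-Backward instead.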