CO 504 : Natural Language Processing

Course Plan for Autumn 2010

L-T-P : 3-0-0
Credits : 3

Evaluation Plan

Class test 20
Assignments (including lab work) 50
Term-paper 30
Mid-term test 50
End-term test 100
Total 250

Lesson Plan

Sl.No. Topic Contact hours
1. Introduction
Human languages, models, ambiguity, processing paradigms 1
Phases in Natural Language Processing 1
Text representation in computers, encoding schemes 2
2. Regular expressions, FSA, word recognition, Lexicon 1
3. Morphology, Acquisition models, FST 2
4. N-grams, smoothing, entropy 3
5. POS tagging, stochastic POS tagging, HMM, transformation-based tagging (TBL), issues 4
6. CFG, spoken language syntax, word order 2
7. Parsing, unification, probabilistic parsing, treebank 6
8. Semantics, Meaning representation 2
9. Semantic analysis, lexical semantics, WordNet, summarization 5
10. WSD: selectional restriction, machine learning approaches, dictionary-based approaches 3
IR: vector space model, term weighting, homonymy, polysemy, synonymy, improving user queries 3
11. Discourse: reference resolution, constraints on coreference, algorithm for pronoun resolution, text coherence, discourse structure 4
12. Generation - Overview 1
13. Machine Translation - Overview 2
Total 42

Some term paper topics

  1. Paninian Grammar
  2. Paninian Parser
  3. Karaka Theory
  4. Anusaraka System
  5. Lexical Functional Grammar
  6. Tree Adjoining Grammar
  7. Government and Binding

Books

  1. Jurafsky D., Martin J. H. Speech and Language Processing, 2e, Pearson Education

    References:

  2. Allen J. Natural Language Understanding, 2e, Pearson Education
  3. Bharati A., Sangal R., Chaitanya V. Natural Language Processing: A Paninian Perspective, PHI
  4. Siddiqui T., Tiwary U. S. Natural Language Processing and Information Retrieval, OUP