Advanced Natural Language Processing
Recent breakthroughs in AI in general, and in Natural Language Processing in particular, are due to the extensive development and use of neural networks and deep learning. This course will cover 1) fundamentals of NLP (including part-of-speech tagging, syntactic and semantic parsing, and word sense disambiguation); 2) statistical methods used for NLP tasks (classification, feature selection); 3) fundamentals of neural networks and deep learning (gradient descent, back propagation, forward propagation, convolutional neural networks, recurrent neural networks, long short-term memory models); 4) application of deep learning techniques to natural language processing tasks (word vector representations, word embeddings, etc.).
This course satisfies the "Corpus Analysis" or "Advanced Natural Language Processing" requirement of the CUNY Graduate Center Computational Linguistics MA/PhD Certificate Program.
Recommendations for the students taking the class
- Proficiency in Python: All class assignments will be in Python (NumPy, pandas, Keras). Jupyter Notebooks will be used for in-class examples and assignment submissions. Below are several links to Python-related tutorials.
o Python tutorial: https://docs.python.org/3/tutorial/
o NumPy tutorial: https://docs.scipy.org/doc/numpy-1.15.1/user/quickstart.html
o Keras API: https://keras.io/
o Keras API and Python vectorization operations will be discussed in class.
- Completion of Language Technology or Natural Language Processing is strongly recommended.
o Speech and Language Processing (3rd edition) by Jurafsky and Martin, online version: https://web.stanford.edu/~jurafsky/slp3/
- College Calculus, Linear Algebra: you should be comfortable taking derivatives and working with matrix-vector operations and notation.
- Basic Probability and Statistics.
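As a rough illustration of the Python vectorization mentioned above, the sketch below computes the same per-example scores twice: once with explicit loops and once with a single NumPy matrix-vector product. The data, the weight vector, and the function names are hypothetical and chosen only for illustration; they are not taken from the course materials.

```python
import numpy as np

# Hypothetical toy data: 4 examples with 3 features each, plus a weight vector.
X = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 0.0],
              [2.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
w = np.array([0.5, -1.0, 2.0])


def scores_loop(X, w):
    """Compute one dot product per row using explicit Python loops."""
    out = []
    for row in X:
        total = 0.0
        for x_j, w_j in zip(row, w):
            total += x_j * w_j
        out.append(total)
    return np.array(out)


def scores_vectorized(X, w):
    """Compute the same scores with a single matrix-vector product."""
    return X @ w


loop_result = scores_loop(X, w)
vec_result = scores_vectorized(X, w)
```

Both functions return the same array; the vectorized form is shorter and typically much faster on large inputs, which is why it is the style to expect in assignments.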
The area of deep learning and neural nets is large and developing rapidly. The instructor may narrow or extend the list of topics below according to students’ interests.
o how various NLP tasks are cast as classification problems;
o document categorization;
o machine translation;
o use of classification methods: Naïve Bayes, SVM, logistic regression, etc.;
o feature selection for classification.
- Neural nets and deep learning:
o logistic regression example;
o gradient descent;
o back propagation;
o forward propagation;
o convolutional neural networks;
o recurrent neural networks;
o long short-term memory (LSTM) models.
- NLP-specific deep learning methods:
o word vector representations;
o word embeddings;
o sequence learning.
- Discussion of deep learning applications in the field of NLP using papers published at NLP conferences (ACL, EMNLP, CoNLL, etc.)
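To give a sense of how the logistic regression example, forward propagation, and gradient descent topics listed above fit together, here is a minimal sketch in plain NumPy. It is an assumption-laden illustration rather than course material: the function names, the toy AND-style dataset, the learning rate, and the number of steps are all hypothetical. NumPy is used instead of Keras so that each step stays visible, and because the model has a single layer, the gradient is written out by hand rather than obtained by back propagation through hidden layers.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping real-valued scores to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


def cross_entropy(X, y, w, b):
    """Mean binary cross-entropy loss of the model on (X, y)."""
    p = sigmoid(X @ w + b)
    eps = 1e-12  # guards against log(0)
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))


def train_logistic_regression(X, y, learning_rate=0.5, n_steps=2000):
    """Fit weights and bias by batch gradient descent (illustrative settings)."""
    n_examples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_steps):
        # Forward propagation: predicted probabilities for every example.
        p = sigmoid(X @ w + b)
        # Gradient of the mean cross-entropy loss with respect to w and b.
        error = p - y
        grad_w = X.T @ error / n_examples
        grad_b = error.mean()
        # Gradient descent update.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical toy data: the label is 1 only when both features are 1.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.0, 0.0, 1.0])

initial_loss = cross_entropy(X, y, np.zeros(2), 0.0)
w, b = train_logistic_regression(X, y)
final_loss = cross_entropy(X, y, w, b)
predictions = (sigmoid(X @ w + b) >= 0.5).astype(int)
```

On this linearly separable toy set the loss should fall well below its starting value, and thresholding the probabilities at 0.5 should reproduce the labels. The same forward pass, gradient, and update structure carries over to the multi-layer networks covered later, where back propagation supplies the gradients.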
The class has neither a midterm nor a final exam. Assessment will be based on three
completed homework assignments, a final project that students present in class, class participation,
and discussion of research papers.