Natural Language Processing

Fall 2020 — CMSC 473/673

Who, What, When, and Where

Check out the syllabus for all of this information, including policies on academic honesty, accommodations, and late assignments.

Meeting Times
Synchronous remote, recorded lectures (check Piazza for the links)
Monday & Wednesday, 1pm - 2:15pm
Instructor
Frank Ferraro
ferraro [at] umbc [dot] edu
ITE 358/remote
Office hours: Monday 2:15 - 3pm, Thursday 11:00 - 11:30, or by appointment
TA
Charu Sharma
charus2 [at] umbc [dot] edu
Office hours: TBD, or by appointment
Topics
The topics covered will include:
  • probability, classification, and the efficacy of simple counting methods
  • language modeling (n-gram models, smoothing heuristics, maxent/log-linear models, and distributed/vector-valued representations)
  • sequences of latent variables (e.g., hidden Markov models, some basic machine translation alignment)
  • trees and graphs, as applied to syntax and semantics
  • some discourse-related applications (coreference resolution, textual entailment), and
  • special and current topics (e.g., fairness and ethics in NLP).
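To give a flavor of the first few topics, here is a minimal sketch of a count-based bigram language model with add-alpha (Laplace) smoothing. This is purely illustrative: the function names and toy corpus are not from the course materials, just an example of the "simple counting methods" and n-gram smoothing listed above.

```python
from collections import Counter

def train_bigram_lm(corpus, alpha=1.0):
    """Count-based bigram language model with add-alpha smoothing.

    Illustrative sketch only (names and API are hypothetical, not from
    the course); `corpus` is a list of token lists.
    """
    unigrams, bigrams = Counter(), Counter()
    vocab = set()
    for sent in corpus:
        tokens = ["<s>"] + sent + ["</s>"]
        vocab.update(tokens)
        unigrams.update(tokens[:-1])           # history counts
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent-pair counts
    V = len(vocab)

    def prob(prev, word):
        # add-alpha smoothed conditional probability P(word | prev);
        # alpha=1 gives classic Laplace (add-one) smoothing
        return (bigrams[(prev, word)] + alpha) / (unigrams[prev] + alpha * V)

    return prob

# usage: estimate P(cat | the) from a toy two-sentence corpus
prob = train_bigram_lm([["the", "cat", "sat"], ["the", "dog", "sat"]])
```

Because the smoothed counts are normalized over the whole vocabulary, the conditional probabilities for any history sum to one, even for unseen pairs.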
Goals
After taking this course, you will
  • be introduced to some of the core problems and solutions of NLP;
  • learn different ways that success and progress can be measured in NLP;
  • be exposed to how these problems relate to those in statistics, machine learning, and linguistics;
  • have experience implementing a number of NLP programs;
  • read and analyze research papers;
  • practice your (written) communication skills.
Schedule

The following schedule of topics is subject to change.

Each class meeting below lists its topic, suggested reading, and any assignment going out or coming due.

Monday, 8/31
  1. Intro/administrivia
  2. What is NLP?
Suggested reading: 2SLP Ch. 1
  • [473] up through Ch 1.4
  • [673] entire chapter
Assignment out: [473/673] Assignment 1

Wednesday, 9/2
  1. Probability Review
Suggested reading: [473/673] Hal Daumé III's "Math for Machine Learning" primer

Wednesday, 9/9
  1. NLP Task Overview: part-of-speech tagging, dependency parsing, textual entailment, and featurization
Suggested reading: Cambria and White (2014)
  • [473] Sections 1, 3, 4, 7, 8
  • [673] entire paper
Assignment due: Assignment 1

Monday, 9/14
Assignment out: [473/673] Assignment 2

Wednesday, 9/16
  1. Intro to ML: the Noisy Channel Model, Classification, & Evaluation
Suggested reading: [473/673] 3SLP: Ch 4.0, 4.1, 4.7, 4.8

Monday, 9/21 [Whiteboard]
Assignment out: [673] Graduate Assessment

Wednesday, 9/23
Catch-up

Monday, 9/28 [Whiteboard]
  1. Count-based Language Modeling
Suggested reading: [473/673] 3SLP Ch 3 (2SLP Ch 4)
Assignment due: Assignment 2

Wednesday, 9/30 [Whiteboard]
Assignment out: [473/673] Assignment 3

Monday, 10/05 [Whiteboard]
  1. Maximum-entropy Models (Log-linear Models) and Neural Language Models
Assignment due: [673] Graduate Assessment Milestone 1

Wednesday, 10/07 [Whiteboard]

Monday, 10/12 [Whiteboard]

Wednesday, 10/14 [Whiteboard]

Monday, 10/19 [Whiteboard]
  1. Distributed Representations
Assignment out: [473/673] Assignment 4

Wednesday, 10/21

Monday, 10/26 [Whiteboard]

Wednesday, 10/28 [Whiteboard]
  1. Recurrent Neural Models: Language Models, and Sequence Prediction and Generation

Monday, 11/02 [Whiteboard]

Wednesday, 11/04 [Whiteboard]

Monday, 11/09 [Whiteboard]
  1. (Even More) Language Modeling: Multi-Task Learning, Attention, and Transformer-based Language Models
Assignment out: take-home midterm (due 11/13)

Wednesday, 11/11 [Whiteboard]
(and starting HMMs; see 11/16)

Monday, 11/16 [Whiteboard]
  1. Hidden Markov Models

Wednesday, 11/18 [Whiteboard]

Monday, 11/23 [Whiteboard]

Wednesday, 11/25 [Whiteboard]
  1. Probabilistic Context-Free Grammars, and CKY-based Algorithms

Monday, 11/30 [Whiteboard]

Wednesday, 12/02 [Whiteboard]
  1. Syntax to Semantics: Dependency Grammars/Parsing, Open IE, and Semantic Representations/SRL
  2. Course Recap

Monday, 12/07