
This course covers foundational concepts in computational text analysis. It is designed for Computer Science students interested in using text analysis methods to discover and measure concepts and phenomena in large amounts of text. Topics include core computational text analysis concepts, text-based machine learning, deep learning, basic statistical methods, and data collection. The course culminates in research projects in which groups of students formulate and iteratively refine an empirical question, collect relevant textual data, implement appropriate methods of analysis, and interpret and present their results.

Course number
CMSC B383 - students from all majors are welcome!
Adam Poliak
Discussion Forum
Time and place
Spring 2023, MW 10:10-11:30am, Location: Park 338
Lab M: 11:40am-1:00pm
Office Hours
Prerequisites
CMSC 151 Data Structures (or equivalent)
CMSC 231 Discrete Math (or equivalent)
Course Readings
Each lecture has an accompanying reading that will be posted to the schedule. Some lectures will have accompanying optional reading related to the lecture’s topic.
Many of the accompanying readings will be from the following freely available textbooks:
  1. Jurafsky and Martin, Speech and Language Processing (3rd ed. draft) (online copy)
  2. Text Analysis in Python for Social Scientists. Digital copies are available to download with your BMC/HC login from the Tripod (TriCollege Libraries).


Grading
  • Homeworks: 30%
  • Weekly Reading Reviews: 10%
  • Midterm: 20%
  • Project: 35%
  • Participation: 5%
Late day policy
To account for issues that arise in these uncertain times, each student has 10 late days to use across the homeworks and projects.
See the Policies for more details.