This course covers foundational concepts in computational text analysis.
The course is designed for Computer Science students interested in using text analysis methods to discover and measure concepts and phenomena in large amounts of text. Topics include core computational text analysis concepts, text-based machine learning, deep learning, basic statistical methods, and data collection.
The course will culminate in research projects in which groups of students formulate and iteratively refine an empirical question, collect relevant textual data, implement appropriate methods of analysis, and interpret and present their results.
- Course number
- CMSC B383 - students from all majors are welcome!
- Instructor
- Adam Poliak
- Discussion Forum
- Time and place
- Spring 2023, MW 10:10-11:30am, Location: Park 338
- Lab: M 11:40am-1:00pm
- Office Hours
- Prerequisites
- CMSC 151 Data Structures (or equivalent)
- CMSC 231 Discrete Math (or equivalent)
- Course Readings
- Each lecture has an accompanying reading that will be posted to the schedule. Some lectures will also have optional readings on the lecture’s topic.
- Many of the accompanying readings will be from the following freely available textbooks:
- Jurafsky and Martin, Speech and Language Processing (3rd ed. draft) (online copy)
- Text Analysis in Python for Social Scientists. Digital copies are available to download with your BMC/HC login from the Tripod (TriCollege Libraries).
- Grading
- Homeworks: 30%
- Weekly Reading Reviews: 10%
- Midterm: 20%
- Project: 35%
- Participation: 5%
- Late day policy
- To account for issues that arise in these uncertain times, each student has 10 late days for the homeworks and projects.
See the Policies for more details.