Bryn Mawr College

CMSC 380: Information Retrieval

Spring 2020

Prof. Geoffrey Towell


General Information

Instructor:

Geoffrey Towell
204 Park Science Building
526-5064
gtowell at brynmawr dot edu
http://cs.brynmawr.edu/~gtowell

Lecture Hours: MW 1:10-2:30
Room: Park 336
Lab: Park 231 Mo 2:40PM - 4:00PM (Attendance in Lab is REQUIRED)
Office Hours: T 10-11AM, W 3-4PM, or by appointment. Also, if I am in my office and the door is open, you are welcome to come in.


Text

Required Text
  • Introduction to Information Retrieval by Manning, Raghavan and Schütze. Cambridge University Press, 2008. Should be available in the campus bookstore.

 


Supporting Text (may be on reserve in Collier)
  • Managing Gigabytes by Witten, Moffat and Bell. Morgan Kaufmann, 1999.

 



Syllabus

Course Description: Information Retrieval (IR) is the process of retrieving relevant text-based information in response to a user's textual query. IR was one of the first, and remains one of the most important, problems in the domain of natural language processing. Web search is the application of information retrieval to the web, and it is the way in which most people interact with IR systems. In this course, we will cover basic and advanced techniques for building text-based information systems, including the following topics:
  • Efficient text indexing
  • Boolean and vector-space retrieval models
  • Evaluation and interface issues
  • IR techniques for the web, including crawling, link-based algorithms, and metadata usage
  • Document clustering and classification
  • Approaches to ranking retrieved texts

Class Syllabus.


Important Dates


Assignments

Assignments may be written in any programming language. As a general rule, I will not closely grade program code. I will, however, read it and expect to be able to understand what I read. Therefore, the code should be commented to the level that an independent, intelligent, and motivated person can review and understand what was done and, potentially, extend or fix the program. Put another way, comments should be written so that, if you picked up your own code 2 years from now, you could understand what you did and how the program works.

There will be two introductory assignments that are intended to get everyone on a common footing with respect to the topic area. After that, the class will be broken into groups, each of which may get a different assignment. Each group will present its work to the class on completion of the assignment. There will be 2-3 of these group assignments.


Labs


Additional Readings


Lectures

I would prefer to have all of my lecture materials linked here. However, I may copy materials without proper attribution, so I cannot make them available on the web. Most will instead be available through the department servers at:
  /home/gtowell/Public380/Lectures/
  
In that directory, lecture slides will be available with obvious names.


Course Policies

Communication

Attendance and active participation are expected in every class. Participation includes asking questions, contributing answers, proposing ideas, and providing constructive comments.

As you will discover, I am a proponent of two-way communication and I welcome feedback during the semester about the course. I am available to answer questions, listen to concerns, and talk about any course-related topic (or otherwise!). Come to office hours! This helps me get to know you. You are welcome to stop by and chat.

Please stay in touch with me, particularly if you feel stuck on a topic or assignment and can't figure out how to proceed. Often a quick e-mail or face-to-face conference can reveal solutions to problems and generate renewed creative and scholarly energy. It is essential that you begin assignments early.

Grading


At the end of the semester, final grades will be calculated as a weighted average of all grades according to the following weights. (These weights are subject to change, without notice.)

Exam 1: 20%
Exam 2: 20%
Lab Attendance: 5%
Assignments: 50%
Other: 5%
Total: 100%
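
As a rough illustration of how the weighted average works, here is a minimal Python sketch. The weights are the ones listed above (and remain subject to change); the function name final_grade, the 0-100 score scale, and the example scores are assumptions made up for illustration, not actual course data or the official grading procedure.

```python
# Weights copied from the list above; they are subject to change.
WEIGHTS = {
    "Exam 1": 0.20,
    "Exam 2": 0.20,
    "Lab Attendance": 0.05,
    "Assignments": 0.50,
    "Other": 0.05,
}


def final_grade(scores):
    """Return the weighted average of per-category scores.

    Assumes each score is already on a 0-100 scale (an assumption,
    not something the syllabus specifies).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, invented purely for this example.
example_scores = {
    "Exam 1": 80,
    "Exam 2": 90,
    "Lab Attendance": 100,
    "Assignments": 85,
    "Other": 100,
}

print(round(final_grade(example_scores), 2))
```

With these made-up scores the contributions are 16 + 18 + 5 + 42.5 + 5, so Assignments alone accounts for half of the result.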

Exams will be in class (or possibly take-home). If an exam is take-home, the time to complete it will be no more than 2 hours. Exams are closed book, closed notes, with no electronic devices unless otherwise instructed.

Many assignments will be done in small groups (2-3) and will finish with a 10-15 minute presentation in class. The report will be a significant portion of the assignment grade. Moreover, the grade will vary depending on the quality of the presentation. That is, an average presentation will not change the grade. An outstanding presentation could improve the project grade a lot. Conversely, a poor presentation will significantly reduce the grade.

Incomplete grades will be given only for verifiable medical illness or other such dire circumstances.

ALL work submitted for grading should be entirely YOUR OWN (or that of a group if you are working in a group). Sharing of programs, code snippets, etc. is not permitted under ANY circumstances.

Submission, Late Policy, and Making Up Past Work


No assignment will be accepted after it is past due.

No past work can be "made up" after it is due.

No regrade requests will be entertained more than one week after the graded work is returned in class.

Exams

There will be two exams in this course. The exams will be closed-book and closed-notes (unless otherwise instructed). The exams will cover material from lectures, homework, and assigned readings.

Study Groups

I encourage you to discuss the material and work together to understand it. Here are some thoughts on collaborating with other students:

If you have any questions as to what types of collaborations are allowed, please feel free to ask.


Links


Learning Accommodations

Students requesting accommodations in this course because of the impact of disability are encouraged to meet with me privately early in the semester with a verification letter. Students not yet approved to receive accommodations should also contact Deb Alder, Coordinator of Accessibility Services, at 610-526-7351 in Guild Hall, as soon as possible, to verify their eligibility for reasonable accommodations. Early contact will help avoid unnecessary inconvenience and delays.

This class may be recorded.


Created in January 2020. Subject to constant revision.