Bryn Mawr College
CS 325: Computational Linguistics - Fall 2024
Assignment #4
Due before class on Wednesday, November 6

Description:

First, if you have not yet done so, complete the Stochastic Tagging Lab.

Part 1. Using the tagged Brown Corpus in NLTK do the following:

Part 2. Using the tagging methods in NLTK presented in class:

Notes

  1. You can use the NLTK tokenizers if needed. For stochastic taggers, it is a good idea to tag one sentence at a time (i.e., sentence boundaries are treated as new contexts). If necessary, you may want to combine hand and program-based tokenization, especially for the small test texts provided. Run your program on the texts provided.
  2. Work incrementally to accomplish the task.
  3. Try to document your thought process at each step.
  4. Once done, summarize the process by which you arrived at the final solution in the Report section of your Colab Notebook.
  5. The Summary section should also contain the outcome of your analyses of outputs as specified above. Finally, conclude the section with your own reflections on the exercise, the process, and how you arrived at the solution(s).
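
As an illustration of Note 1, here is a minimal sketch of tagging one sentence at a time, so that an n-gram tagger never carries context across a sentence boundary. It is not the required solution. The toy training data, the regex-based sentence splitter and tokenizer, and the helper names (split_sentences, tokenize, build_tagger, tag_text) are all assumptions made for this example; in the assignment you would train on the tagged Brown Corpus and may prefer the NLTK tokenizers.

```python
import re
from nltk.tag import DefaultTagger, UnigramTagger, BigramTagger

# Toy hand-tagged training sentences (hypothetical stand-in for the
# tagged Brown Corpus, so the sketch runs without downloading any data).
TRAIN = [
    [("the", "AT"), ("dog", "NN"), ("barks", "VBZ"), (".", ".")],
    [("the", "AT"), ("cat", "NN"), ("sleeps", "VBZ"), (".", ".")],
]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Naive tokenizer: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def build_tagger(train_sents):
    """Bigram tagger backing off to a unigram tagger, then to a default tag."""
    t0 = DefaultTagger("NN")
    t1 = UnigramTagger(train_sents, backoff=t0)
    return BigramTagger(train_sents, backoff=t1)


def tag_text(text, tagger):
    """Tag each sentence separately, so every sentence starts a new context."""
    return [tagger.tag(tokenize(s)) for s in split_sentences(text)]


tagger = build_tagger(TRAIN)
tagged = tag_text("the dog sleeps. the cat barks.", tagger)
for sent in tagged:
    print(sent)
```

Because tag_text calls tagger.tag once per sentence, the bigram context at the start of each sentence is empty rather than the tag of the previous sentence's final period. The naive regex splitter will mishandle abbreviations such as "Dr.", which is one reason hand correction may be needed on the small test texts.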

WHAT TO HAND IN

Once you have finished, share the link to your Notebook with the instructor by e-mail. To do this, click the "Share" button at the top right of the window. In the pop-up window, change the access to "Anyone with link", then copy the link and paste it into the e-mail.
