Bryn Mawr College
CS 325: Computational Linguistics - Fall 2024
Assignment #5
Due before class on Monday, December 9

Tutorial on Parsing in NLTK: Click here.

Description: Write a context-free grammar (CFG) that defines (and accepts) the following set of sentences. The domain is simple sentences about animals and their properties. The sentences fall into two classes, declarative and WH-type: each declarative sentence states a fact, and each WH-type sentence poses a query.

1. A fish is an animal.
2. A bird is an animal.
3. A fish has gills.
4. A fish swims.
5. A bird flies.
Q1. What is a fish?
Q2. Is a fish an animal?
Q3. Does a fish swim?
Q4. Does a bird have gills?
Q5. Does a bird fly?

Other sentences in this domain are listed here.

Next, using a parsing scheme of your choice that is already implemented in NLTK (see tutorial posted above), instantiate a parser that is able to parse all of these sentences.
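As a minimal sketch of what instantiating a parser can look like, the example below builds a chart parser over a deliberately tiny grammar that covers only sentences 4 and 5. The nonterminal names (S, NP, VP, Det, Noun, Verb) and the choice of nltk.ChartParser are illustrative assumptions, not a required design; any parser implemented in NLTK may be used. Note that nltk.CFG.fromstring expects terminals to be written in quotes.

```python
import nltk

# Deliberately tiny grammar: it covers only "a fish swims" and "a bird flies".
# The nonterminal names are illustrative choices, not part of the assignment.
grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> Det Noun
    VP -> Verb
    Det -> 'a'
    Noun -> 'fish' | 'bird'
    Verb -> 'swims' | 'flies'
""")

# A chart parser is one of several parsing schemes implemented in NLTK.
parser = nltk.ChartParser(grammar)

# Tokens must match the grammar's terminals exactly (here: lowercase, no period).
tokens = "a fish swims".split()
trees = list(parser.parse(tokens))

for tree in trees:
    print(tree)
```

The parser yields one tree per distinct analysis, so a sentence that the grammar accepts unambiguously produces exactly one tree rooted at S.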

Notes

  1. You will need to tokenize each sentence. For this exercise, it is not necessary to use a tagger to find the part-of-speech (POS) tag of each word. Instead, you can define lexical productions such as:

    non-terminal -> terminal
    Noun -> animal | fish | ...

  2. Work incrementally to accomplish the task.
  3. Write the grammar first and then do the work on the computer after you are convinced that your grammar is correct.
  4. Show a sample run of your program with the result of parsing all the sentences shown above (1-5 and Q1-Q5). Create three additional sentences that your grammar accepts and show their output. Show at least three sentences that your grammar rejects. State (Yes/No) whether your grammar recognizes all of the sentences in the extended test suite; you do not need to include that output. Finally, write a short section with your own reflections on the exercise, the process, and how you arrived at your solution(s).
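The notes above (tokenizing without a tagger, and reporting accepted and rejected sentences) could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names tokenize and accepts are hypothetical, the regular-expression tokenizer is one simple option among many, and the grammar is a small fragment covering sentences 1 to 3, not a full solution.

```python
import re
import nltk

# Small illustrative fragment (covers sentences 1-3 only). POS categories are
# written directly as lexical productions, so no tagger is needed.
grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> Det Noun | Noun
    VP -> Verb NP
    Det -> 'a' | 'an'
    Noun -> 'fish' | 'bird' | 'animal' | 'gills'
    Verb -> 'is' | 'has'
""")
parser = nltk.ChartParser(grammar)


def tokenize(sentence):
    """Hypothetical helper: lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", sentence.lower())


def accepts(sentence):
    """Hypothetical helper: True if the grammar yields at least one parse."""
    tokens = tokenize(sentence)
    try:
        return next(iter(parser.parse(tokens)), None) is not None
    except ValueError:
        # NLTK raises ValueError when a token is not covered by the grammar.
        return False


test_sentences = [
    "A fish is an animal.",
    "A fish has gills.",
    "Does a fish swim?",  # words outside this fragment's lexicon
    "Fish a is.",         # known words, but no valid derivation
]

for sentence in test_sentences:
    status = "accepted" if accepts(sentence) else "rejected"
    print(f"{status}: {sentence}")
```

A fragment like this also accepts unintended strings such as "gills has a fish", which is the kind of overgeneration worth looking for when choosing rejected sentences.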
