CS 206 - Introduction to Data Structures
Assignment 11: Due Thursday April 30, 11:59pm
Using Hashtables to Count Words
In a collection of Dickens novels there are 82136 unique words. This assignment is to write a Java program to do the following:
- Confirm that there are indeed 82136 unique words
- Compare the time required to determine this number between two techniques:
  - Keep up-to-date word counts in an unordered ArrayList
  - Keep up-to-date word counts in a hashtable (using the Java HashMap class)
  - When doing the time comparisons, you might also consider how long the BookReader class requires to simply read the collection, since this time is a constant overhead on both techniques.
- Count the frequency of appearance of each unique word in the collection. To check your work, here are some sample counts:
the 144171
oliver 747
david 92
like 4199
and 101390
From these counts you could correctly conclude that the collection includes Oliver Twist but not David Copperfield.
- Print a list of the top Q most common words in the collection along with their frequencies, where Q is some number between 1 and 82136. (Work on this requirement only after completing all other requirements.)
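As a rough illustration of the two counting techniques named in the list above, here is a minimal sketch. It is not a required design: the class name CountingSketch, the Entry helper, and the small in-memory word array are all hypothetical stand-ins, and a real solution would take its words from BookReader instead. Timing uses System.nanoTime(), which is one reasonable choice among several.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CountingSketch {
    /** Hypothetical (word, count) pair used by the ArrayList technique. */
    public static class Entry {
        final String word;
        int count;
        Entry(String word) { this.word = word; this.count = 1; }
    }

    /** Technique 1: unordered ArrayList. Every update scans the list linearly. */
    public static List<Entry> countWithList(String[] words) {
        List<Entry> entries = new ArrayList<>();
        for (String w : words) {
            boolean found = false;
            for (Entry e : entries) {
                if (e.word.equals(w)) { e.count++; found = true; break; }
            }
            if (!found) entries.add(new Entry(w));
        }
        return entries;
    }

    /** Technique 2: HashMap. Every update is an expected constant-time lookup. */
    public static Map<String, Integer> countWithMap(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in data only; the assignment reads its words from BookReader.
        String[] words = {"the", "oliver", "the", "and", "like", "the", "and"};

        long t0 = System.nanoTime();
        List<Entry> list = countWithList(words);
        long t1 = System.nanoTime();
        Map<String, Integer> map = countWithMap(words);
        long t2 = System.nanoTime();

        System.out.println("ArrayList unique words: " + list.size() + " in " + (t1 - t0) + " ns");
        System.out.println("HashMap unique words: " + map.size() + " in " + (t2 - t1) + " ns");
    }
}
```

On an input this small the two timings are not meaningful; the difference should only become visible on the full collection, where the list scan grows with the number of unique words seen so far.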
Get the files BookReader.java and dickens.txt. (These are also available at /home/gtowell/Public206/a11.) The latter is the collection of several of Dickens' works. I claim that this file has 82136 unique words. The former is code to read the file dickens.txt and give you one word at a time (the "nextWord()" method).
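To estimate the constant read overhead on its own, one option is to time a loop that pulls every word and does nothing with it. The sketch below does not use the real BookReader, whose constructor and end-of-input behavior are not described here. Instead it uses a hypothetical WordSource interface whose nextWord() is assumed to return null when the words run out; check BookReader.java for what it actually does and adapt the loop accordingly.

```java
import java.util.Arrays;
import java.util.Iterator;

public class ReadBaselineSketch {
    /** Hypothetical stand-in for BookReader. The null-at-end convention is an assumption. */
    public interface WordSource {
        String nextWord();
    }

    /** Hypothetical in-memory source so that this sketch runs without dickens.txt. */
    public static class ArrayWordSource implements WordSource {
        private final Iterator<String> it;
        public ArrayWordSource(String... words) { this.it = Arrays.asList(words).iterator(); }
        @Override public String nextWord() { return it.hasNext() ? it.next() : null; }
    }

    /** Reads every word and discards it; returns how many words were read. */
    public static long drain(WordSource source) {
        long n = 0;
        while (source.nextWord() != null) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        WordSource source = new ArrayWordSource("it", "was", "the", "best", "of", "times");
        long start = System.nanoTime();
        long total = drain(source);
        long elapsed = System.nanoTime() - start;
        System.out.println("read " + total + " words in " + elapsed + " ns");
    }
}
```

Subtracting this baseline from each technique's total time gives a fairer picture of what the data structure itself costs.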
The particulars of your implementation are entirely at your discretion, other than that you must use my BookReader class to read the text file and you must conform to the requirements listed above. In addition, the age-old requirements about comments and exception handling still apply. Similarly, everything should be properly encapsulated.
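One possible reading of "properly encapsulated" is a small class that keeps its map private and exposes only the operations the requirements call for, including the top Q listing. The sketch below is an illustration under that assumption. The class name WordCounter and its method names are hypothetical, and sorting all entries is only one way to find the top Q (a bounded priority queue is another).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCounter {
    // Private so that callers cannot modify the counts directly.
    private final Map<String, Integer> counts = new HashMap<>();

    /** Records one occurrence of the given word. */
    public void add(String word) {
        if (word == null) throw new IllegalArgumentException("word must not be null");
        counts.merge(word, 1, Integer::sum);
    }

    /** Number of distinct words seen so far. */
    public int uniqueWords() { return counts.size(); }

    /** Occurrences of the given word, or 0 if it was never seen. */
    public int countOf(String word) { return counts.getOrDefault(word, 0); }

    /** Up to q lines of the form "word count", most frequent first, ties in alphabetical order. */
    public List<String> topQ(int q) {
        if (q < 1) throw new IllegalArgumentException("q must be at least 1");
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(q, entries.size()); i++) {
            top.add(entries.get(i).getKey() + " " + entries.get(i).getValue());
        }
        return top;
    }

    public static void main(String[] args) {
        WordCounter wc = new WordCounter();
        for (String w : new String[] {"the", "and", "the", "like", "and", "the"}) {
            wc.add(w);
        }
        System.out.println(wc.topQ(2)); // prints [the 3, and 2]
    }
}
```

Sorting costs O(n log n) in the number of unique words, which is small next to the cost of reading the collection, so the simpler approach is likely adequate here.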
Electronic Submissions
Your program will be graded based on how it runs on the
department’s Linux server, not how it runs on your computer.
The submission should include the following items:
- README: This file should follow the format of this sample README (https://cs.brynmawr.edu/cs206/README.txt)
- Within the README file, include a brief (max 1 paragraph) reflection on what went well and/or poorly and why
- Also within
- Source files: Every .java file used in the final version of every part of your project (including the imported files)
- Unique Data files used: If any
The following steps for submission assume you are using VSC, and that you created a project named AssignmentN in the directory /home/YOU/cs206/
- For this assignment N=11
- Put the README file into the project directory
- Go to the directory /home/YOU/cs206
- Enter submit -c 206 -p N -d AssignmentN (for this assignment: submit -c 206 -p 11 -d Assignment11)
For more on using the submit script click here