CS 206 -- Assignment 4

Building a sorted list

Suppose that you have a long, possibly infinite, stream of data coming to you in which the data is like the following:
3 abbeys
1 abbiamo
1 abbotcy
3 abbots
4 abbott
7 abbr
3 abbreviate
3 abbreviated
3 abbreviates
3 abbreviating
4 abbreviation
4 abbreviations
1 abdications
5 abdomen
PRINTWORDSASCENDING
3 abdomens
11 abc
4 abcs
16 abdominal
3 abduct
4 abducted
1 abducting
6 abduction
3 abductions
7 abby
4 abbyy
PRINTCOUNTS
2 abductor
2 abductors
5 abducts
PRINTWORDSDESCENDING

That is, ignoring for a moment the "PRINT..." lines, the data consists of rows and each row has two columns. The first is an integer and the second is a word. You may interpret this data as a word and the number of times the word occurs in some text. The problem is that the words occur in no particular. The integers will repeat many times, but each word occurs only once. Hence the number of integers will be far smaller than the number of words.

The problem is to read and store this data such that, whenever you see a line containing "PRINTWORDS" or "PRINTCOUNTS", you can quickly print out a information concerning all of the data you have seen.

There are three special lines in the dataset.

PRINTWORDSASCENDING
On this command, print out, in ascending order by their count, the list of words you have seen. So, in the data above, when you hit PRINTWORDSACENDING you would print:
1 abbiamo
1 abbotcy
1 abdications
3 abbeys
3 abbots
3 abbreviate
3 abbreviated
3 abbreviates
3 abbreviating
4 abbott
4 abbreviation
4 abbreviations
5 abdomen
7 abbr
PRINTCOUNTS
On the command, print the number of times each frequency has been seen. So, in the data above when you hit PRINTCOUNTS you would print
1 4
3 9
4 6
5 1
6 1
7 2
11 1
16 1
PRINTWORDSDESCENDING
On this command, print out, in descending order by their count, the list of words you have seen.

Your system must satisfy the following constraints:

To aid your debugging, there are several sample datafiles in /home/gtowell/cs206/a4/words? where '?' is a unix-style wildcard. Hand in:
Due:
      March 17, 2005