CS 206: Data Structures
Exercise#15
Due on Thursday, April 10
The file zip.txt contains data about towns and counties in the United States. It is a bit outdated, but will suffice for our purposes. Each line of the file contains the following information:
zip code,Town Name, State, Phone Area Code, Dummy, County, Time Zone, Dummy, Dummy, County Population
where the fields labelled, Dummy, is information we will not be using in this exercise. Shown below, is an example line from the file:
02138,Cambridge ,MA,617,0472130681980,Middlesex ,Eastern ,40 ,2.63 ,1367000
From the above you can see that the line contains information about Cambridge, MA. Its zip code is 02142, It is in Middlesex county whose population is 1367000. It is in the Eastern time zone and has the phone area code (617). In fact, the data file is a Zip Code file Cambridge, MA has seven Zip Code in it and thus there are seven entries in the data file.
Query: Cambridge, MA Found... Town: Cambridge, MA, 02138 (Area Code: 617, Time Zone: Eastern) County: Middlesex, population 1367000
In addition, the program should print out the number of comparisons that had to be made in order to find the town listed. Of course, a user may misspell the name of a town or enter the name of a town that does not exist (i.e. contained in your database). In that case, just print out a message to that effect (still listing the total number of comparisons needed). While comparing, you should be careful about the case-sensitivity of the town/state names. One way to avoid any problems is to convert the strings being compared to all upper or lower case.
Note that you are only printing out the first occurrence of the town querried. Depending upon the size of the town, it may have several zip codes (and hence entries).
For demonstrating the correctness of your program, include a printout that shows the output from following queries:
After obtaining all the answers, compute the average number of comparions needed to answer a query, the average number of comparisions needed to answer a successful query, the average number of comparisons needed to answer an unsuccessful query.
In addition to the output, hand in a prinout of the complete Java program. Your program will be graded based on its correctness as well as design.
Work on your program one piece at a time. Do part #1 first. Start on Part#2 (to answer queries) only after completing Part#1. And then add the statements to count the number of comparisions.
Technical Hints:
1. As opposed to the Weather Data file you used in the last few assignments, the data on each line is separated by a comma (','). The StringTokenizer class can be used to separate tokens based on a provided token separation character. That is:
StringTokenizer tokens = new StringTokenizer(some_String);
will separate tokens by assuming each token is separated by a space (' '). You can, instead issue the command:
StringTokenizer tokens = new StringTokenizer(some_String, ",");
to separate tokens using a comma as a separating character.
2. You will also need to make use of the trim method of the String class. Given a string:
String s1 = " Bryn Mawr "; s1 = s1.trim();
will trim the spaces from the string. Thus s1 will now have the value:
s1 = "Bryn Mawr"