CS 206: Data Structures
Assignment#5

Due on Thursday, March 4, 2010

 

The file zip.txt contains data about towns and counties in the United States. It is a bit outdated, but will suffice for our purposes. Each line of the file contains the following information:

Zip Code,Town Name, State, Phone Area Code, Dummy, County, Time Zone, Dummy, Dummy, County Population

where the fields labelled, Dummy, is information we will not be using in this exercise. Shown below, is an example line from the file:

02138,Cambridge      ,MA,617,0472130681980,Middlesex     ,Eastern ,40 ,2.63 ,1367000

From the above you can see that the line contains information about Cambridge, MA. Its zip code is 02142, It is in Middlesex county whose population is 1367000. It is in the Eastern time zone and has the phone area code (617). The data file is a Zip Code database file. Cambridge, MA has seven Zip Codes in it and thus there are seven entries in the data file (one for each Zip Code) for Cambridge, MA.

  1. Write a program that reads all the data and stores it in a list (define appropriate data abstractions to store each entry). After reading the data file, your program should print out the number of entries in the list. Only include the information described above, ignoring the Dummy fields.
  2. Write (or extend the privious) program that, given a town name and its state, searches the list for the name of the town (the first occurrence only) and state, and then prints out all the information it knows about it, in the following format:
   Query: Cambridge, MA
   Found...
   Town: Cambridge, MA, 02138 (Area Code: 617, Time Zone: Eastern)
   County: Middlesex, population 1367000

A user may misspell the name of a town or enter the name of a town that does not exist (i.e. not contained in your database). In that case, just print out a message to that effect. You should be careful about the case-sensitivity of the town/state names. One way to avoid any problems is to convert the strings being compared to all upper or lower case.

Note that you are only printing out the first occurrence of the town querried. Depending upon the size of the town, it may have several zip codes (and hence entries).

For demonstrating the correctness of your program, include a printout that shows the output from following queries:

  1. Cambridge, MA
  2. Jim Thorpe , PA
  3. Your birth place (if in the US)
  4. Bryn Mawr, PA
  5. Boring, OR
  6. Hell, MI
  7. Walla Walla, WA
  8. Surprise, NY
  9. Joe, MT
  10. Romance, AR
  11. Mars, PA
  12. Nuttsville, VA
  13. Rough and Ready, PA
  14. Dynamite, WA
  15. Good Grief, ID

After obtaining all the answers, compute the average number of comparions needed to answer a query, the average number of comparisions needed to answer a successful query, the average number of comparisons needed to answer an unsuccessful query.

In addition to the output, hand in a prinout of the complete Pyhton program.

Work on your program one piece at a time. Do part #1 first. Start on Part#2 (to answer queries) only after completing Part#1. And then add the statements to count the number of comparisions.

Technical Hints:

1. The data on each line is separated by a comma (','). The split function for strings can split a line of text using any character as a seperator. .

2. You will also need to make use of the rstrip funtion of the string module to remove trailing blank characters in a string. See the Python documentation for usage.

What to Hand-in:

  1. A complete printout of the latest version of your program(s)
  2. a prinout of youyr program showing the outputs from the queries above (as well as some failed queries)

Back to CS206 Course Materials