CS 206 - Introduction to Data Structures

Assignment 8: Due 12:01 AM Thursday Nov 14

(Yes, 1 minute after midnight to avoid bad numbers.
Election day has just passed which means it is less than 1 year until the next presidential election. To celebrate this observation in this assignment we will deal with polling data for candidates in the Democratic party. (All of the data is fairly old.)

We will use this data within a Heap / priority queue.

The data files for this assignment are in /home/gtowell/Public206/a8/dem* You may copy these files as you need for development. However, your program must be able to read these files from this location (or a different location of my choosing) during testing.

Requirements:

  1. Implement an array-based binary tree that implements BinaryTreeInterface.java. /home/gtowell/Public206/a8/BinaryTreeInterface.java
  2. Now implement the PriorityQueue interface (from /home/gtowell/Public206/a8/PriorityQueueInterface.java) as an array-based Heap which extends your array-based Binary Tree. Note that it is recommended to add protected or private helper methods in both the tree and heap as you design deems necessary. Your implementation should be properly encapsulated, i.e. no implementation details should be made visible outside of the heap or tree classes.

Getting the top candidates

While a heap is usually required only to return the maximum (or minimum) element, since this will be used to store polling data, it may be interesting to us to retrieve the top few candidates.

Requirements:

  1. Add a method ArrayList<E> peekTopN(int n) that returns the top elements of the heap in order. The heap should not be any different after the method was called than it was before the method was called, i.e., this is similar to peek in that it does not remove the top element. You should not implement this method by removing and then reinserting each element, as this has the potential to modify the heap.
  2. Describe your design of the peekTopN method in your README file and give a big-O analysis.

Command Line Inputs

Your program must be able to take filenames that store polling data as arguments to your main method. Your resulting heap should contain the polling data for each candidate from the most recent date for which there is data from the files given on the command line. The resulting heap should be ordered so that the candidate with the highest percentage of voters in the most recent poll is at the top of the heap. Additionally, you will add an option for the users to provide a flag that will run your peekTopN method to determine and print out the top N candidates. The resulting arguments you should handle will look like this:
 -n 5 dempres_20190210_1.csv dempres_20190210_2.csv
In the above case, the top 5 candidates would be displayed.

Requirements:

  1. Take filename input from the command line into the main method of your Main.java. You may be given multiple filenames. Process an optional flag to print out the top N candidates. These should be printed out like this (where the example given is for N = 2):
         Top 2 Candidates:
         Joseph R. Biden Jr.:29.0
         Kamala D. Harris:15.0
    
    The above printout should happen once after all polling data has been read and inserted into your data structure.
  2. In the absence of the "-n" flag, you should program should assume that n is 1.
  3. You many assume that the list of filenames is always last, i.e. the first non-flag argument you encounter is assumed to be the beginning of the list of file names. Make sure you error-check thoroughly, both the arguments to the flags and the flags themselves. Your program should behave rationally no matter how unreasonable the input or the value of flags. (This should be fairly easy as there is only one flag.)
  4. After each polling file is inserted you should print out the heap (in order).

Extra Credit

All extra credit should only be done after successful completion of all of the base requirements for this assignment. No extra credit will be given on incomplete assignments. The number of points awarded for extra credit will be smaller than those for completion of the base requirements and the extra credit is designed to be harder than those basic requirements as well. You may choose which of the extra credit options below to pursue and can receive credit for some and not others where that makes sense. In the case that you implement ANY extra credit, you must list them in your README so that I will know to test for them and grant credit accordingly.
  1. Handle tied ranks appropriately for the peekTopN method. For example, in the case where there are two candidates who are tied for the best, peekTopN for n = 1 should print out both of those candidates.
  2. Obtain more recent polling data and format it so that matches the format of the existing data. For each polling data instance, in your readme file you must indicate where you obtained the data and from what source that data was obtained. A complete URL is necessary, but is not sufficient. You will receive credit for up to 5 polling data instances. However, credit is on a sliding scale so the fifth is worth far less than the first. Also, in case multiple people submit the same polling data, I reserve the right to divide points equally among the submitters. That is, if 1 person submits a polling data instance, it might be worth M points. If N people submit the same poll, then each person would receive M/N points.