CS 246 Homework 3:  Files and Classes

You MUST work with one other person on this assignment.

Program Design Due:  In hard copy by Friday, February 29, 2013
Full Assignment Due:  Wednesday, March 6, 2013
 Thursday, March 7, 2013
 
Please read the homework guidelines before you proceed.

The purpose of this assignment is to get you comfortable with: (1) creating simple object-oriented data structures using classes, and (2) reading data from files.


Project Description

Write a C++ program that reads and analyzes the frequency of names from files from the U.S. Social Security Administration.  Your program should use proper object-oriented design throughout the program.


Input

Your program will read the data from files where each line is in the following format:

rank Male_name Male_number Female_name Female_number

where,

rank              The ranking of the names on this line
Male_name         A male name of this rank
Male_number       Number of males with this name
Female_name       A female name of this rank
Female_number     Number of females with this name

This is the format of database files obtained from the U.S. Social Security Administration of the top 1000 registered baby names. Each line begins with the rank, followed by the male name at that rank, followed by the number of males with that name, etc. Here is an example file containing data from the year 2002:

1 Jacob 30122 Emily 24262
2 Michael 28119 Madison 21546
3 Joshua 25859 Hannah 18559
4 Matthew 24831 Emma 16324
5 Ethan 21949 Alexis 15411
6 Joseph 21766 Ashley 15217
7 Andrew 21696 Abigail 15155
8 Christopher 21676 Sarah 14564
9 Daniel 21186 Samantha 14540
10 Nicholas 21148 Olivia 14481
...
996 Edgardo 158 Jazmyne 222
997 Garett 158 Libby 222
998 Gerard 158 Nyasia 222
999 Ryley 158 Kari 221
1000 Braulio 157 Keeley 221

As you can see from the above, in 2002 there were 30,122 male babies named Jacob and 24,262 female babies named Emily, making them the most popular names that year. Similarly, going down the list, we see that there were 221 newborn females named Kari, making it the 999th most popular female name.
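Because every line holds the same five whitespace-separated fields, one way to read it is with a string stream. The following is a minimal sketch only; the names NameRecord and parseLine are hypothetical and are not required by the assignment.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// One line of a names file. The struct and its field names are
// illustrative assumptions, not part of the assignment.
struct NameRecord {
    int rank = 0;
    std::string maleName;
    long maleCount = 0;
    std::string femaleName;
    long femaleCount = 0;
};

// Parses "rank Male_name Male_number Female_name Female_number".
// Returns true and fills `rec` only if all five fields were read;
// returns false for lines such as "..." or blank lines.
bool parseLine(const std::string& line, NameRecord& rec) {
    std::istringstream in(line);
    NameRecord tmp;
    if (in >> tmp.rank >> tmp.maleName >> tmp.maleCount
           >> tmp.femaleName >> tmp.femaleCount) {
        rec = tmp;
        return true;
    }
    return false;
}
```

Reading a whole file could then loop over std::getline on a std::ifstream and call parseLine on each line, skipping lines for which it returns false.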

Output

Your program should output the number of times the following names appear in each database file, as well as in total: George (male and female), Jennifer (female), Mary (female), Mercedes (female), Precious (female), Robert (male), and YOUR_NAME. There are three statistics you need to obtain for each name: the rank, the number of times used, and the percentage of all babies of that sex who were given the name.
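The percentage can be computed from a name's count and a total. The sketch below assumes that the total is the sum of the counts for that sex over all lines of one file; the assignment does not spell out the denominator, so treat that as an assumption. The function names percentOf and formatPercent are hypothetical.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Percentage of babies given a name: `count` out of `total`.
// Assumption: `total` is the sum of that sex's counts in one file.
double percentOf(long count, long total) {
    assert(count >= 0 && total >= 0);
    if (total == 0) {
        return 0.0;  // avoid dividing by zero for an empty file
    }
    return 100.0 * static_cast<double>(count) / static_cast<double>(total);
}

// Formats a percentage with four digits after the decimal point,
// matching the style of the example table entry.
std::string formatPercent(double pct) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(4) << pct;
    return out.str();
}
```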

Run the program on all the files in the Names directory, which can be found in /home/eeaton/public/cs246/babynames/, and fill out the following table. An example entry is filled in: of the names given to babies in 1900, Mary ranked first among female names, with 24,455 female babies (5.6205% of the total) named Mary.

 

 
                        1900                      1910                      ...   Total
Name         Sex        rank   number   %         rank   number   %               rank   number   %
George       female
George       male
Jennifer     female
Mary         female     1      24455    5.6205
Mercedes     female
Precious     female
Robert       male
_YOUR_NAME_

 


Command Line Arguments

Your program must take the following command line arguments:

-m <listOfMaleNames> -f <listOfFemaleNames> filename1 filename2 filename3 ...
The first four arguments (-m <listOfMaleNames> -f <listOfFemaleNames>) are optional, and can be omitted.  These arguments tell your program the names for which statistics should be output; the names within each list should be separated by commas.  For example,
-m George,Robert -f George,Jennifer,Mary,Mercedes,Precious,YOUR_NAME names1900 names1910 names1920 ...
would output the table above.  Your program should also work for only male or only female names (omitting the other arguments), and should accept the -m and -f arguments in either order.  For example,
-f George,Jennifer,Mary,Mercedes,Precious,YOUR_NAME names1900 names1910 names1920 ...
-f George,Jennifer,Mary,Mercedes,Precious,YOUR_NAME -m George,Robert names1900 names1910 names1920 ...
would work fine.
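One possible way to handle these arguments is sketched below. It is not the required design: the Options struct and the functions splitCommas and parseArgs are hypothetical names, and the sketch assumes that any argument other than -m, -f, or their lists is a file name.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Holds the parsed command line. All names here are illustrative.
struct Options {
    std::vector<std::string> maleNames;
    std::vector<std::string> femaleNames;
    std::vector<std::string> files;
};

// Splits "George,Robert" into {"George", "Robert"}, skipping empty items.
std::vector<std::string> splitCommas(const std::string& list) {
    std::vector<std::string> out;
    std::istringstream in(list);
    std::string item;
    while (std::getline(in, item, ',')) {
        if (!item.empty()) {
            out.push_back(item);
        }
    }
    return out;
}

// Parses the arguments (program name excluded). -m and -f may appear
// in either order or not at all. Returns false if -m or -f is the
// last argument and therefore has no list following it.
bool parseArgs(const std::vector<std::string>& args, Options& opts) {
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] == "-m" || args[i] == "-f") {
            if (i + 1 >= args.size()) {
                return false;
            }
            if (args[i] == "-m") {
                opts.maleNames = splitCommas(args[i + 1]);
            } else {
                opts.femaleNames = splitCommas(args[i + 1]);
            }
            ++i;  // skip the list that was just consumed
        } else {
            opts.files.push_back(args[i]);
        }
    }
    return true;
}
```

In main(), the arguments could be collected with std::vector<std::string> args(argv + 1, argv + argc); before calling parseArgs.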


Requirements

Your program must accept the command line arguments as described above.


Hints


Program Design

Submit a brief program design, showing the header files for each data structure you will use in your assignment.  Provide the function prototypes for all functions and class methods you will implement.  Also provide a fleshed-out main() function.
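As an illustration of the level of detail a design header might show, here is a sketch of one possible data structure. It is a hypothetical example, not the expected design: NameStats and NameTable are invented names, and the sketch assumes each name appears at most once per sex within a file.

```cpp
#include <cassert>
#include <map>
#include <string>

// Statistics for one name within one file.
struct NameStats {
    int rank = 0;
    long count = 0;
};

// Holds the names of one sex from one file (hypothetical design).
class NameTable {
public:
    // Records a name with its rank and count, and adds the count to
    // the running total. Assumes each name is added only once.
    void add(const std::string& name, int rank, long count) {
        assert(count >= 0);
        NameStats s;
        s.rank = rank;
        s.count = count;
        stats_[name] = s;
        total_ += count;
    }

    // Looks up a name. Returns false, leaving `out` untouched,
    // if the name is not in the table.
    bool find(const std::string& name, NameStats& out) const {
        std::map<std::string, NameStats>::const_iterator it = stats_.find(name);
        if (it == stats_.end()) {
            return false;
        }
        out = it->second;
        return true;
    }

    // Sum of the counts of all names added so far.
    long total() const { return total_; }

private:
    std::map<std::string, NameStats> stats_;
    long total_ = 0;
};
```

In an actual submission the class declaration would go in a header file and the method bodies in a matching .cpp file; they are shown together here only to keep the sketch short.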


Final Program

Submit one assignment for all partners, following the assignment guidelines and submission instructions. 

Be sure to list all partners' names in your README file.  Also in the README file, discuss how and why your final implementation differs from your program design, if at all.

Provide a working Makefile for your assignment.  The Makefile should support three commands:  make, make run, and make clean.  I must be able to:

  1. Uncompress your submission
  2. "cd" into the directory
  3. Type "make" and have your program compile itself.
  4. Type "make run" and have it output the table above.

Put all source code, the data files, the README, and Makefile into a single .tar.gz file.

Additionally, each partner should also submit an independent and private brief statement via e-mail to Eric that describes each partner's contribution to the project.  Use the subject "CS246 HW3 Partner Work Distribution".


Grading

Your program will be graded on modularity, organization, and efficiency, in addition to functionality.