CMSC 110 (Introduction to Computing)
Spring 2016
Assignment #7
Part 1: Due by 11:59 pm on Sunday, April 24, 2016
Part 2: Due by 5:00 pm on Friday, April 29, 2016
These are FIRM deadlines.
Task: Identify a dataset of interest to you and develop a narrative
visualization using the process outlined below. Follow the steps of
acquiring, cleaning, filtering, mining, representation, and (optionally)
interaction to create a visual sketch of the data.
Part#1: Identify the dataset. Acquire it, clean it, and load it into Processing.
Part#2: Develop the narrative visualization.
Steps:
- Acquire the data set as one or more files.
- Data sources are plentiful: websites, technical articles, or data you collect yourself.
Below I have given some links to data repositories, but please do not restrict yourself to these.
- Find something that interests you. Find something that has a story to tell.
- Make sure the data set is not too small; it should be large enough to be statistically meaningful. This means at least a few hundred data items, and preferably more.
- Clean up the data file so that it is readable by a computer program.
- This may mean replacing comments with special numeric codes, inserting/removing data
value delimiters, etc.
- Filter the data down to the portion that interests you.
- Remove unwanted columns, headers, footers, etc.
- Mine the data set for interesting properties.
- Find the aspect(s) of your data set that you want to highlight using your
visualization.
- Apply any statistical methods or numerical analysis that are appropriate.
- You might want to add columns that represent some combination of the original columns (average, weighted sum, etc.).
- Select a visual representation that best illustrates your data set and
implement it.
- Draw from all the graphical techniques that you have learned this semester.
- Refine your visualization.
- Modify your program until it communicates your message at a glance.
- Extra Credit: Make your visualization interactive and/or animated.
- Examples include a popup/mouse-over that shows extra information when hovering over an
object, animated objects that change shape, size or color to represent data in a
time series, etc.
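As a rough illustration of the clean, filter, and mine steps above, here is a minimal plain-Java sketch that could be adapted to a Processing program. Everything specific in it is an assumption made for the example and not part of the assignment: the comma delimiter, the `MISSING` sentinel value, the sample rows, and the names `CleanFilterMine`, `cleanCell`, `loadColumn`, and `mean`. In Processing, the `lines` array would typically come from `loadStrings()`.

```java
import java.util.ArrayList;
import java.util.List;

class CleanFilterMine {
    /** Hypothetical numeric code that stands in for comments such as "n/a". */
    static final double MISSING = -999.0;

    /** Clean one cell: trim whitespace and replace non-numeric text with the sentinel. */
    static double cleanCell(String cell) {
        try {
            return Double.parseDouble(cell.trim());
        } catch (NumberFormatException e) {
            return MISSING;
        }
    }

    /**
     * Filter: skip the header row (index 0), keep a single column,
     * and drop rows whose value is missing.
     */
    static List<Double> loadColumn(String[] lines, int column) {
        List<Double> values = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            String[] cells = lines[i].split(",");
            if (column >= cells.length) {
                continue; // row is too short to contain the column
            }
            double v = cleanCell(cells[column]);
            if (v != MISSING) {
                values.add(v);
            }
        }
        return values;
    }

    /** Mine: a simple statistic over the filtered values (NaN when there are none). */
    static double mean(List<Double> values) {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    public static void main(String[] args) {
        // Made-up sample rows; a real data set would be read from a file.
        String[] lines = {
            "city,temp",
            "Philadelphia,12.5",
            "Boston,n/a",
            "Chicago, 9.5"
        };
        List<Double> temps = loadColumn(lines, 1);
        System.out.println(temps.size() + " usable rows, mean = " + mean(temps));
    }
}
```

Whether to drop rows with missing values or keep them under a special code depends on the story you want the data to tell; record the choice in your README either way.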
Websites for data sets and data source lists:
Requirements:
- Data items should be modeled with custom classes
- Final visualization must represent the dataset in a meaningful way that yields insight
- Code includes proper header and adequate comments - comment your class fields, methods and method parameters!
- Drawing must scale properly regardless of the size of the sketch.
- Sign your work.
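The first and fourth requirements above (custom classes and drawing that scales with the sketch) could be combined along the following lines. This is only a sketch under assumptions: the class name `DataPoint`, its fields, the 10% margin, and the helper names `rescale` and `barHeight` are hypothetical. The `rescale` helper uses the same linear formula as Processing's `map()` function, so in a Processing program you could call `map()` with `height` instead.

```java
/** One data item: a label and a numeric value (hypothetical fields). */
class DataPoint {
    /** Name shown next to the item, e.g. a city or a year. */
    final String label;
    /** The measurement to be drawn. */
    final double value;

    /**
     * @param label text label for this item
     * @param value numeric value for this item
     */
    DataPoint(String label, double value) {
        this.label = label;
        this.value = value;
    }

    /**
     * Linearly rescale v from the range [inLo, inHi] to [outLo, outHi].
     * Assumes inLo != inHi.
     */
    static double rescale(double v, double inLo, double inHi,
                          double outLo, double outHi) {
        return outLo + (outHi - outLo) * (v - inLo) / (inHi - inLo);
    }

    /**
     * Bar height in pixels, computed from the sketch height rather than
     * from hard-coded pixel values, so the drawing scales with the sketch.
     *
     * @param maxValue     largest value in the data set (must be > 0)
     * @param sketchHeight current height of the sketch in pixels
     */
    double barHeight(double maxValue, int sketchHeight) {
        double margin = 0.1 * sketchHeight; // assumed 10% margin top and bottom
        return rescale(value, 0, maxValue, 0, sketchHeight - 2 * margin);
    }

    public static void main(String[] args) {
        DataPoint p = new DataPoint("sample", 50);
        // The same item drawn in two sketch sizes keeps its proportions.
        System.out.println(p.barHeight(100, 400));
        System.out.println(p.barHeight(100, 800));
    }
}
```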
Extra Credit (up to 20 points):
- The visualization should be interactive and/or animated
- The interaction/animation should help to tell the story you are
telling with your visualization.
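One common way to support a mouse-over is a hit test that checks whether the pointer lies inside a drawn object. The following is a minimal plain-Java sketch, assuming each data item is drawn as a circle; the names `HoverTest`, `overCircle`, and `hoveredIndex` are hypothetical. In Processing, the pointer position would come from the built-in `mouseX` and `mouseY` variables.

```java
class HoverTest {
    /** True when the point (px, py) lies inside or on the circle at (cx, cy) with radius r. */
    static boolean overCircle(double px, double py,
                              double cx, double cy, double r) {
        double dx = px - cx;
        double dy = py - cy;
        // Compare squared distances to avoid an unnecessary square root.
        return dx * dx + dy * dy <= r * r;
    }

    /**
     * Index of the first circle under the pointer, or -1 if none is hit.
     * The three arrays are assumed to have the same length.
     */
    static int hoveredIndex(double px, double py,
                            double[] xs, double[] ys, double[] radii) {
        for (int i = 0; i < xs.length; i++) {
            if (overCircle(px, py, xs[i], ys[i], radii[i])) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Made-up circle positions for illustration.
        double[] xs = {100, 300};
        double[] ys = {100, 100};
        double[] radii = {20, 40};
        System.out.println(hoveredIndex(310, 120, xs, ys, radii));
        System.out.println(hoveredIndex(200, 100, xs, ys, radii));
    }
}
```

When the returned index is not -1, the sketch can draw extra information (for example, the item's label and value) next to the pointer.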
What to produce:
For Part#1: (By 11:59pm April 24)
- The program folder. This should contain the results of the beginning steps (1-4) of your visualization project. It should include:
- The original uncleaned data set
- A README file (a text file named README.txt) that explains where and how you obtained the original data set (a link to a website, your Chemistry class, etc.) and the format of the data set (meaning and type of each column, number of rows, etc.)
- The data set after it has been cleaned. Explain the format of the cleaned data set in the README file as well, if you add or remove columns or change the format in any way.
- A Processing program that loads the data into appropriate data structures and performs any necessary computation for data mining, but does not visualize it yet. Also include a brief description of the dataset as comments at the top of your source code file.
- A write-up with your name, course and assignment number and a brief description of the dataset, its relevance, and how you plan to visualize it. In addition, include a discussion on what type of clean-up and processing you performed in order to prepare it for visualization. Include a brief discussion about your experience working on this assignment as well. It should be a separate document and left in the sketch folder.
What to hand in:
- Submit an electronic copy of the entire sketch folder (including any images that you may have used), and the writeup via Dropbox shared folder as usual. Note that no hard-copy submission is required at this time, and no screenshots.
What to produce:
For Part#2: (By 5pm April 29)
- Make sure to name your final sketch file/folder properly so that it doesn't overwrite the Part#1 submission.
- The final program should include the standard header. In addition, if you do the extra credit, write a paragraph that contains instructions on how to use the interactive component(s) and include it in the header of the main source code file for your sketch.
- A write-up with your name, course and assignment number and a paragraph about the sketch, its inspiration, and how you designed and implemented it. In addition, describe in detail the data set, its relevance and your visualization, including what you set out to achieve and how and why you chose the visual representation you did.
- A gif/jpg/png image of your sketch. Place it in the sketch folder.
What to hand in:
- Submit a hard copy of items 2, 3 and 4. This hard copy can be handed in to Tina Fasbinder in Park 357 (Tina leaves at 4:30pm), or left in the plastic bin outside of my office door, if you feel comfortable leaving it there until Monday morning.
- Submit an electronic copy of the entire sketch folder (including any images that you may have used), the screenshot and the writeup via Dropbox shared folder as usual.
Hints:
- Keep it simple at first. Start with something basic, get it
working, and then build upon it piece by piece, each time ensuring it
is working before you move on to the next piece.
- START EARLY!!!