Network Analysis - Assignment 1

Due February 13, 2013 by 11:59pm
Due February 15, 2013 by 11:59pm


You should submit a write-up of your assignment both in pdf form and in hardcopy.  The pdf is due by electronic submission before the due date and time.  The electronic submission timestamp is what counts for determining whether your assignment was submitted on-time.  You may turn the hardcopy printout in up to three days later, but it must be identical to the electronic submission.  You will also submit your code electronically.  Details on the submission process will be posted here shortly.

To submit your assignment, place a copy of your write-up (pdf preferred) and implementation into a directory called cs380hw1_lastname (substituting in your last name), then tar and gz this directory.  Name the file cs380hw1_lastname.tar.gz.  Submit it to me by attaching it to an e-mail with the subject of "CS380 Assignment 1 Lastname".  (If you've already submitted it using a different subject line or file name, please don't resend it.)  Make sure you give me a hard-copy of your write-up for the assignment.



1.)  [30 points] Bipartite Graphs             

Start by reading over Lada Adamic's Pajek tutorial (it is a bit out-dated, but will help you get started).  Pajek is installed on the lab systems.  You can also install it on your own computer (http://pajek.imfm.si/doku.php?id=download); it is runs natively on Windows and also on Linux / OS X under wine.  Pajek is widely used by sociologists, and provides a large number of network analysis methods.  However, it is not extensible.

Download the file actorsandmovies.net and open it in Pajek (File > Network > Read).  This is a 2-mode network containing two classes of nodes, actors and movies.

(a)  Create a 2-mode partition (Net > 2-Mode > Partition).  Now draw the network (Draw > Network+FirstPartition).  The two classes of nodes should be colored differently.  If labels are not shown, add them (Options>Mark Vertices Using>Labels). Apply your favorite layout algorithm via the Layout menu.  Export your layout as a 2D image (Export > 2D > ...) and include it in your write-up, labeled with the layout algorithm you used.

(b) Transform the network into a one-mode network (Net > 2-Mode > 2-Mode to 1-Mode > Rows). Draw the network (Draw > Network). Qualitatively compare the structure of the 2-Mode to the 1-Mode network. Is there a loss of information?

(c) Show the weights on each edge using (Options > Lines > Mark Lines > with Values). What do the values represent?

(d) Compute the unweighted degree of each node (Net > Vector > Centrality > Degree > All). Next draw the network using (Draw > Network+FirstVector). How is the degree represented?

(e) Add the vector value to each vertex (it will be the degree/(max possible degree)) (Options > Mark Vertices Using > Vector Values).  Who are the most important actors using this measure?

(f) How does the boundary of the network (i.e. who is included) affect who is found to be most central?

(g) Load the file actorsandmoviesWithGere.net. It contains one extra actor, Richard Gere. Repeat the above procedure. In the 1-mode network of actors, is there a change in who is most central? What does this tell you about biases and boundaries in sample selection?

(h) Remove all edges between actors who have co-starred in fewer than 3 movies (Net > Create New Network > Transform > Remove > Lines with value > Lower than). Which actors comprise the central core of this network?





2.)  [10 points] Social Network Analysis

Download and open the file dining‐table_partners.net in Pajek.  

(a)  Snowball Sampling.  Suppose that you are a newcomer to the dinner party, and know only Ella.  You'd like to meet everyone, and so your plan is to have Ella introduce you to her closest friends.  Once you know those friends, you can ask them to introduce you to their closest friends, and so on.  This is a snowball sampling technique.

Highlight the vertices that you will reach using snowball sampling (Net > Create Partition > K‐Neighbors > Output).  Who will you not meet using snowball sampling starting with Ella?


(b) Suppose each guest will share her dish (and any dish that is shared with her), with only her 1st and 2nd closest friends.  Find groups of guests who can all sample each other's dishes (Net > Partition > Components > Strong, and Draw > Network+FirstPartition). 

Which guests will get to sample no other dishes but their own?  (Net > Partitions > Degree > Input)

Display the network of strongly connected components, and include a printout in your writeup. (Operations > Network+Partition > Shrink Network > Partition, and Draw > Network+FirstPartition)




3.)  [60 points]  Graph Search

We commonly use mapping applications (Google / Yahoo maps, GPS units, etc.) to find the shortest route between two locations.  In a programming language of your choice, implement breadth-first, depth-first and Dijkstra's graph search algorithms for finding single-source shortest paths in weighted and unweighted graphs.  You should implement these algorithms by first implementing the general graph search algorithm, and then substituting in various queuing functions to create the different searches.  Each search algorithm should take as input the name of a city on the map, and search for the shortest path to a target city.

In this case, we will be using a map of Romania and searching for the shortest distance from every city to Bucharest.  The data is available in the following files:

For each of the searches, you should track and report the following:

Have your program complete the following table.  In the case where a search method fails to find a path, you should indicate the lack of a result (e.g., with "--" , NIL, or "FAILURE") in the corresponding table entry.

City name #nodes / BFS # nodes / DFS # nodes / Dijkstra's Path cost / BFS Path cost / DFS Path cost / Dijkstra's Runtime / BFS
Runtime / DFS
Runtime / Dijkstra's
Arad             


Bucharest            


...            


Vaslui            


Zerind            


You should also write a brief summary of your findings, comparing the performance of the algorithms, and discussing particular cases where one of the algorithms performs better than the others. (This summary can be anywhere from a few short paragraphs to one or two pages.) This should be a well-written analysis of your results in the form of a mini-report. Also include a printout of your code (formatted as neatly as possible) in the report and the results table of the three search functions applied to each city in Romania (as the first argument) for a path to Bucharest (the goal city), using the table format above as a guideline. To save paper, I recommend that you print your code with two logical pages per sheet (e.g., using mpage -2ol).

Hints:   (you can follow these, or choose to ignore them)



Exercises 1 and 2 were adapted from assignments by Lada Adamic, UMichigan.  Exercise 3 was adapted from assignments by Tim Finin and Marie desJardins.