Network Analysis - Assignment 2

Due February 27, 2013 by 11:59pm

You should submit a write-up of your assignment both in pdf form and in hardcopy.  The pdf is due by electronic submission before the due date and time.  The electronic submission timestamp is what counts for determining whether your assignment was submitted on-time.  You may turn the hardcopy printout in up to three days later, but it must be identical to the electronic submission.  You will also submit your code electronically.

To submit your assignment, place a copy of your write-up (pdf preferred) and implementation into a directory called cs380hw2_lastname (substituting in your last name), then tar and gz this directory.  Name the file cs380hw2_lastname.tar.gz.  Submit it to me by attaching it to an e-mail with the subject of "CS380 Assignment 2 Lastname".  Make sure you give me a hard-copy of your write-up for the assignment.

1.)  [10 points] Exercise 6.5 from Newman            

2.)  [16 points total] Network Centrality

Answer each of these question based on the networks above.  Show your work for partial credit and put a box around your answer.
a.) [4 pts] What is the diameter of the left network?  The right network?
b.) [4 pts] What is the normalized degree centrality of vertices C and F in the left network?
c.) [4 pts] What is the normalized betweenness centrality of vertices C and F in the left network?
d.) [4 pts] Which network has larger centralization?  (left or right)

3.)  [14 points]  Small World Networks

a.)  [6 pts] Which of the following graphs has the highest clustering coefficient? (You don’t need to actually compute it!)  Justify your answer.
small world networks
b.) [8 pts] We introduced two variants of the Watts/Strogatz small world model (edge rewiring and edge addition).  Which model leads to the larger clustering coefficient? The larger average path length?  Justify your answers.

4.)  [60 points]  Network Centrality and Degree Distribution

In this problem you will be identifying important vertices in unknown networks using PageRank and Centrality.

I've provided the adjacency matrices for three graphs (dolphins.adj, imdb_prodco.adj, politicalblogs.adj).  The format for the adjacency matrices is as follows:
    #rows, #cols
    source1, target1, edgeweight1
    source2, target2, edgeweight2
    source3, target3, edgeweight3
The first line of the file specifies the dimensionality (rows and columns) of the matrix (i.e., number of vertices).  Each subsequent line specifies the source vertex ID, the target vertex ID, and the edge weight.  The vertices are identified by their number.

Analyze each of the graphs, providing the following information:

You will need to implement methods for computing the PageRank, normalized degree centrality, and centralization.  You may use a programming language of your choice, but your implementation must read in the data from the .adj files and output the following information (at a minimum): the in-degree and out-degree of each vertex, the PageRank of each vertex, the normalized degree centrality of each vertex, and the centralization of the entire graph.  Given this output, you can use any program you like to compute the histogram (Matlab, Gnuplot, your own program, etc.).  Your implementation can output the values for a histogram plot as well.

How does the in-degree, out-degree, PageRank and normalized degree centrality of a vertex compare to each other?

Extra Credit [up to 15pts]

Compute the normalized betweenness centrality of the vertices as well, and provide:

You are welcome to use your implementation of Dijkstra's algorithm from assignment 1 to compute the geodesic path between various pairs of vertices.  If you want a challenge, you could also implement the Floyd-Warshall algorithm to compute the shortest paths between all pairs of vertices in the network.

Exercises 2 and 3 were adapted from assignments by Qiaozhu Mei, UMichigan.