Hard Deadline: 21.06.2015 23:59 <br >
Please send your reports to mailto:leonid.e.zhukov@gmail.com and mailto:shestakoffandrey@gmail.com with a message subject of the following structure:<br > [HSE Networks 2015] {LastName} {First Name} Project*{Number}*
Support your computations with figures and comments. <br > If you are using IPython Notebook, you may use this file as a starting point for your report.<br > <br >
You are provided with the DBLP dataset (warning, raw data!). It contains coauthorships recorded during $2000$-$2014$. In particular, the file contains $3$ columns: the first two hold the authors' names and the third holds the year of publication. This data maps naturally to an undirected graph in which nodes are authors and edges are coauthorships.
Your task is to construct a supervised link prediction scheme.
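As a starting point, the mapping from coauthorship rows to an undirected graph can be sketched as follows. This is only an illustration under stated assumptions: the sample rows, the function name `build_graph`, and the split year $2010$ are hypothetical, and the real file's delimiter and encoding still have to be checked, since the data is raw.

```python
from collections import defaultdict

# Hypothetical sample rows in the (author, author, year) layout described above.
# The real file's delimiter and encoding are not specified here.
rows = [
    ("A. Author", "B. Author", 2001),
    ("B. Author", "C. Author", 2005),
    ("A. Author", "B. Author", 2012),  # repeated collaboration in a later year
    ("C. Author", "D. Author", 2013),
    ("A. Author", "A. Author", 2010),  # self-loop, as may occur in raw data
]

def build_graph(rows, first_year, last_year):
    """Return an undirected graph as {author: set of coauthors}.

    Only rows with first_year <= year <= last_year are used;
    self-loops are dropped and repeated pairs collapse into one edge.
    """
    adj = defaultdict(set)
    for a, b, year in rows:
        if a == b or not (first_year <= year <= last_year):
            continue
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)

# Assumed split: edges up to 2010 for training, later edges for testing.
train = build_graph(rows, 2000, 2010)
test = build_graph(rows, 2011, 2014)
```

Storing neighbours in sets keeps the graph simple (no multi-edges) and makes neighbourhood intersections cheap later on. If collaboration counts matter for your features, a dictionary of counters would be a reasonable alternative.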
Consider the Flickr dataset (warning, raw data!).
File "users.txt" provides a table of the form userID, enterTimeStamp, additionalInfo...
File "contacts.txt" consists of pairs of userIDs together with the timestamp at which the link was established.
Recall the scoring functions for link prediction. Your task is to compare the performance of each scoring function as follows:
Essentially, for this task you also have to follow guideline points $1$ and $2$ above. The only thing to keep in mind is that the Flickr dataset is a growing dataset. Therefore, consider only nodes that are well represented in both the training and testing intervals (for instance, nodes that have at least $5$ adjacent edges in each of the two intervals).
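The node filter described above, combined with a few scoring functions, might look like the sketch below. Several things here are assumptions rather than part of the task statement: the toy edge list, the interval boundaries, the helper names, the reading that the degree threshold applies to each interval separately, and the choice of common neighbours, Jaccard, and Adamic-Adar as example scores. The threshold is lowered to $2$ only so that the toy data yields a non-empty result.

```python
import math
from collections import defaultdict

def adjacency(edges, t_start, t_end):
    """Undirected adjacency sets for edges with t_start <= timestamp < t_end."""
    adj = defaultdict(set)
    for u, v, ts in edges:
        if u != v and t_start <= ts < t_end:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)

def active_nodes(train, test, min_degree=5):
    """Nodes with at least min_degree neighbours in BOTH intervals."""
    return {n for n in train
            if len(train[n]) >= min_degree and len(test.get(n, ())) >= min_degree}

def common_neighbors(adj, u, v):
    return len(adj.get(u, set()) & adj.get(v, set()))

def jaccard(adj, u, v):
    nu, nv = adj.get(u, set()), adj.get(v, set())
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0

def adamic_adar(adj, u, v):
    shared = adj.get(u, set()) & adj.get(v, set())
    # Guard against log(1) = 0 for degree-1 nodes.
    return sum(1.0 / math.log(len(adj[z])) for z in shared if len(adj[z]) > 1)

# Hypothetical (userID, userID, timestamp) triples; not taken from contacts.txt.
edges = [
    (1, 2, 10), (1, 3, 20), (2, 4, 40), (3, 4, 50),          # training interval
    (1, 4, 150), (1, 5, 160), (4, 5, 170), (2, 5, 180),      # testing interval
]
train = adjacency(edges, 0, 100)
test = adjacency(edges, 100, 200)

nodes = active_nodes(train, test, min_degree=2)  # threshold lowered for the toy data
scores = {
    "common_neighbors": common_neighbors(train, 1, 4),
    "jaccard": jaccard(train, 1, 4),
    "adamic_adar": adamic_adar(train, 1, 4),
}
```

Scores are computed on the training graph only, so that edges from the testing interval serve purely as ground truth when the ranking produced by each scoring function is evaluated.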