Complex Networks
Number of file-ids discovered in a client-side eDonkey measurement
We conduct a measurement of files available in eDonkey as follows. Our client connects to all eDonkey servers it discovers (it knows an initial list of servers and explores the set of all servers reachable from these). Then, every 12 hours, it sends a given set of keyword-based queries to all these servers. In this measurement, the queries were a set of general keywords and specific paedophile keywords.
We ran this measurement for 140 days, which led to the observation of 2 784 583 distinct files. Among these files, 701 857 had a paedophile keyword in their name. The plot above displays the evolution of the number of observed files of each kind during the measurement.
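The curves described above are cumulative counts of distinct file-ids over time. The following is a minimal sketch of that counting step only, not the code used in the measurement. The `Observation` record, the `cumulative_distinct` function, and the placeholder keyword `kw_a` are hypothetical names introduced for illustration, and the sketch assumes that each query answer has already been logged as a (day, file-id, filename) triple.

```python
from collections import namedtuple

# Hypothetical log record: one entry per file returned by a query.
Observation = namedtuple("Observation", ["day", "file_id", "name"])


def cumulative_distinct(observations, flagged_keywords, n_days):
    """Return a list of (day, distinct files so far, distinct flagged files so far).

    A file-id counts as flagged as soon as one of its observed names
    contains one of the given keywords (case-insensitive substring match,
    which is an assumption of this sketch).
    """
    # Group observations by day so each day is processed once, in order.
    by_day = {}
    for obs in observations:
        by_day.setdefault(obs.day, []).append(obs)

    seen_all = set()      # every file-id observed so far
    seen_flagged = set()  # file-ids with at least one matching name
    curve = []
    for day in range(n_days):
        for obs in by_day.get(day, []):
            seen_all.add(obs.file_id)
            lowered = obs.name.lower()
            if any(keyword in lowered for keyword in flagged_keywords):
                seen_flagged.add(obs.file_id)
        curve.append((day, len(seen_all), len(seen_flagged)))
    return curve


if __name__ == "__main__":
    # Toy data with a placeholder keyword; not taken from the measurement.
    log = [
        Observation(0, "a", "holiday song.mp3"),
        Observation(0, "b", "kw_a clip.avi"),
        Observation(1, "a", "holiday song.mp3"),
        Observation(1, "c", "lecture.pdf"),
        Observation(2, "b", "KW_A clip.avi"),
        Observation(2, "d", "other kw_a.avi"),
    ]
    for row in cumulative_distinct(log, {"kw_a"}, n_days=3):
        print(row)
```

Because the counts come from sets of file-ids, a file returned again by a later query does not increase either curve. Only file-ids never seen before do, which is why a curve that keeps rising indicates that new files are still being discovered.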
We clearly continue to discover significant numbers of new files, even after 140 days of measurement. This may indicate that new files continuously appear at a high rate, and/or that the number of files is so large that even such measurements fail to obtain a full list. Note also that the number of files with a paedophile keyword in their name is very large, raising important societal concerns.
Note, however, that filenames may differ significantly from the actual content of files. Also, this measurement does not allow us to deduce the fraction of all files that have a paedophile name. Obtaining such insight is extremely challenging, and is the goal of the Measurement and Analysis of P2P Activity Against Paedophile Content project.