20120326-NodeXL-Twitter hadoop network
The graph represents a network of up to 1000 Twitter users whose recent tweets contained "hadoop". The network was obtained on Monday, 26 March 2012 at 22:31 UTC. There is an edge for each follows relationship. There is an edge for each "replies-to" relationship in a tweet. There is an edge for each "mentions" relationship in a tweet. There is a self-loop edge for each tweet that is not a "replies-to" or "mentions". The earliest tweet in the network was tweeted on Friday, 23 March 2012 at 18:55 UTC. The latest tweet in the network was tweeted on Monday, 26 March 2012 at 19:46 UTC.
The graph is directed.
The graph's vertices were grouped by cluster using the Clauset-Newman-Moore cluster algorithm.
The graph was laid out using the Harel-Koren Fast Multiscale layout algorithm.
The edge colors are based on relationship values. The vertex sizes are based on followers values.
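The edge rules described above can be sketched in a few lines of Python. This is a minimal illustration, not NodeXL's importer: the `Tweet` record, the `build_edges` function, and the relationship labels are hypothetical names chosen for the example, and the sketch assumes one edge per mention and per reply with no de-duplication.

```python
from typing import List, NamedTuple, Optional, Tuple

Edge = Tuple[str, str, str]  # (source, target, relationship)


class Tweet(NamedTuple):
    """Hypothetical, simplified tweet record used only for this sketch."""
    author: str
    reply_to: Optional[str]      # user being replied to, if any
    mentions: Tuple[str, ...]    # users mentioned in the tweet


def build_edges(tweets: List[Tweet], follows: List[Tuple[str, str]]) -> List[Edge]:
    """Build directed edges following the rules stated in the description."""
    # One edge for each follows relationship (follower -> followed).
    edges: List[Edge] = [(src, dst, "Follows") for src, dst in follows]
    for tweet in tweets:
        if tweet.reply_to is not None:
            # One edge for each "replies-to" relationship in a tweet.
            edges.append((tweet.author, tweet.reply_to, "Replies to"))
        for user in tweet.mentions:
            # One edge for each "mentions" relationship in a tweet.
            edges.append((tweet.author, user, "Mentions"))
        if tweet.reply_to is None and not tweet.mentions:
            # A tweet that is neither a reply nor a mention becomes a self-loop.
            edges.append((tweet.author, tweet.author, "Tweet"))
    return edges


sample_tweets = [
    Tweet("a", None, ()),          # plain tweet -> self-loop
    Tweet("b", "a", ()),           # reply -> b to a
    Tweet("c", None, ("a", "b")),  # two mentions -> c to a, c to b
]
sample_edges = build_edges(sample_tweets, follows=[("a", "b")])
```

Because the graph is directed, each tuple keeps its source and target order; edge color in the visualization would then be keyed on the third field.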
Overall Graph Metrics:
Vertices: 1000
Unique Edges: 6078
Edges With Duplicates: 1006
Total Edges: 7084
Self-Loops: 886
Connected Components: 237
Single-Vertex Connected Components: 223
Maximum Vertices in a Connected Component: 747
Maximum Edges in a Connected Component: 6752
Maximum Geodesic Distance (Diameter): 9
Average Geodesic Distance: 3.249811
Graph Density: 0.00584284284284284
Modularity: 0.380253
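The reported metrics can be cross-checked with simple arithmetic. The sketch below uses only the numbers listed above; the variable names are illustrative. It assumes the standard directed-graph density formula, edges / (n * (n - 1)), and it is an assumption (not stated in this description) that NodeXL merges duplicate edges and excludes self-loops before computing density.

```python
# Values copied from the Overall Graph Metrics list above.
vertices = 1000
unique_edges = 6078
edges_with_duplicates = 1006
total_edges = 7084
connected_components = 237
single_vertex_components = 223
largest_component_vertices = 747
graph_density = 0.00584284284284284

# Total Edges should be the sum of the two edge counts.
edges_add_up = unique_edges + edges_with_duplicates == total_edges

# Components that contain more than one vertex.
multi_vertex_components = connected_components - single_vertex_components

# Vertices outside both the singletons and the largest component.
remaining_vertices = vertices - single_vertex_components - largest_component_vertices

# Edge count implied by the density under the directed formula
# density = edges / (n * (n - 1)); an assumed definition, see lead-in.
implied_edges = graph_density * vertices * (vertices - 1)

print(edges_add_up, multi_vertex_components, remaining_vertices, round(implied_edges))
```

Under these assumptions the density corresponds to a whole number of edges that is lower than the Unique Edges count, which is consistent with self-loops and duplicates being left out of the density calculation.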
Top 10 Vertices, Ranked by Betweenness Centrality:
@cloudera
@hackingdata
@mikeolson
@al3xandru
@bigdata
@tlipcon
@infochimps
@allcloudnews
@merv
@twitteross
Top keyword pairs by frequency of mention
V1               V2               WEIGHT
big              data             219
adds             hadoop           120
mapr             adds             100
hadoop           connectors       100
move             highlights       50
amazon           move             49
highlights       hadoop           47
hadoop           hurdles          47
cloud            computing        41
@ulitzer         #cloud           40
#cloud           #cloudexpo       40
#cloudexpo       #cloudcomputing  40
#cloudcomputing  #bigdata         40
#bigdata         @cloudexpo       40
@cloudexpo       @bigdataexpo     40
apache           #hbase           39
data             processing       33
open             source           32
#codemotion      #es              30
definitive       guide            29
More NodeXL network visualizations are here: www.flickr.com/photos/marc_smith/sets/72157622437066929/ and here: www.nodexlgraphgallery.org/Pages/Default.aspx
A gallery of NodeXL network data sets is available here: nodexlgraphgallery.org/Pages/Default.aspx?search=twitter
NodeXL is free and open and available from www.codeplex.com/nodexl
NodeXL is developed by the Social Media Research Foundation (www.smrfoundation.org) - which is dedicated to open tools, open data, and open scholarship.
Donations to support NodeXL are welcome through PayPal: www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_bu...
The book, Analyzing Social Media Networks with NodeXL: Insights from a Connected World, is available from Morgan Kaufmann and from Amazon.