All Photos Tagged hadoop
Each node is configured roughly like this:
- 2 CPUs
- 4 cores per CPU (so 8 cores total)
- 24 GB memory
- 4 x 2 TB hard drives
- Gigabit Ethernet
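As a rough illustration of what those per-node figures add up to, here is a minimal sketch in Python. The replication factor of 3 is an assumption (it is the usual HDFS default, not something stated in the caption), and the function name is made up for this example.

```python
# Per-node figures taken from the list above.
CPUS = 2
CORES_PER_CPU = 4
MEMORY_GB = 24
DRIVES = 4
DRIVE_TB = 2

# Assumption: HDFS-style block replication. 3 is the common default,
# not a value given in the caption.
REPLICATION = 3


def node_summary(replication: int = REPLICATION) -> dict:
    """Derive per-node totals from the hardware list."""
    cores = CPUS * CORES_PER_CPU
    raw_tb = DRIVES * DRIVE_TB
    return {
        "cores": cores,
        "memory_gb_per_core": MEMORY_GB / cores,
        "raw_storage_tb": raw_tb,
        # Raw capacity divided by the number of copies kept of each block.
        "replicated_storage_tb": raw_tb / replication,
    }


print(node_summary())
```

The replicated figure ignores space reserved for the operating system, logs, and intermediate job output, so real usable capacity would be lower.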
Presented by O’Reilly and Cloudera, Strata + Hadoop World focuses on how to put big data, cutting-edge data science, and new business fundamentals to work.
The graph represents a network of up to 1000 Twitter users whose recent tweets contained "hadoop". The network was obtained on Monday, 26 March 2012 at 22:31 UTC. There is an edge for each follows relationship. There is an edge for each "replies-to" relationship in a tweet. There is an edge for each "mentions" relationship in a tweet. There is a self-loop edge for each tweet that is not a "replies-to" or "mentions". The earliest tweet in the network was tweeted on Friday, 23 March 2012 at 18:55 UTC. The latest tweet in the network was tweeted on Monday, 26 March 2012 at 19:46 UTC.
The graph is directed.
The graph's vertices were grouped by cluster using the Clauset-Newman-Moore cluster algorithm.
The graph was laid out using the Harel-Koren Fast Multiscale layout algorithm.
The edge colors are based on relationship values. The vertex sizes are based on followers values.
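The edge rules described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not NodeXL's own code: the `Tweet` class, the `build_edges` function, and the relationship label `"tweet"` for self-loops are all hypothetical, and the sketch assumes the mentions list does not repeat the user being replied to.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Tweet:
    author: str
    reply_to: Optional[str] = None           # user being replied to, if any
    mentions: List[str] = field(default_factory=list)


def build_edges(tweets: List[Tweet],
                follows: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Return directed (source, target, relationship) edges."""
    # One edge per follows relationship.
    edges = [(src, dst, "follows") for src, dst in follows]
    for t in tweets:
        if t.reply_to:
            edges.append((t.author, t.reply_to, "replies-to"))
        for user in t.mentions:
            edges.append((t.author, user, "mentions"))
        # A tweet that is neither a reply nor a mention becomes a self-loop.
        if not t.reply_to and not t.mentions:
            edges.append((t.author, t.author, "tweet"))
    return edges


tweets = [
    Tweet("alice", reply_to="bob"),
    Tweet("bob", mentions=["carol", "dave"]),
    Tweet("carol"),
]
for edge in build_edges(tweets, follows=[("alice", "bob")]):
    print(edge)
```

Because the edges are kept as a list, the same pair of users can appear more than once, which is how duplicate edges arise in the metrics.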
Overall Graph Metrics:
Vertices: 1000
Unique Edges: 6078
Edges With Duplicates: 1006
Total Edges: 7084
Self-Loops: 886
Connected Components: 237
Single-Vertex Connected Components: 223
Maximum Vertices in a Connected Component: 747
Maximum Edges in a Connected Component: 6752
Maximum Geodesic Distance (Diameter): 9
Average Geodesic Distance: 3.249811
Graph Density: 0.00584284284284284
Modularity: 0.380253
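A couple of these metrics can be cross-checked with simple arithmetic. The sketch below uses the standard density formula for a directed graph, E / (V × (V − 1)), and inverts it to see how many edges the reported density implies. That works out to about 5,837, fewer than the 6,078 unique edges, which would be consistent with self-loops and repeated pairs being left out of the density calculation. That reading is an assumption; the description does not say how the tool counts edges for density.

```python
VERTICES = 1000
UNIQUE_EDGES = 6078
EDGES_WITH_DUPLICATES = 1006
TOTAL_EDGES = 7084
REPORTED_DENSITY = 0.00584284284284284


def directed_density(edge_count: int, vertex_count: int) -> float:
    """Fraction of possible ordered vertex pairs that are connected."""
    return edge_count / (vertex_count * (vertex_count - 1))


# The edge totals are internally consistent.
assert UNIQUE_EDGES + EDGES_WITH_DUPLICATES == TOTAL_EDGES

# Invert the formula: how many edges does the reported density imply?
implied_edges = round(REPORTED_DENSITY * VERTICES * (VERTICES - 1))
print("implied edge count:", implied_edges)
print("density from implied count:", directed_density(implied_edges, VERTICES))
```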
Top 10 Vertices, Ranked by Betweenness Centrality:
@cloudera
@hackingdata
@mikeolson
@al3xandru
@bigdata
@tlipcon
@infochimps
@allcloudnews
@merv
@twitteross
Top keyword pairs by frequency of mention
V1               V2                WEIGHT
big              data              219
adds             hadoop            120
mapr             adds              100
hadoop           connectors        100
move             highlights        50
amazon           move              49
highlights       hadoop            47
hadoop           hurdles           47
cloud            computing         41
@ulitzer         #cloud            40
#cloud           #cloudexpo        40
#cloudexpo       #cloudcomputing   40
#cloudcomputing  #bigdata          40
#bigdata         @cloudexpo        40
@cloudexpo       @bigdataexpo      40
apache           #hbase            39
data             processing        33
open             source            32
#codemotion      #es               30
definitive       guide             29
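One plausible way to produce a keyword-pair list like this is to count adjacent word pairs across all tweets. The sketch below does that in Python. The whitespace tokenization, the lowercasing, and the function name `top_word_pairs` are assumptions made for illustration. The description does not say how the pairs were actually extracted, and the sample tweets are invented.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_word_pairs(texts: Iterable[str],
                   n: int = 20) -> List[Tuple[Tuple[str, str], int]]:
    """Count adjacent (word, next_word) pairs and return the n most frequent."""
    counts: Counter = Counter()
    for text in texts:
        words = text.lower().split()
        # zip pairs each word with the one that follows it.
        counts.update(zip(words, words[1:]))
    return counts.most_common(n)


# Invented sample tweets, for illustration only.
sample = [
    "MapR adds Hadoop connectors",
    "Big data and Hadoop",
    "MapR adds big data tools",
]
for (v1, v2), weight in top_word_pairs(sample, n=5):
    print(v1, v2, weight)
```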
More NodeXL network visualizations are here: www.flickr.com/photos/marc_smith/sets/72157622437066929/ and here:
www.nodexlgraphgallery.org/Pages/Default.aspx
A gallery of NodeXL network data sets is available here: nodexlgraphgallery.org/Pages/Default.aspx?search=twitter
NodeXL is free and open and available from www.codeplex.com/nodexl
NodeXL is developed by the Social Media Research Foundation (www.smrfoundation.org) - which is dedicated to open tools, open data, and open scholarship.
Donations to support NodeXL are welcome through PayPal: www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_bu...
The book, Analyzing social media networks with NodeXL: Insights from a connected world, is available from Morgan Kaufmann and from Amazon.
In this Big Data training, candidates get a practical, detailed grounding in Hadoop and its core and newer components, such as HDFS, MapReduce, Pig, Hive, Impala, HBase, Jasper, Sqoop, Flume, Oozie, ZooKeeper, Spark, and Storm. To learn more, visit: www.analytixlabs.co.in/big-data-analytics-hadoop-training...