Hadoop Cluster Administration: What do the 'DataNodes usages' measurements represent?

As of Hadoop v2.0.0 (and possibly earlier releases), the NameNode web page (port 50070) includes a "DataNodes usages" entry in the Cluster Summary.

For instance,

Configured Capacity : 101.05 GB
DFS Used  : 177.16 MB
Non DFS Used  : 31.68 GB
DFS Remaining  : 69.20 GB
DFS Used%  : 0.17%
DFS Remaining%  : 68.48%
Block Pool Used  : 177.16 MB
Block Pool Used% : 0.17%
DataNodes usages : Min %   Median %   Max %   stdev %
                   0.17%   0.17%      0.17%   0.00%
Live Nodes   : 3 (Decommissioned: 0)
Dead Nodes   : 0 (Decommissioned: 0)
Decommissioning Nodes  : 0
Number of Under-Replicated Blocks: 0



What do these percentages mean?



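Each figure summarizes the per-DataNode DFS Used% (DFS space used on a node / that node's configured capacity) across the cluster's DataNodes.

Min % = Lowest DFS Used% of any single DataNode.

Median % = Median DFS Used% across the DataNodes.

Max % = Highest DFS Used% of any single DataNode.

stdev % = Standard deviation of the DataNodes' DFS Used% values. A value near 0 means the data is spread evenly across the nodes, as in the example above, where all three nodes sit at 0.17%.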
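
The four figures can be reproduced by hand from per-node numbers. The following is a minimal sketch, not the NameNode's own code: the function name `datanode_usage_summary` and the input format are hypothetical, and it assumes the median is the upper-middle element when the node count is even and that the standard deviation is the population form (divide by N). The actual NameNode computation may differ on both points.

```python
import math

def datanode_usage_summary(nodes):
    """Summarize per-node DFS Used%.

    nodes: list of (dfs_used, capacity) pairs, one per DataNode,
           both in the same unit (hypothetical input format).
    """
    if not nodes:
        return {"min": 0.0, "median": 0.0, "max": 0.0, "stdev": 0.0}

    # Per-node usage percentage, sorted so min/median/max are positional.
    usages = sorted(
        100.0 * used / cap if cap > 0 else 0.0
        for used, cap in nodes
    )

    mean = sum(usages) / len(usages)
    # Assumption: population variance (divide by N, not N - 1).
    variance = sum((u - mean) ** 2 for u in usages) / len(usages)

    return {
        "min": usages[0],
        # Assumption: upper-middle element for an even node count.
        "median": usages[len(usages) // 2],
        "max": usages[-1],
        "stdev": math.sqrt(variance),
    }

# Three made-up nodes of equal capacity holding different amounts of data.
summary = datanode_usage_summary([(10, 100), (20, 100), (60, 100)])
print(summary)
```

With perfectly balanced nodes, min, median and max coincide and stdev is 0, which matches the 0.17% / 0.17% / 0.17% / 0.00% line in the Cluster Summary above. A large gap between Min % and Max %, or a high stdev %, suggests the cluster may benefit from running the HDFS balancer.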
