Decommissioning is the process of gracefully removing a DataNode from a Hadoop cluster, ensuring data integrity by replicating its stored data to other nodes before shutting it down. This feature helps maintain high availability and fault tolerance, particularly during hardware maintenance or scaling down the cluster.
- Hardware Maintenance: Take nodes offline for repairs or upgrades without risking data loss.
- Cluster Scaling: Remove nodes when downsizing or reorganizing cluster resources.
- Fault Management: Decommission malfunctioning nodes to protect data and prevent system interruptions.
- Decommissioning allows data to be replicated to other nodes before the DataNode is stopped, ensuring continuity of data availability and reducing the risk of data loss.
Add the hostname or IP address of the DataNode to the exclude file, whose path is typically configured via the dfs.hosts.exclude property in hdfs-site.xml.
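For instance, assuming the exclude file lives at /etc/hadoop/dfs.exclude (an illustrative path; use whatever your dfs.hosts.exclude property actually points to), it simply lists one host per line:

node2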
Run the following command on the NameNode so it re-reads the exclude list and recognizes the excluded node:
hdfs dfsadmin -refreshNodes
The DataNode will stop accepting new data, and its existing blocks will be replicated to other DataNodes. Monitor this process in the Hadoop Web UI (Decommissioning Nodes section).
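If you prefer the command line to the Web UI, the per-node output of hdfs dfsadmin -report includes a Decommission Status field you can filter on (the grep pattern assumes the field names printed by common Hadoop releases):

hdfs dfsadmin -report | grep -E "Name:|Decommission Status"

The status for the node should move from Normal to Decommissioned as replication completes.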
Once decommissioning is complete, stop the DataNode service on the node:
hadoop-daemon.sh stop datanode
After decommissioning, remove the node's entry from the exclude file if it's no longer needed.
Using these steps helps maintain Hadoop cluster stability and ensures data availability through replication during node removal and addition.
The same procedure in more detail:

- Identify the DataNode: Determine the hostname or IP address of the DataNode you want to delete. You can check the DataNode status using:
hdfs dfsadmin -report
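On recent Hadoop releases, dfsadmin can also filter the report so large clusters are easier to scan (these flags exist on Hadoop 2.7+; check hdfs dfsadmin -help on your version):

hdfs dfsadmin -report -live
hdfs dfsadmin -report -decommissioning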
- Safely Decommission the DataNode: To safely remove a DataNode, add it to the decommission list referenced in the hdfs-site.xml configuration file on the NameNode:
nano /etc/hadoop/hdfs-site.xml
Add the following configuration (if it doesn't already exist):
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/dfs.hosts.exclude</value>
</property>
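One caveat worth knowing before the restart below: the NameNode generally refuses to start if dfs.hosts.exclude points at a file that doesn't exist (behavior on classic Apache releases; verify on yours), so make sure the file exists, even if empty:

touch /etc/hadoop/dfs.hosts.exclude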
- Create the Exclude File (on the manager node): Create or edit the exclude file (e.g., /etc/hadoop/dfs.hosts.exclude):
nano /etc/hadoop/dfs.hosts.exclude
Add the hostname or IP address of the DataNode you want to delete (e.g., node2).
- Restart the NameNode: After updating the configuration, restart the NameNode to apply the changes:
hadoop-daemon.sh stop namenode && hadoop-daemon.sh start namenode
(On Hadoop 3.x: hdfs --daemon stop namenode followed by hdfs --daemon start namenode.)
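A full restart is only needed the first time, to register the dfs.hosts.exclude property itself; afterwards, edits to the exclude file can be applied without downtime using the refresh command shown earlier:

hdfs dfsadmin -refreshNodes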
- Verify Decommissioning: Check the status of the DataNodes to ensure the DataNode is decommissioned:
hdfs dfsadmin -report
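Decommissioning can take a while on a busy cluster, since every block on the node must be re-replicated first. The node is safe to stop only once its report entry shows the final state (exact field text is an assumption about common releases):

hdfs dfsadmin -report | grep "Decommission Status"

Wait for Decommissioned, not Decommission in progress, before proceeding.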
- Stop the DataNode: On the DataNode you want to remove, stop the DataNode service:
hadoop-daemon.sh stop datanode
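On Hadoop 3.x, where the hadoop-daemon.sh wrapper is deprecated in favor of the --daemon option, the equivalent is:

hdfs --daemon stop datanode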
- Remove DataNode from Cluster: Optionally, uninstall Hadoop or remove its configuration from the DataNode server if it will no longer be part of the cluster.
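A rough cleanup sketch for the removed node; the paths below are illustrative, so substitute whatever dfs.datanode.data.dir and your install location actually are:

# On the removed node, after the daemon is stopped:
rm -rf /data/hadoop/dfs/data    # example dfs.datanode.data.dir (verify yours first)
rm -rf /opt/hadoop              # example install directory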
To recommission the DataNode later:

- Remove from the Exclude File: First, remove the hostname or IP address of the DataNode from the exclude file:
nano /etc/hadoop/dfs.hosts.exclude
- Restart the NameNode: After removing the entry, restart the NameNode to apply the changes, or simply run hdfs dfsadmin -refreshNodes as before:
hadoop-daemon.sh stop namenode && hadoop-daemon.sh start namenode
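- Start the DataNode: If the DataNode daemon was stopped during decommissioning, start it again on the node so it can re-register with the NameNode (same daemon scripts as above; on Hadoop 3.x, hdfs --daemon start datanode):
hadoop-daemon.sh start datanode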
- Verify Recommissioning: Check the status of the DataNodes to confirm the node has rejoined the cluster:
hdfs dfsadmin -report
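Once the node is back, its report entry should show Decommission Status : Normal again; on recent releases you can also list just the live nodes (flag availability is version-dependent):

hdfs dfsadmin -report -live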
👨‍💻 Crafted by: Suraj Kumar Choudhary | 📩 Feel free to DM for any help: csuraj982@gmail.com