Setup :
- Hadoop 1.2.1
- 2 Virtual Machines (VM) running Fedora 19 - KDE Live
- Host-Only Networking (Works on Bridged as well)
- VirtualBox 4.3.4
- Host Machine - Windows 8
Problem #1 - No Route to Host
Error Message :
ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to master/192.168.56.101:54310 failed on local exception: java.net.NoRouteToHostException: No route to host
Initial Review of the Situation :
- SSH : Bidirectional
- At this point, I had added the slave's SSH keys to the master (and vice versa), so the machines could SSH in both directions without a password.
- /etc/hosts : Correct
- (hduser@master) & (hduser@slave)
- 192.168.56.101 master
- 192.168.56.102 slave
- Ping : Bidirectional
- hduser@master $ ping slave # good
- hduser@slave $ ping master # good
- $HADOOP_HOME/conf settings : Correct
- core-site.xml (Same on both machines)
- (hduser@master) <value>hdfs://master:54310</value> # Good
- (hduser@slave) <value>hdfs://master:54310</value> #Good
- mapred-site.xml
- (hduser@master) <value>master:54311</value> #Good
- (hduser@slave) <value>master:54311</value> #Good
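A quick way to double-check that both machines agree on these values is to pull the `<value>` entries straight out of the config files. This is just a sketch: it parses a sample fragment containing the values from this cluster; on a real node you would point the sed at $HADOOP_HOME/conf/core-site.xml and mapred-site.xml instead.

```shell
# Sketch: extract the <value> entries from a core-site.xml-style fragment to
# confirm both nodes point at the same NameNode address. The sample file here
# mirrors the values above; swap in $HADOOP_HOME/conf/core-site.xml on a node.
cat > /tmp/core-site-sample.xml <<'EOF'
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:54310</value>
</property>
EOF
sed -n 's:.*<value>\(.*\)</value>.*:\1:p' /tmp/core-site-sample.xml
```

Running this on both master and slave and comparing the output catches a subtle mismatch (e.g. an IP on one node and a hostname on the other) that is easy to miss by eye.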
Suggestions Online :
There were many suggestions online to resolve this issue; to save space I have not included them all here.
- Hadoop cluster setup : Firewall issues
- Check Firewall configurations
- Turn off iptables
- # service iptables save
- # service iptables stop
- # chkconfig iptables off
- Error on starting HDFS daemons on hadoop Multinode cluster
- Confirm NameNode is running fine.
- Try Telnet to the IP and Port
- IPTables are misconfigured
- DNS has misconfigured IP addresses
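The telnet suggestion above can also be done with bash's built-in /dev/tcp redirection, which needs no extra packages on a minimal VM. `check_port` is a made-up helper name for this sketch, not a standard tool.

```shell
# Hedged sketch: probe a TCP port the way the telnet suggestion does, using
# bash's /dev/tcp (bash-specific; check_port is a hypothetical helper name).
check_port() {
  # usage: check_port <host> <port>  -> prints "open" or "closed"
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
    echo open
  else
    echo closed
  fi
}
# From the slave, the check for Problem #1 would be:
check_port master 54310
```

"closed" here covers both "connection refused" (daemon not listening) and "no route to host" (firewall/routing); telnet's verbose error message is what distinguishes the two.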
Resolution Process :
My plan for resolving this issue was to try the non-intrusive ideas first, and then gradually work towards the extreme.
- Telnet
- Received No Route Error!
- Disable/Configure Firewall - Resolution
- Disabled IPTables - Nothing Changed
- Looked for other firewall software.
- Firewalld, a Fedora project service, was running!
- Disabled Firewalld
- Connected!
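For reference, the firewalld step looked like this; these are the standard systemd unit commands on Fedora 19, run as root.

```shell
# Stop firewalld for the current boot and keep it from starting on reboot.
# (Fedora 19 ships firewalld by default, which is why disabling iptables
# alone changed nothing.)
systemctl stop firewalld.service
systemctl disable firewalld.service
systemctl status firewalld.service   # should now report the unit as inactive
```

Disabling the firewall outright is the blunt option; the safer alternative on a long-lived cluster would be to leave firewalld running and open only the Hadoop ports (54310, 54311, 50010, etc.) with firewall-cmd.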
Conclusion :
As many people have already suggested in other forums, the issue was not a problem with Hadoop, but a network configuration problem. I would highly suggest that the next person review their firewall configuration and, as crazy as it sounds, check whether any unknown firewall software is installed on the system.
Problem #2 - Incompatible NamespaceIDs
Error Message :
ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /app/hadoop/tmp/dfs/data: namenode namespaceID = 1550889903; datanode namespaceID = 8951322
Initial Review of the Situation :
- Format Namenode
- I formatted the NameNode on both the master and the slave in hopes of resolving this issue. Both formatted properly, but the problem persisted.
- $HADOOP_HOME/bin/hadoop namenode -format
Suggestions Online :
The suggested fix for this problem was posted in one of the main tutorials for setting up Hadoop, the same one I was following.
- Running Hadoop on Ubuntu Linux (Multi-Node Cluster)
- Start from scratch (Only if necessary)
- Update the NamespaceIDs to match.
- $dfs.data.dir/current/VERSION
Resolution Process :
My plan for resolving this issue was to try the non-intrusive ideas first, and then gradually work towards the extreme.
- Start from Scratch
- I followed the instructions and removed the DataNode information from /app/hadoop/tmp/dfs/...
- Re-formatted the NameNode
- Started Hadoop again
- Fail!
- Update the NamespaceIDs
- Initially I tried updating the NameNode (dfs.name.dir/current/VERSION) to match the DataNode's namespaceID.
- Fail! The NameNode reverted to its original namespaceID.
- Attempted to fix the Datanode NamespaceID to match the Namenode NamespaceID
- Also updated the info on the slave node(s)
- Success!
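The fix that finally worked can be sketched on a scratch copy of a DataNode VERSION file. On a real node the file lives at $dfs.data.dir/current/VERSION (/app/hadoop/tmp/dfs/data/current/VERSION on this cluster) and contains more fields than shown here; the scratch directory and sample contents below are illustrative only.

```shell
# Demonstrate the namespaceID edit on a scratch copy of a DataNode VERSION
# file (real file: $dfs.data.dir/current/VERSION; extra fields omitted here).
WORK=$(mktemp -d)
mkdir -p "$WORK/current"
printf 'namespaceID=8951322\nstorageType=DATA_NODE\n' > "$WORK/current/VERSION"
# Overwrite the DataNode's namespaceID with the NameNode's value
# (1550889903, taken from the error message above).
sed -i 's/^namespaceID=.*/namespaceID=1550889903/' "$WORK/current/VERSION"
grep '^namespaceID=' "$WORK/current/VERSION"
```

Remember to make the same edit on every slave's data directory; editing only one node leaves the others still mismatched.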
Conclusion :
I am unsure how they went out of sync, but this worked.
Problem #3 - Unregistered Datanode Exception
Error Message :
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is shutting down: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.UnregisteredDatanodeException: Data node 10.0.0.14:50010 is attempting to report storage ID DS-720171542-192.168.56.102-50010-1388219833238. Node 10.0.0.16:50010 is expected to serve this storage.
Cause : I attempted to expand my cluster to 3 VM nodes by cloning the original slave VM. As a full clone, it already contained the master's SSH key, etc. The only issue was that the master was confused by both slaves reporting the same DataNode storage ID.
Initial Review/Research
- Hadoop: How do datanodes register with the namenode?
- From this posting it is apparent that DataNodes register with the NameNode. It's more than likely that, by cloning the first VM, I brought over all of the registration information, causing the NameNode to believe both machines were the same DataNode.
Resolution of the ERROR :
- Format Namenode
- I formatted the NameNode on both the master and the slave in hopes of resolving this issue. Both formatted properly, but the problem persisted.
- $HADOOP_HOME/bin/hadoop namenode -format
- Start from scratch (See Problem #2 above)
- Stopped Name/Datanodes
- Removed all the Datanode/Namenode/Secondary folders from the master and slaves.
- formatted the namenode (hadoop namenode -format)
- started the cluster again (start-dfs.sh)
- Success!
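The steps above, as Hadoop 1.x commands. This is a sketch: the dfs path is the one used earlier in this post and may differ on your cluster, and the rm must be run on the master and every slave.

```shell
# "Start from scratch" for HDFS on Hadoop 1.x -- this destroys all HDFS data!
$HADOOP_HOME/bin/stop-dfs.sh                # stop NameNode and DataNodes
rm -rf /app/hadoop/tmp/dfs/*                # on the master AND each slave
$HADOOP_HOME/bin/hadoop namenode -format    # recreate the namespace
$HADOOP_HOME/bin/start-dfs.sh               # bring HDFS back up
```

Because the format creates a fresh namespaceID and the cloned storage directories are gone, both the Incompatible NamespaceIDs and the UnregisteredDatanodeException problems disappear at once.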