Hadoop notes

 

Following the basic tutorial here on staruser01 under a non-root account:

http://hadoop.apache.org/common/docs/current/single_node_setup.html

 

$ wget http://apache.opensourceresources.org//hadoop/common/stable/hadoop-0.20.203.0rc1.tar.gz

$ tar xvzf hadoop-0.20.203.0rc1.tar.gz

$ cd hadoop-0.20.203.0

Added "export JAVA_HOME=/usr/lib/jvm/java-1.6.0-sun" to conf/hadoop-env.sh

Now test the standalone (non-distributed, single thread) mode of operation:

$ mkdir input
$ cp conf/*.xml input   #(note, there are 6 such files)
$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'

This produces about 140 lines of output (saved in "simple_example_output"), which is largely incomprehensible to me at this point.

$ cat output/*

1       dfsadmin
 

Interpretation:  This searched the XML files for strings matching the regular expression 'dfs[a-z.]+' and counted the occurrences of each distinct match.  In this case, only 'dfsadmin' was found, and it was found just once.  (not a terribly impressive demo...)

Should get a few more results with this:

$ bin/hadoop jar hadoop-examples-*.jar grep input output2 'the[a-z.]+'

$ cat output2/*
4       they
3       then
2       there
1       them.
1       ther
 

In this example, the "map" portion is the task of grepping for the pattern and storing counts of each unique match found.  There are a total of 6 map tasks (one for each file in the input directory).  The "reduce" portion combines the search results of the separate tasks.  Presumably, if a full-blown Hadoop cluster with HDFS existed, the map tasks would execute in parallel on separate nodes, with each task assigned to a node holding a copy of its input file as distributed by HDFS.  I don't know whether reduction takes place as each map task completes or only after all of them finish, whether the reduction runs on individual cluster nodes or on the JobTracker node (which would seem a bad idea), or how the output files are distributed (if at all) in HDFS.

 

Now to try the pseudo-distributed example:

Edited the conf/core-site.xml, conf/hdfs-site.xml and conf/mapred-site.xml files in accord with the instructions found at the link above.  (Original files are saved as *.original.)
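For reference, the tutorial's pseudo-distributed settings amount to one property per file:

conf/core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

conf/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

conf/mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>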

Now there is a bit of a sticking point for future deployments: the need for passphraseless SSH between nodes.  I don't have multiple nodes at this point, and as long as my connection to staruser01 is open with my agent running, the requirement is satisfied, but with multiple nodes I will apparently have to use passphraseless SSH keys.
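For the record, the tutorial's recipe for passphraseless keys (not needed yet with a single node and my agent running) is:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys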

$ bin/hadoop namenode -format
11/08/09 17:23:47 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = staruser01.star.bnl.gov/130.199.162.170
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.203.0
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May  4 07:57:50 PDT 2011
************************************************************/
11/08/09 17:23:47 INFO util.GSet: VM type       = 32-bit
11/08/09 17:23:47 INFO util.GSet: 2% max memory = 19.33375 MB
11/08/09 17:23:47 INFO util.GSet: capacity      = 2^22 = 4194304 entries
11/08/09 17:23:47 INFO util.GSet: recommended=4194304, actual=4194304
11/08/09 17:23:47 INFO namenode.FSNamesystem: fsOwner=wbetts
11/08/09 17:23:47 INFO namenode.FSNamesystem: supergroup=supergroup
11/08/09 17:23:47 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/08/09 17:23:47 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
11/08/09 17:23:47 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
11/08/09 17:23:47 INFO namenode.NameNode: Caching file names occuring more than 10 times
11/08/09 17:23:48 INFO common.Storage: Image file of size 112 saved in 0 seconds.
11/08/09 17:23:48 INFO common.Storage: Storage directory /tmp/hadoop-wbetts/dfs/name has been successfully formatted.
11/08/09 17:23:48 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at staruser01.star.bnl.gov/130.199.162.170
************************************************************/
 

No reason to believe it didn't work.  The "filesystem" is found at /tmp/hadoop-wbetts/dfs/name, in which there are two directories, "current" and "image":

$ ls -l /tmp/hadoop-wbetts/dfs/name/*
/tmp/hadoop-wbetts/dfs/name/current:
total 32
-rw-r--r-- 1 wbetts rhstar   4 Aug  9 17:23 edits
-rw-r--r-- 1 wbetts rhstar 112 Aug  9 17:23 fsimage
-rw-r--r-- 1 wbetts rhstar   8 Aug  9 17:23 fstime
-rw-r--r-- 1 wbetts rhstar 101 Aug  9 17:23 VERSION

/tmp/hadoop-wbetts/dfs/name/image:
total 8
-rw-r--r-- 1 wbetts rhstar 157 Aug  9 17:23 fsimage
 

And now the magic happens:

$ bin/start-all.sh
starting namenode, logging to /home/wbetts/hadoop-0.20.203.0/bin/../logs/hadoop-wbetts-namenode-staruser01.star.bnl.gov.out
<banner removed>
localhost: starting datanode, logging to /home/wbetts/hadoop-0.20.203.0/bin/../logs/hadoop-wbetts-datanode-staruser01.star.bnl.gov.out
<banner removed>
localhost: starting secondarynamenode, logging to /home/wbetts/hadoop-0.20.203.0/bin/../logs/hadoop-wbetts-secondarynamenode-staruser01.star.bnl.gov.out
starting jobtracker, logging to /home/wbetts/hadoop-0.20.203.0/bin/../logs/hadoop-wbetts-jobtracker-staruser01.star.bnl.gov.out
<banner removed>
localhost: starting tasktracker, logging to /home/wbetts/hadoop-0.20.203.0/bin/../logs/hadoop-wbetts-tasktracker-staruser01.star.bnl.gov.out
 

This has started several Java processes:

$ jps  [how did I not know this command before today?!?!]
20146 Jps
19687 DataNode
19587 NameNode
19794 SecondaryNameNode
 

http://localhost:50070/ redirects to http://localhost:50070/dfshealth.jsp which gives an overview of the filesystem status.

However, the JobTracker is not running (and correspondingly, http://localhost:50030/ yields "Unable to connect").  The jobtracker log file (logs/hadoop-wbetts-jobtracker-staruser01.star.bnl.gov.log) says this:

$ more logs/hadoop-wbetts-jobtracker-staruser01.star.bnl.gov.log
2011-08-09 17:28:28,740 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG:   host = staruser01.star.bnl.gov/130.199.162.170
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.203.0
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; comp
iled by 'oom' on Wed May  4 07:57:50 PDT 2011
************************************************************/
2011-08-09 17:28:29,016 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.prop
erties
2011-08-09 17:28:29,059 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=S
tats registered.
2011-08-09 17:28:29,064 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(
s).
2011-08-09 17:28:29,064 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: JobTracker metrics system started
2011-08-09 17:28:29,657 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.RuntimeException: Not a host:port pair: lo
cal
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:138)
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:125)
        at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:2542)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2178)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2171)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:300)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:291)
        at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4956)

2011-08-09 17:28:29,658 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down JobTracker at staruser01.star.bnl.gov/130.199.162.170
************************************************************/
 

Oops - I didn't edit conf/mapred-site.xml properly: the "Not a host:port pair: local" error means mapred.job.tracker was still at its default value of "local".
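The fix is to point it at a real host:port, e.g. (matching the pseudo-distributed tutorial):

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

With that fixed: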

$ bin/stop-all.sh

$ bin/start-all.sh

Looks better:

[wbetts@staruser01 hadoop-0.20.203.0]$ jps
20877 SecondaryNameNode
20739 DataNode
20633 NameNode
20947 JobTracker
21067 TaskTracker
21157 Jps
 

and http://localhost:50030/ redirects to http://localhost:50030/jobtracker.jsp, giving an interface to the state of the map/reduce system (presumably served by Jetty).

Random interlude:

$ bin/hadoop fs -ls /
Found 1 items
drwxr-xr-x   - wbetts supergroup          0 2011-08-09 17:45 /tmp

Now to put some files into HDFS:

$ bin/hadoop fs -put conf input

$ bin/hadoop fs -ls /
Found 2 items
drwxr-xr-x   - wbetts supergroup          0 2011-08-09 17:45 /tmp
drwxr-xr-x   - wbetts supergroup          0 2011-08-09 18:19 /user
 

and drill down under user to get the end result of the "put" command:

$ bin/hadoop fs -ls /user/wbetts/input
Found 22 items
-rw-r--r--   1 wbetts supergroup       7457 2011-08-09 18:19 /user/wbetts/input/capacity-scheduler.xml
-rw-r--r--   1 wbetts supergroup        535 2011-08-09 18:19 /user/wbetts/input/configuration.xsl
-rw-r--r--   1 wbetts supergroup        296 2011-08-09 18:19 /user/wbetts/input/core-site.xml
-rw-r--r--   1 wbetts supergroup        178 2011-08-09 18:19 /user/wbetts/input/core-site.xml.original
-rw-r--r--   1 wbetts supergroup        178 2011-08-09 18:19 /user/wbetts/input/core-site.xml~
-rw-r--r--   1 wbetts supergroup       2282 2011-08-09 18:19 /user/wbetts/input/hadoop-env.sh
-rw-r--r--   1 wbetts supergroup       2237 2011-08-09 18:19 /user/wbetts/input/hadoop-env.sh~
-rw-r--r--   1 wbetts supergroup        498 2011-08-09 18:19 /user/wbetts/input/hadoop-metrics2.properties
-rw-r--r--   1 wbetts supergroup       4644 2011-08-09 18:19 /user/wbetts/input/hadoop-policy.xml
-rw-r--r--   1 wbetts supergroup        274 2011-08-09 18:19 /user/wbetts/input/hdfs-site.xml
-rw-r--r--   1 wbetts supergroup        178 2011-08-09 18:19 /user/wbetts/input/hdfs-site.xml.original
-rw-r--r--   1 wbetts supergroup        178 2011-08-09 18:19 /user/wbetts/input/hdfs-site.xml~
-rw-r--r--   1 wbetts supergroup       4307 2011-08-09 18:19 /user/wbetts/input/log4j.properties
-rw-r--r--   1 wbetts supergroup       2033 2011-08-09 18:19 /user/wbetts/input/mapred-queue-acls.xml
-rw-r--r--   1 wbetts supergroup        287 2011-08-09 18:19 /user/wbetts/input/mapred-site.xml
-rw-r--r--   1 wbetts supergroup        178 2011-08-09 18:19 /user/wbetts/input/mapred-site.xml.original
-rw-r--r--   1 wbetts supergroup        274 2011-08-09 18:19 /user/wbetts/input/mapred-site.xml~
-rw-r--r--   1 wbetts supergroup         10 2011-08-09 18:19 /user/wbetts/input/masters
-rw-r--r--   1 wbetts supergroup         10 2011-08-09 18:19 /user/wbetts/input/slaves
-rw-r--r--   1 wbetts supergroup       1243 2011-08-09 18:19 /user/wbetts/input/ssl-client.xml.example
-rw-r--r--   1 wbetts supergroup       1195 2011-08-09 18:19 /user/wbetts/input/ssl-server.xml.example
-rw-r--r--   1 wbetts supergroup        382 2011-08-09 18:19 /user/wbetts/input/taskcontroller.cfg
 

(If this were a multi-node cluster, what would happen?  Will these files actually reside on other nodes in the cluster at this point?)

And try one of the map-reduce examples from above:

$ bin/hadoop jar hadoop-examples-0.20.203.0.jar grep input output 'the[a-z]+'

A little over 2 minutes later, it finished.  I tried jps a few times in the interval (didn't think at the time to look at the map/reduce web interface):

# jps
22401 RunJar
20877 SecondaryNameNode
22761 Child
20739 DataNode
20633 NameNode
22629 Child
22801 Jps
20947 JobTracker
22754 Child
21067 TaskTracker
# jps
23245 Child
22401 RunJar
20877 SecondaryNameNode
20739 DataNode
20633 NameNode
22629 Child
20947 JobTracker
23284 Jps
21067 TaskTracker
 

And using the HDFS web interface, I can browse through HDFS and find the results at /user/wbetts/output/part-00000:

4       they
3       them
3       then
2       there
2       thers
2       therwise
1       ther

Can also read the output from HDFS using:

$ bin/hadoop fs -cat output/*
 

To actually retrieve the output file to my non-HDFS workspace:

$ bin/hadoop fs -get output output_pseudo_multi

$ cat output_pseudo_multi/part-00000

Question:  what could it be besides "part-00000"?  How would I determine if it was something else?
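My tentative understanding: there is one part-NNNNN file per reduce task, so with more reducers there would also be a part-00001, part-00002, and so on.  The simple way to find out is to list the output directory:

$ bin/hadoop fs -ls output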

Stopping everything:

$ bin/stop-all.sh


And now for a 2-node cluster...

Adding in moorea.star.bnl.gov.  (A possible complication: my wbetts account on moorea is a BNL domain account via Centrify, while my wbetts account on staruser01 is a local account.  Without a shared user account structure, I expect I won't get far...  If that becomes an issue, I will move to either the onlNN pool or the rcas interactive nodes, where homogeneity is expected.)

Trying to follow this:

http://hadoop.apache.org/common/docs/r0.20.203.0/cluster_setup.html

Starting from the pseudo-distributed setup on moorea (just like staruser01). 

Using moorea as the NameNode, setting fs.default.name in core-site.xml on each node accordingly.
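In other words, conf/core-site.xml on both nodes should have something like this (port 9000 as in the pseudo-distributed setup):

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://moorea.star.bnl.gov:9000</value>
  </property>
</configuration>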

In hdfs-site.xml on moorea:

dfs.name.dir = /home/wbetts/hadoop/logs

dfs.data.dir = /home/wbetts/hadoop/fs

In hdfs-site.xml on staruser01:

dfs.data.dir = /home/wbetts/hadoop/fs
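So conf/hdfs-site.xml on moorea looks roughly like this (staruser01 gets only the dfs.data.dir property):

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/wbetts/hadoop/logs</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/wbetts/hadoop/fs</value>
  </property>
</configuration>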

Using staruser01 as the Job Tracker (why is it recommended that the NameNode and JobTracker be on different nodes?)

What is mapred.system.dir?  Per the docs: "Path on the HDFS where the MapReduce framework stores system files e.g. /hadoop/mapred/system/.  This is in the default filesystem (HDFS) and must be accessible from both the server and client machines."  Why would it NOT be accessible from every cluster node if it is in fact in HDFS?  For now, setting it to /hadoop/mapred/system on both nodes.

mapred.local.dir = /tmp/hadoop on both nodes
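Rolled together, conf/mapred-site.xml on both nodes should be something like this (the 9001 port is carried over from the single-node tutorial; I didn't write down the exact value I used):

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>staruser01.star.bnl.gov:9001</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/hadoop/mapred/system</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/tmp/hadoop</value>
  </property>
</configuration>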

Ignoring mapred.tasktracker.{map|reduce}.tasks.maximum, dfs.hosts/dfs.hosts.exclude, mapred.hosts/mapred.hosts.exclude, mapred.queue.names (should default to "default", right?) and mapred.acls.enabled.

On to mapred-queue-acls.xml on both nodes:

Set mapred.queue.default.acl-submit-job and mapred.queue.default.acl-administer-jobs to '*' (any user can submit jobs, view job details, change priorities, or kill submitted jobs).
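i.e., in conf/mapred-queue-acls.xml on both nodes:

<configuration>
  <property>
    <name>mapred.queue.default.acl-submit-job</name>
    <value>*</value>
  </property>
  <property>
    <name>mapred.queue.default.acl-administer-jobs</name>
    <value>*</value>
  </property>
</configuration>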

 

NB.  It is suggested that the various settings above be marked as "final":

<final>true</final>

but I did not do that.

 

Task Controller:

mapred.task.tracker.task-controller = org.apache.hadoop.mapred.DefaultTaskController in mapred-site.xml on both nodes.  This means that all jobs run as the user who starts the Hadoop daemons.  There is a LinuxTaskController to allow tasks (and created files) to be owned by the submitter, but it requires additional configuration for setuid operations.

Node health monitoring:

The TaskTracker can be configured to periodically run an administrator-defined script to check the health of the node.  There is a specific output format that, in the case of a problem, triggers the node's removal from the pool until the script no longer indicates a problem.  I have not tried this yet.
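If I'm reading the cluster setup docs correctly, the script just has to print a line beginning with "ERROR" when it decides the node is unhealthy, and it is wired in via mapred.healthChecker.script.path in mapred-site.xml.  A hypothetical (untested) example:

#!/bin/bash
# Hypothetical health check: flag the node if /tmp is more than 95% full.
used=$(df -P /tmp | awk 'NR==2 {print $5}' | tr -d '%')
if [ "$used" -gt 95 ]; then
  echo "ERROR /tmp is ${used} percent full"
fi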

Memory monitoring:

The TaskTrackers can monitor memory consumption and kill jobs that exceed a configured limit.  I have not tried this yet.

Logging:

Edit the conf/log4j.properties file to customize the Hadoop daemons' logging configuration (log-formats and so on). 

Rack awareness:

I think the admin has to provide a Java class (or script) that returns a rack ID based on the IP address of the machine.  If this is not done, all nodes are assigned to "/default-rack".  I have not looked into this.
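There appears to be a simpler script-based hook as well (topology.script.file.name), where the script receives a list of addresses as arguments and prints one rack path per address.  A hypothetical, untested sketch:

#!/bin/bash
# Hypothetical topology script: put the 130.199.162.* machines in one rack,
# everything else in the default rack.
for addr in "$@" ; do
  case "$addr" in
    130.199.162.*) echo "/star-rack1" ;;
    *)             echo "/default-rack" ;;
  esac
done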

STARTING THE CLUSTER (finally!):

In conf/slaves on each node, added the other node (and both still list localhost).  Then it goes:

---------

[wbetts@moorea hadoop-0.20.203.0]$ bin/start-dfs.sh

starting namenode, logging to /home/wbetts/hadoop-0.20.203.0/bin/../logs/hadoop-wbetts-namenode-moorea.star.bnl.gov.out
localhost:
localhost:                            NOTICE TO USERS
localhost:
<snip>

localhost: LOG OFF IMMEDIATELY if you do not agree to the conditions stated in this
localhost: warning.
staruser01.star.bnl.gov:
staruser01.star.bnl.gov:                            NOTICE TO USERS
<snip>

staruser01.star.bnl.gov: LOG OFF IMMEDIATELY if you do not agree to the conditions stated in this
staruser01.star.bnl.gov: warning.
localhost: starting datanode, logging to /home/wbetts/hadoop-0.20.203.0/bin/../logs/hadoop-wbetts-datanode-moorea.star.bnl.gov.out
staruser01.star.bnl.gov: starting datanode, logging to /home/wbetts/hadoop-0.20.203.0/bin/../logs/hadoop-wbetts-datanode-staruser01.star.bnl.gov.out
localhost:
localhost:                            NOTICE TO USERS
<snip>

localhost: LOG OFF IMMEDIATELY if you do not agree to the conditions stated in this
localhost: warning.
localhost: starting secondarynamenode, logging to /home/wbetts/hadoop-0.20.203.0/bin/../logs/hadoop-wbetts-secondarynamenode-moorea.star.bnl.gov.out

---------

This does NOT appear to have started anything on staruser01 (jps lists nothing running there, while on moorea there are "DataNode" and "SecondaryNameNode").  In the logs/hadoop-wbetts-datanode-staruser01.star.bnl.gov.log file, there are errors indicating a network communication problem - moorea apparently reached staruser01 to launch the DataNode, but staruser01 couldn't connect back to moorea on port 9000:

2011-08-29 12:26:46,634 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: moorea.star.bnl.gov/130.199.162.173:
9000. Already tried 9 time(s).
2011-08-29 12:26:46,646 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to moorea.star.bnl
.gov/130.199.162.173:9000 failed on local exception: java.net.NoRouteToHostException: No route to host
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
        at org.apache.hadoop.ipc.Client.call(Client.java:1033)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
        at $Proxy5.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:364)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:347)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:389)
        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:319)
        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:295)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:325)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:268)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1480)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1419)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1437)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1563)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1573)
Caused by: java.net.NoRouteToHostException: No route to host
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:406)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:414)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:527)
        at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:187)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1164)
        at org.apache.hadoop.ipc.Client.call(Client.java:1010)
        ... 14 more

2011-08-29 12:26:46,649 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at staruser01.star.bnl.gov/130.199.162.170
************************************************************/
 

So I stopped ("bin/stop-dfs.sh") hadoop on moorea to investigate...

8/31/2011:  port info for Hadoop can be found here:  http://icb.med.cornell.edu/wiki/index.php/Hadoop#Firewall (though not sure if it is up-to-date)

For now, allowing all TCP from each node to the other.
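Something along these lines on each node (the exact rules aren't recorded here; this is just the shape of it, shown for staruser01 accepting moorea's address):

# on staruser01 (with the mirror-image rule on moorea):
iptables -I INPUT -s 130.199.162.173 -p tcp -j ACCEPT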

[wbetts@moorea hadoop-0.20.203.0]$ bin/start-dfs.sh
starting namenode, logging to /home/wbetts/hadoop-0.20.203.0/bin/../logs/hadoop-wbetts-namenode-moorea.star.bnl.gov.out
localhost:
localhost:                            NOTICE TO USERS
localhost: LOG OFF IMMEDIATELY if you do not agree to the conditions stated in this
localhost: warning.
staruser01.star.bnl.gov:
staruser01.star.bnl.gov:                            NOTICE TO USERS
staruser01.star.bnl.gov: LOG OFF IMMEDIATELY if you do not agree to the conditions stated in this
staruser01.star.bnl.gov: warning.
localhost: starting datanode, logging to /home/wbetts/hadoop-0.20.203.0/bin/../logs/hadoop-wbetts-datanode-moorea.star.bnl.gov.out
staruser01.star.bnl.gov: starting datanode, logging to /home/wbetts/hadoop-0.20.203.0/bin/../logs/hadoop-wbetts-datanode-staruser01.star.bnl.gov.out
localhost:
localhost:                            NOTICE TO USERS
localhost: LOG OFF IMMEDIATELY if you do not agree to the conditions stated in this
localhost: warning.
localhost: starting secondarynamenode, logging to /home/wbetts/hadoop-0.20.203.0/bin/../logs/hadoop-wbetts-secondarynamenode-moorea.star.bnl.gov.out
[wbetts@moorea hadoop-0.20.203.0]$ jps
3658 DataNode
3778 SecondaryNameNode
3819 Jps
 

[wbetts@staruser01 hadoop-0.20.203.0]$ jps
17623 DataNode
17665 Jps
 

Now I'm confused - there is no NameNode service running?!?  Something is wrong...

Looking at the log file (/home/wbetts/hadoop-0.20.203.0/logs/hadoop-wbetts-namenode-moorea.star.bnl.gov.log):

 

2011-08-31 12:48:49,118 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/wbetts/hadoop/logs does not exist.
2011-08-31 12:48:49,119 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/wbetts/hadoop/logs is in an inconsistent state: storage directory does not exist or
is not accessible.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:291)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:97)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:379)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:353)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:254)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:434)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1153)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1162)
2011-08-31 12:48:49,120 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/wbett
s/hadoop/logs is in an inconsistent state: storage directory does not exist or is not accessible.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:291)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:97)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:379)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:353)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:254)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:434)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1153)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1162)
2011-08-31 12:48:49,120 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/wbett
s/hadoop/logs is in an inconsistent state: storage directory does not exist or is not accessible.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:291)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:97)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:379)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:353)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:254)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:434)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1153)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1162)

2011-08-31 12:48:49,120 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at moorea.star.bnl.gov/130.199.162.173
************************************************************/
 

I suspect I didn't initialize the file system on moorea...  so:

[wbetts@moorea hadoop-0.20.203.0]$ bin/hadoop namenode -format
11/08/31 13:05:43 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = moorea.star.bnl.gov/130.199.162.173
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.203.0
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May  4 07:57:50 PDT 2011
************************************************************/
11/08/31 13:05:43 INFO util.GSet: VM type       = 32-bit
11/08/31 13:05:43 INFO util.GSet: 2% max memory = 17.77875 MB
11/08/31 13:05:43 INFO util.GSet: capacity      = 2^22 = 4194304 entries
11/08/31 13:05:43 INFO util.GSet: recommended=4194304, actual=4194304
11/08/31 13:05:43 INFO namenode.FSNamesystem: fsOwner=wbetts
11/08/31 13:05:43 INFO namenode.FSNamesystem: supergroup=supergroup
11/08/31 13:05:43 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/08/31 13:05:43 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
11/08/31 13:05:43 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
11/08/31 13:05:43 INFO namenode.NameNode: Caching file names occuring more than 10 times
11/08/31 13:05:44 INFO common.Storage: Image file of size 112 saved in 0 seconds.
11/08/31 13:05:44 INFO common.Storage: Storage directory /home/wbetts/hadoop/logs has been successfully formatted.
11/08/31 13:05:44 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at moorea.star.bnl.gov/130.199.162.173
************************************************************/
 

which seems to be a good thing...  Here we go again:

[wbetts@moorea hadoop-0.20.203.0]$ bin/stop-dfs.sh (output omitted here, but it has stopped all java threads as expected)

[wbetts@moorea hadoop-0.20.203.0]$ bin/start-dfs.sh (output omitted here, but the result looks good)

[wbetts@moorea hadoop-0.20.203.0]$ jps
4491 Jps
4286 DataNode
4417 SecondaryNameNode
4169 NameNode

[wbetts@staruser01 hadoop-0.20.203.0]$ jps
17983 Jps
17940 DataNode

Going to see if the web interface looks good and try adding/removing some files...

The page on moorea (http://localhost:50070) indicates only 1 live node; staruser isn't listed, so something's wrong.

The log (logs/hadoop-wbetts-datanode-staruser01.star.bnl.gov.log) on staruser01 still indicates a communication problem talking back to moorea.   Stopped HDFS from moorea and allowed all udp between the two nodes in iptables, though I don't think that's necessary.  No improvement upon restart.  Out of ideas for the time being.

9/1/2011 evening:  In desperation, I tried creating passphraseless SSH keys and putting them on each node (and updating authorized_keys on both), but this didn't help.  (Nor did I expect it to, since port 9000 has a listening Java process, not an sshd, but who knows what zany stuff they did...)

Back to the drawing board...

 

Are the Hadoop processes using IPv6?!?!

[root@moorea ~]# netstat -lpn |grep 9000
tcp        0      0 ::ffff:127.0.0.1:9000       :::*                        LISTEN      32613/java

Tantalizingly close to the issue reported here:  http://wiki.apache.org/hadoop/HadoopIPv6 , though net.ipv6.bindv6only =0 already according to sysctl.

Added "alias net-pf-10 off" and "alias ipv6 off" to /etc/modprobe.conf on both nodes, and commented out the localhost ipv6 entry in /etc/hosts.  Rebooted staruser01 and verified ipv6 kernel module is no longer loaded.  Can't reboot moorea because Janet is using it, so sent her an email asking her to reboot it as soon as it is convenient.

Fingers crossed...  Janet rebooted moorea and indeed IPv6 is no longer active.  The namenode process is now listening on IPv4, but still only on the localhost (127.0.0.1) address, so staruser01 can't connect.  I'm guessing it was getting this from <value>hdfs://localhost:9000</value> in conf/core-site.xml, so I edited that to moorea.star.bnl.gov instead of localhost and voila, 2 live data nodes!

Edited conf/hdfs-site.xml to have replication = 2 to see what happens.  The existing files did not get replicated (each still has only one copy), but newly added files are replicated.  To change the replication of existing files, one uses "hadoop fs -setrep <#> <HDFS_file>".
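For example, to raise the existing input files to two copies (the -R applies it recursively; the path is just the obvious candidate here):

$ bin/hadoop fs -setrep -R 2 /user/wbetts/input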

Hadoop top-level command reference here: http://hadoop.apache.org/common/docs/current/commands_manual.html.  The user-level HDFS commands are described here: http://hadoop.apache.org/common/docs/current/file_system_shell.html

 

Note to self:  One thing has tripped me up a couple of times...  The various start and stop commands (and possibly others) rely on SSH connections between the nodes.  While I set up passphraseless keys for this, I forgot (multiple times) that the host keys have to already be known (in known_hosts) for the SSH connections to work - otherwise they "hang" indefinitely, non-interactively waiting for the yes/no acceptance of the new key.  So be sure to connect from the master nodes (NameNode and JobTracker) to all of the nodes in the cluster (including the masters themselves!) at least once interactively before issuing start and stop commands.
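i.e., from each master, something like:

$ ssh moorea.star.bnl.gov exit
$ ssh staruser01.star.bnl.gov exit

(answering "yes" to the host key prompt once for each host).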
 

 

Now to duplicate this on rcas nodes:

This is somewhat different because I have a shared user account and home directory on the rcas nodes, but I don't have root access.  Chose rcas6011 and rcas6012 (somewhat arbitrarily).  Untarred the tarball in ~/hadoop/.  First problem, trying the very first example: it apparently chokes on the '+' character in the path to my user directory (/direct/star+u/wbetts/.../....jar).  If I specify /star/u/wbetts instead, then it works.