Hi,
In the week since upgrading to 0.20.1, our NameNode (NN) has crashed
twice, with the same symptoms each time. The NN log says:
2009-12-01 12:04:00,420 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.63.118.5
2009-12-01 12:04:00,420 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 73801 Total time for transactions(ms): 222Number of transactions batched in Syncs: 20359 Number of syncs: 40349 SyncTimes(ms): 12206
2009-12-01 12:04:00,421 FATAL org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Fatal Error : All storage directories are inaccessible.
2009-12-01 12:04:00,424 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
...
It seems the NN crashed while handling a roll-edit-log request from the
secondary NN. On the secondary NN side, there were a bunch of
"connection rejected" errors.
Any clues? Thanks for your help.
Zhang