Hi!

I'm facing a problem where datanodes are marked as down because they are
too slow in doing block reports, which in turn is due to too many blocks
per node, i.e. https://issues.apache.org/jira/browse/HADOOP-4584, but I
can't easily upgrade to 0.21.

So I came up with a possible workaround: run multiple datanode instances
on each physical node, each handling a subset of that node's disks. I'm
not sure it will work, but it could be worth a try.

So I configured a second datanode on one of my nodes, configured to run
on a different set of ports, and configured the two datanode instances
to use half of the disks each.
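
For reference, the overrides for the second instance look roughly like the
sketch below (0.20-style property names; the ports match the 50011/50081/50021
values from the log further down, and the data-dir paths are illustrative):

```xml
<!-- hdfs-site.xml for the second datanode instance (illustrative values) -->
<configuration>
  <property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:50011</value>  <!-- first instance keeps the default 50010 -->
  </property>
  <property>
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:50081</value>
  </property>
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:50021</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <!-- half of the node's disks; the first instance lists the other half -->
    <value>/data/disk3/hdfs,/data/disk4/hdfs</value>
  </property>
</configuration>
```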

However, when starting up this configuration, I get the below exception
(UnregisteredDatanodeException) in the namenode log, and the datanode
then shuts down after reporting the same.

How can I work around this?

Removing the VERSION file in the data dir does not help; the datanode
just exits with an exception about the data dir being in an inconsistent
state.

Can I simply edit the VERSION file in the data dirs used by the new
instance, replacing e.g. the port number that's there with the new,
correct port number? Or will that confuse the datanode or namenode?
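
Mechanically, the edit I have in mind is something like the sketch below. The
VERSION file contents here are fabricated for illustration (only the storage ID
format matches the one in the log below), and I don't know whether the namenode
will actually accept the rewritten storage ID:

```shell
# Sketch only: fabricate a sample data dir to show the edit. On the real node
# this would be run against each data dir of the second datanode instance.
DATA_DIR=$(mktemp -d)
cat > "$DATA_DIR/VERSION" <<'EOF'
namespaceID=1234567890
storageID=DS-71308762-10.20.11.66-50010-1269957604444
cTime=0
storageType=DATA_NODE
layoutVersion=-19
EOF

# Replace the old datanode port embedded in the storage ID with the new
# instance's port, so the two instances no longer report the same storage ID.
sed -i 's/-50010-/-50011-/' "$DATA_DIR/VERSION"

grep storageID "$DATA_DIR/VERSION"
# → storageID=DS-71308762-10.20.11.66-50011-1269957604444
```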

Or should I start the datanode with an empty data dir, let it register
with the namenode, immediately shut it down, then use the VERSION file
from the empty datadir as new VERSION file for all the data dirs that
already contain data?

I'm guessing what I'm trying to do is equivalent to moving disks from
one host to another, something I can imagine happening in some system
administration situations. So what would be the procedure for that?

Any help would be appreciated.

Thanks,
\EF

Full exception in namenode log:


2011-12-07 09:45:16,699 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000, call blockReceived(DatanodeRegistration(10.20.40.14:50011, storageID=DS-71308762-10.20.11.66-50010-1269957604444, infoPort=50081, ipcPort=50021), [Lorg.apache.hadoop.hdfs.protocol.Block;@3aa57508, [Ljava.lang.String;@44a67e4c) from 10.20.40.14:58464: error: org.apache.hadoop.hdfs.protocol.UnregisteredDatanodeException: Data node 10.20.40.14:50011 is attempting to report storage ID DS-71308762-10.20.11.66-50010-1269957604444. Node 10.20.40.14:50010 is expected to serve this storage.
org.apache.hadoop.hdfs.protocol.UnregisteredDatanodeException: Data node 10.20.40.14:50011 is attempting to report storage ID DS-71308762-10.20.11.66-50010-1269957604444. Node 10.20.40.14:50010 is expected to serve this storage.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDatanode(FSNamesystem.java:3972)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:3388)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:776)
        at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)

Discussion Overview
group: hdfs-user @ hadoop
posted: Dec 7, '11 at 10:43a
active: Dec 7, '11 at 8:21p
posts: 5
users: 3
website: hadoop.apache.org...
irc: #hadoop