Hi,

The datanode must be able to connect to the namenode for the upgrade to
make any progress. Do you see any other errors reported in the datanode
log? You need to fix the connection problem first.
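As a quick first check of that connection, something along these lines can be run on an affected datanode (a sketch: the namenode address 192.168.2.1:9000 is taken from the logs below, and the availability of `nc` on the host is an assumption):

```shell
# Run on an affected datanode. The namenode RPC address below is the one
# that appears in this thread's logs; adjust for your cluster.
NN_HOST=192.168.2.1
NN_PORT=9000

# Probe the RPC port without sending data (-z), with a 5-second timeout (-w 5).
if nc -z -w 5 "$NN_HOST" "$NN_PORT"; then
  echo "namenode RPC port is reachable"
else
  echo "cannot reach $NN_HOST:$NN_PORT -- check firewalls, routing, and that the namenode is listening"
fi
```

If the probe fails, the upgrade cannot progress no matter what the datanode process itself reports.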

Are you comfortable taking a tcpdump of the Namenode port on the client? I
think the client should be trying to reconnect.
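A capture along those lines could look like the following (a sketch: the interface name `eth0` is an assumption, and the address and port are the ones from this thread's logs):

```shell
# Run as root on the client/datanode side. This captures all traffic to or
# from the namenode's RPC port, so you can see whether SYNs go unanswered,
# connections are reset, or the stream is closed mid-write.
# eth0 is an assumption; use the interface that routes to 192.168.2.1.
tcpdump -i eth0 -nn -s 0 -w namenode-9000.pcap host 192.168.2.1 and port 9000
```

Reading the resulting file back with `tcpdump -r namenode-9000.pcap` (or Wireshark) shows whether the TCP handshake completes and which side closes the connection.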

Note that it is safe to restart the cluster or just the datanodes before
the upgrade completes.

Raghu.
Open Study wrote:
Also, I checked the log of the namenode and found one exception, as follows:

2007-09-13 02:17:25,324 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 6 on 9000: starting
2007-09-13 02:17:25,324 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 7 on 9000: starting
2007-09-13 02:17:25,324 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 8 on 9000: starting
2007-09-13 02:17:25,325 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 9 on 9000: starting
2007-09-13 02:17:25,400 INFO org.apache.hadoop.dfs.BlockCrcUpgradeNamenode: Block CRC Upgrade is still running. Avg completion of all Datanodes: 0.00% with 0 errors.
2007-09-13 02:17:25,406 WARN org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call getProtocolVersion(org.apache.hadoop.dfs.ClientProtocol, 14) from 192.168.2.1:53211: output error
java.nio.channels.ClosedChannelException
    at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:125)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:294)
    at org.apache.hadoop.ipc.SocketChannelOutputStream.flushBuffer(SocketChannelOutputStream.java:108)
    at org.apache.hadoop.ipc.SocketChannelOutputStream.write(SocketChannelOutputStream.java:89)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
    at java.io.DataOutputStream.flush(DataOutputStream.java:106)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:585)
2007-09-13 02:18:24,921 INFO org.apache.hadoop.dfs.BlockCrcUpgradeNamenode: Block CRC Upgrade is still running. Avg completion of all Datanodes: 0.00% with 0 errors.

It seems something was going wrong on the datanode side. However, the log
of one of the datanodes shows that it started, and it was still running
(I can see it in the process list), but it somehow lost its connection to
the namenode.

2007-09-12 22:23:35,319 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = TE-DN-002/192.168.2.102
STARTUP_MSG: args = []
************************************************************/
2007-09-12 22:23:35,533 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
2007-09-12 22:24:35,619 INFO org.apache.hadoop.ipc.RPC: Problem connecting to server: /192.168.2.1:9000
2007-09-12 22:25:34,878 INFO org.apache.hadoop.dfs.Storage: Recovering storage directory /home/textd/data/fs/data from previous upgrade.
2007-09-12 22:25:49,586 INFO org.apache.hadoop.dfs.DataNode: Distributed upgrade for DataNode version -6 to current LV -7 is initialized.
2007-09-12 22:25:49,586 INFO org.apache.hadoop.dfs.Storage: Upgrading storage directory /home/textd/data/fs/data.
old LV = -4; old CTime = 0.
new LV = -7; new CTime = 1189616555276

The hardware configuration was:
Namenode: P4D, 3 GB RAM
3 Datanodes: AMD 64 4000 x2, 1 GB RAM
They worked with Hadoop 0.13.1.

Any idea or suggestion?

Discussion overview: group common-user, category hadoop; posted Sep 12, '07 at 6:29p; active Sep 12, '07 at 8:00p; 7 posts, 3 users; website hadoop.apache.org...; IRC #hadoop.
