Issue distcp'ing from 0.19.2 to 0.18.3
Hey all,

I was trying to copy some data from our cluster on 0.19.2 to a new
cluster on 0.18.3 by using distcp and the hftp:// filesystem.
Everything seemed to be going fine for a few hours, but then a few
tasks failed because a few files returned 500 errors when read from
the 0.19 cluster. As a result the job died. Now that I'm trying to
restart it, I get this error:

[rapleaf@ds-nn2 ~]$ hadoop distcp hftp://ds-nn1:7276/ hdfs://ds-nn2:7276/cluster-a
09/04/08 23:32:39 INFO tools.DistCp: srcPaths=[hftp://ds-nn1:7276/]
09/04/08 23:32:39 INFO tools.DistCp: destPath=hdfs://ds-nn2:7276/cluster-a
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.net.SocketException: Unexpected end of file from server
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:769)
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:766)
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1000)
        at org.apache.hadoop.dfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:183)
        at org.apache.hadoop.dfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:193)
        at org.apache.hadoop.dfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:222)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:667)
        at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:588)
        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:609)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:768)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:788)

I changed nothing at all between the first attempt and the subsequent
failed attempts. The only clue in the namenode log for the 0.19
cluster is:

2009-04-08 23:29:09,786 WARN org.apache.hadoop.ipc.Server: Incorrect header or version mismatch from 10.100.50.252:47733 got version 47 expected version 2

Anyone have any ideas?

-Bryan

  • Todd Lipcon at Apr 9, 2009 at 6:57 am
    Hey Bryan,

    Any chance you can get a tshark trace on the 0.19 namenode? Maybe:

        tshark -s 100000 -w nndump.pcap port 7276

    Also, are the clocks synced on the two machines? The failure of your distcp
    is at 23:32:39, but the namenode log message you posted was 23:29:09. Did
    those messages actually pop out at the same time?

    Thanks
    -Todd
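
    For reference, a sketch of that capture plus a read-back step; the
    read-back is an assumption, not something specified above:

        # Capture packets on the namenode IPC port (7276 here), keeping up
        # to 100000 bytes of each packet, and write them to nndump.pcap.
        tshark -s 100000 -w nndump.pcap port 7276

        # Afterwards, read the saved capture back for inspection.
        tshark -r nndump.pcap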
  • Bryan Duxbury at Apr 9, 2009 at 3:57 pm
    Ah, never mind. It turns out that I just shouldn't rely on command
    history so much. I accidentally pointed hftp:// at the namenode's
    IPC port rather than its HTTP port. It appears to be starting a
    regular copy again.

    -Bryan
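
    A sketch of the corrected invocation; 50070 is the stock
    dfs.http.address port for this era, so the actual HTTP port here
    depends on the cluster's configuration:

        # hftp:// must point at the namenode's HTTP port (dfs.http.address,
        # 50070 by default), not the IPC port from fs.default.name.
        hadoop distcp hftp://ds-nn1:50070/ hdfs://ds-nn2:7276/cluster-a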
  • Jun Rao at Apr 9, 2009 at 5:30 pm
    Hi,

    Is there a way to set JVM parameters only for reduce tasks in Hadoop?

    Thanks,

    Jun
    IBM Almaden Research Center
    K55/B1, 650 Harry Road, San Jose, CA 95120-6099
    junrao@almaden.ibm.com
  • Philip Zeyliger at Apr 9, 2009 at 8:56 pm
    There doesn't seem to be. The command line for the JVM is computed in
    org.apache.hadoop.mapred.TaskRunner#run().
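
    For context, the knob that did exist at the time applied to map and
    reduce tasks alike; a sketch, assuming a job that goes through
    ToolRunner so -D properties are picked up (my-job.jar and MyJob are
    placeholders):

        # mapred.child.java.opts sets JVM options for every child task,
        # map and reduce alike; there was no reduce-only variant yet.
        hadoop jar my-job.jar MyJob -Dmapred.child.java.opts=-Xmx1024m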
  • Koji Noguchi at Apr 9, 2009 at 3:20 pm
    Bryan,

    hftp://ds-nn1:7276
    hdfs://ds-nn2:7276

    Are you using the same port number for hftp and hdfs?

    Looking at the stack trace, it seems like it failed before the
    distcp job even started.

    Koji
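
    As an aside, the "got version 47" warning fits this theory: 47 is
    the ASCII code for '/', which is the byte the RPC server would read
    as a version when an HTTP "GET /" request arrives on the IPC port. A
    hypothetical way to reproduce the warning against that port:

        # Speaking HTTP to the namenode IPC port should make the RPC server
        # log "Incorrect header or version mismatch ... got version 47".
        curl http://ds-nn1:7276/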

