Hi All,

I am trying to copy files from one Hadoop cluster to another, but I am
getting the following error:

[phx1-rb-bi-dev50-metrics-qry1:]$ scripts/hadoop.sh distcp
hftp://c17-dw-dev50-hdfs-dn-n1:50070/user/sgehlot/fact_lead.v0.txt.gz \
hdfs://phx1-rb-dev40-pipe1.cnet.com:9000/user/sgehlot
HADOOP_HOME: /home/sgehlot/cnwk-hadoop/hadoop/0.20.1
HADOOP_CONF_DIR: /home/sgehlot/cnwk-hadoop/config/hadoop/0.20.1/conf_rb-dev
11/04/18 17:12:23 INFO tools.DistCp: srcPaths=[hftp://c17-dw-dev50-hdfs-dn-n1:50070/user/sgehlot/fact_lead.v0.txt.gz]
11/04/18 17:12:23 INFO tools.DistCp: destPath=hdfs://phx1-rb-dev40-pipe1.cnet.com:9000/user/sgehlot
[Fatal Error] :1:186: XML document structures must start and end within the same entity.
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.io.IOException: invalid xml directory content
    at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:239)
    at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:244)
    at org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:273)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:689)
    at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:621)
    at org.apache.hadoop.tools.DistCp.copy(DistCp.java:638)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:857)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:884)
Caused by: org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
    at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:233)
    ... 9 more

Any idea why I am getting this?

Thanks,
Sonia


  • James Seigel at Apr 19, 2011 at 2:16 pm
    Are both clusters running the same version of Hadoop?

    Sent from my mobile. Please excuse the typos.
  • Sonia gehlot at Apr 19, 2011 at 2:20 pm
    Yes, both clusters are running the same version of Hadoop.


  • Sonia gehlot at Apr 19, 2011 at 2:37 pm
    Sorry guys, it was a typo on my side. It works now.

    Thanks,
    Sonia
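
The thread does not say which part of the command was mistyped. As a rough aid for catching that kind of mistake before launching a job, here is a minimal sketch that sanity-checks the two URIs passed to distcp. The function name `check_distcp_uri` and the rule that only `hftp` and `hdfs` schemes are accepted are assumptions made for this example; they are not part of Hadoop.

```python
from urllib.parse import urlsplit

# Assumption: only the two schemes that appear in this thread are expected.
ALLOWED_SCHEMES = {"hftp", "hdfs"}


def check_distcp_uri(uri):
    """Return a list of human-readable problems found in one distcp URI."""
    problems = []
    text = uri.strip()
    parts = urlsplit(text)

    if parts.scheme not in ALLOWED_SCHEMES:
        problems.append(f"unexpected scheme {parts.scheme!r}")
    if not parts.hostname:
        problems.append("missing host name")

    # urlsplit raises ValueError when the port is not a valid integer.
    try:
        port = parts.port
    except ValueError:
        problems.append("port is not a valid number")
    else:
        if port is None:
            problems.append("no explicit port")

    if not parts.path.startswith("/"):
        problems.append("path is not absolute")
    if any(ch.isspace() for ch in text):
        problems.append("URI contains whitespace")
    return problems


if __name__ == "__main__":
    # The source and destination exactly as given in the original post.
    for uri in (
        "hftp://c17-dw-dev50-hdfs-dn-n1:50070/user/sgehlot/fact_lead.v0.txt.gz",
        "hdfs://phx1-rb-dev40-pipe1.cnet.com:9000/user/sgehlot",
    ):
        print(uri, check_distcp_uri(uri) or "looks well-formed")
```

A check like this only catches syntactic slips (a dropped colon, a mangled port, stray whitespace). It cannot tell whether the host and port actually point at the right service on the source cluster.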
  • Mapred Learn at May 2, 2011 at 8:41 pm
    Hi,
    Following up on the same email chain: which ports need to be open for
    distcp to work between two Hadoop clusters?


    Thanks,
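
The thread ends without an answer to this question. One way to start investigating is to test, from a node that will run the copy, whether the endpoints named in the distcp command are reachable at all. The sketch below does only that; `is_port_open` is a hypothetical helper name, and the host/port pairs are the ones from the original post. DataNode ports are not covered: distcp tasks also transfer data to and from DataNodes, and the ports for those depend on each cluster's configuration (for example `dfs.datanode.address` and `dfs.datanode.http.address`), so they would need to be added to the list by hand.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Endpoints taken from the command in the original post:
# the hftp source (HTTP port) and the hdfs destination (RPC port).
ENDPOINTS = [
    ("c17-dw-dev50-hdfs-dn-n1", 50070),
    ("phx1-rb-dev40-pipe1.cnet.com", 9000),
]


if __name__ == "__main__":
    for host, port in ENDPOINTS:
        state = "open" if is_port_open(host, port) else "unreachable"
        print(f"{host}:{port} {state}")
```

Because the copy runs as map tasks on cluster nodes, a probe like this is most informative when run from those nodes rather than only from the machine that submits the job.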



Discussion Overview
group: common-user @
categories: hadoop
posted: Apr 19, '11 at 2:05p
active: May 2, '11 at 8:41p
posts: 5
users: 3
website: hadoop.apache.org...
irc: #hadoop
