Hadoop distcp tool fails if file path contains special characters + & !
-----------------------------------------------------------------------

Key: HADOOP-3768
URL: https://issues.apache.org/jira/browse/HADOOP-3768
Project: Hadoop Core
Issue Type: Bug
Affects Versions: 0.17.2, 0.18.0
Reporter: Viraj Bhat


Copying folders whose names contain the characters +, & or ! between HDFS instances (using hftp) does not work with distcp.

For example:
Copying the folder "string1+string2" from "namenode.address.com" (hftp port myport) to "/myotherhome/folder" on "myothermachine" fails:

myothermachine prompt>>> hadoop --config ~/mycluster/ distcp "hftp://namenode.address.com:myport/myhome/dir/string1+string2" /myotherhome/folder/
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Error results for hadoop job1:
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
08/07/16 00:27:39 INFO tools.DistCp: srcPaths=[hftp://namenode.address.com:myport/myhome/dir/string1+string2]
08/07/16 00:27:39 INFO tools.DistCp: destPath=/myotherhome/folder/
08/07/16 00:27:41 INFO tools.DistCp: srcCount=2
08/07/16 00:27:42 INFO mapred.JobClient: Running job: job1
08/07/16 00:27:43 INFO mapred.JobClient: map 0% reduce 0%
08/07/16 00:27:58 INFO mapred.JobClient: Task Id : attempt_1_m_000000_0, Status : FAILED
java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:538)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:226)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2208)

08/07/16 00:28:14 INFO mapred.JobClient: Task Id : attempt_1_m_000000_1, Status : FAILED
java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:538)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:226)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2208)

08/07/16 00:28:28 INFO mapred.JobClient: Task Id : attempt_1_m_000000_2, Status : FAILED
java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:538)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:226)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2208)

With failures, global counters are inaccurate; consider running with -i
Copy failed: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1053)
at org.apache.hadoop.tools.DistCp.copy(DistCp.java:615)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:764)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:784)
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Error log for the map task which failed
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
INFO org.apache.hadoop.tools.DistCp: FAIL string1+string2/myjobtrackermachine.com-joblog.tar.gz : java.io.IOException: Server returned HTTP response code: 500 for URL: http://mymachine.com:myport/streamFile?filename=/myhome/dir/string1+string2/myjobtrackermachine.com-joblog.tar.gz&ugi=myid,mygroup
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1241)
at org.apache.hadoop.dfs.HftpFileSystem.open(HftpFileSystem.java:117)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:371)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:377)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:504)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:279)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:226)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2208)
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
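
One possible reading of the map task log above (an assumption, not something confirmed in this thread): the file name is placed in the streamFile query string without percent-encoding, so a form-style decoder on the receiving side would turn "+" into a space and look up a path that does not exist. The sketch below only illustrates that general URL behaviour with java.net.URLDecoder and java.net.URLEncoder. The class PlusInQueryDemo and its helper methods are hypothetical names for illustration and are not part of Hadoop.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; not taken from the Hadoop source tree.
class PlusInQueryDemo {

    // Decode a raw query-parameter value the way form-style decoders do:
    // "+" becomes a space and %XX escapes are resolved.
    static String decodeParam(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Percent-encode a value before placing it in a query string,
    // so that "+", "&" and "!" survive the round trip.
    static String encodeParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Path copied from the failing URL in the map task log.
        String path = "/myhome/dir/string1+string2/myjobtrackermachine.com-joblog.tar.gz";

        // Sent as-is, the "+" is decoded to a space on the receiving side.
        System.out.println(decodeParam(path));

        // Encoded first, the original path is recovered after decoding.
        String encoded = encodeParam(path);
        System.out.println(encoded);
        System.out.println(decodeParam(encoded).equals(path));
    }
}
```

Under the same assumption, "&" would be a further problem: it separates query parameters, so an unencoded "&" in a file name would end the filename value early. Whether this is the actual cause of the HTTP 500 would need to be checked against the servlet code.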

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Robert Chansler (JIRA) at Aug 8, 2008 at 6:14 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Robert Chansler updated HADOOP-3768:
    ------------------------------------

    Component/s: dfs

Discussion Overview
group: common-dev @
categories: hadoop
posted: Jul 16, '08 at 1:00a
active: Aug 8, '08 at 6:14p