Ravi Gummadi (JIRA) |
at Jun 18, 2009 at 4:45 am
[ https://issues.apache.org/jira/browse/HADOOP-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721060#action_12721060 ]
Ravi Gummadi commented on HADOOP-6051:
--------------------------------------
Earlier, only file sizes were checked. In trunk, checksums are now also checked after the file sizes.
In any case, if I run the following command multiple times
hadoop distcp -update srcfile destfile
and if destfile doesn't exist at first, -update should allow the file to be copied only once; from the second run onwards it should skip the copy,
since the file sizes (and checksums) are the same.
But the problem here seems to be that distcp is not comparing the file sizes and checksums of srcfile and destfile. It seems to be comparing srcfile with
the path destfile/srcfile (i.e. srcfile inside a destfile directory), which is wrong.
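To make the expected behaviour concrete, here is a minimal sketch of how the comparison target could be chosen for a single-file copy. It is not distcp's actual code: the function name resolve_target and the dest_is_dir flag are hypothetical, and the rule shown is only the one argued for in this comment.

```python
import posixpath


def resolve_target(src, dest, dest_is_dir):
    """Return the path a single-file copy should be compared against.

    Hypothetical helper: `dest_is_dir` stands in for a filesystem check
    that the destination is an existing directory.
    """
    if dest_is_dir:
        # Destination is an existing directory: the file lands inside it.
        return posixpath.join(dest, posixpath.basename(src))
    # Destination is absent or an existing file: it names the copy itself,
    # so srcfile must be compared with destfile, not destfile/srcfile.
    return dest
```

With this rule, resolve_target("/data/srcfile", "/out/destfile", False) gives "/out/destfile", so a second -update run would compare srcfile against the file that the first run wrote.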
distcp does not skip copying file if we are updating single file
----------------------------------------------------------------
Key: HADOOP-6051
URL: https://issues.apache.org/jira/browse/HADOOP-6051
Project: Hadoop Core
Issue Type: Bug
Components: tools/distcp
Affects Versions: 0.21.0
Reporter: Ravi Gummadi
Fix For: 0.21.0
distcp doesn't skip copying the file when we run -update on a single file and the destfile already exists.
When we do
hadoop distcp -update srcfilename destfilename
it seems to be comparing the checksums of srcfilename and destfilename/srcfilename, so the skip is not done. It should compare the checksums of srcfilename and destfilename.
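The skip decision described above can be sketched as follows. This is an illustration only, under stated assumptions: it runs on local files rather than HDFS, uses MD5 from hashlib as a stand-in for the filesystem's file checksum, and the names file_checksum and should_skip are hypothetical, not part of distcp.

```python
import hashlib
import os


def file_checksum(path, chunk_size=1 << 20):
    """Checksum a local file in chunks (MD5 is an assumed stand-in)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_skip(src, dest):
    """Return True when an -update style copy of src to dest can be skipped."""
    if not os.path.isfile(dest):
        # Nothing to compare against yet: the first run must copy.
        return False
    if os.path.getsize(src) != os.path.getsize(dest):
        # Cheap size check first; a mismatch means the file must be copied.
        return False
    # Sizes match, so fall back to the more expensive checksum comparison.
    return file_checksum(src) == file_checksum(dest)
```

The important point is that both arguments are the file paths themselves: dest here is destfilename, not destfilename/srcfilename.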
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.