[ https://issues.apache.org/jira/browse/HADOOP-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565919#action_12565919 ]

Owen O'Malley commented on HADOOP-2027:
---------------------------------------

Note that we also need Map/Reduce to use the new method so that it makes only one call per file to get block sizes. This would require FileSplit to have a new constructor that takes an array of locations rather than computing them on demand. The locations do NOT need to be serialized in the readFields/write methods. FileInputFormat should use a single call to getFileLocations rather than the current getSize, getBlockSize, and getFileCacheHints (down in FileSplit).
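
To make the point about serialization concrete, here is a minimal sketch of a split that is handed its locations at construction time and leaves them out of the wire format. It is not the Hadoop FileSplit: the class name SplitSketch, its fields, and the roundTrip helper are all hypothetical, and plain java.io streams stand in for the Writable machinery.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a file split; not the real FileSplit class. */
class SplitSketch {
    private String path;
    private long start;
    private long length;
    // Hosts are only known on the side that computed the split.
    private String[] hosts = new String[0];

    /** No-argument constructor used before readFields(). */
    SplitSketch() {}

    /** Locations are passed in instead of being looked up on demand. */
    SplitSketch(String path, long start, long length, String[] hosts) {
        this.path = path;
        this.start = start;
        this.length = length;
        this.hosts = hosts.clone();
    }

    String getPath() { return path; }
    long getStart() { return start; }
    long getLength() { return length; }
    String[] getLocations() { return hosts.clone(); }

    /** Writes path, start and length only; hosts are deliberately skipped. */
    void write(DataOutput out) throws IOException {
        out.writeUTF(path);
        out.writeLong(start);
        out.writeLong(length);
    }

    /** Reads back what write() produced; hosts come back empty. */
    void readFields(DataInput in) throws IOException {
        path = in.readUTF();
        start = in.readLong();
        length = in.readLong();
        hosts = new String[0];
    }

    /** Serializes and deserializes a split in memory (illustration only). */
    static SplitSketch roundTrip(SplitSketch split) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            split.write(new DataOutputStream(buffer));
            SplitSketch copy = new SplitSketch();
            copy.readFields(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the scheduling side can read getLocations() from the split it built, while a deserialized copy still carries the path and byte range it needs to read its data.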
FileSystem should provide byte ranges for file locations
--------------------------------------------------------

Key: HADOOP-2027
URL: https://issues.apache.org/jira/browse/HADOOP-2027
Project: Hadoop Core
Issue Type: Bug
Components: fs
Reporter: Owen O'Malley
Assignee: lohit vijayarenu

FileSystem's getFileCacheHints should be replaced with something more useful. I'd suggest replacing getFileCacheHints with a new method:
{code}
BlockLocation[] getFileLocations(Path file, long offset, long range) throws IOException;
{code}
and adding
{code}
class BlockLocation implements Writable {
String[] getHosts();
long getOffset();
long getLength();
}
{code}
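
As a rough illustration of what a byte-range query could return, the sketch below models each block as hosts plus an offset and length, and selects the blocks that overlap the half-open range [offset, offset + range). Everything here is an assumption for illustration: the class name BlockLocationSketch, the in-memory block array standing in for a file system, and the overlap rule. Writable support is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for the proposed BlockLocation. */
class BlockLocationSketch {
    private final String[] hosts;
    private final long offset;
    private final long length;

    BlockLocationSketch(String[] hosts, long offset, long length) {
        this.hosts = hosts.clone();
        this.offset = offset;
        this.length = length;
    }

    String[] getHosts() { return hosts.clone(); }
    long getOffset() { return offset; }
    long getLength() { return length; }

    /**
     * Returns the blocks of one file that overlap [offset, offset + range).
     * The blocks array stands in for whatever the file system knows
     * about the file's layout.
     */
    static BlockLocationSketch[] getFileLocations(
            BlockLocationSketch[] blocks, long offset, long range) {
        List<BlockLocationSketch> hits = new ArrayList<>();
        long end = offset + range;
        for (BlockLocationSketch block : blocks) {
            long blockEnd = block.getOffset() + block.getLength();
            // Two half-open intervals overlap when each starts before the other ends.
            if (block.getOffset() < end && blockEnd > offset) {
                hits.add(block);
            }
        }
        return hits.toArray(new BlockLocationSketch[0]);
    }
}
```

With one call like this a caller gets both the hosts and the byte boundaries of every block in the range, which is the information the separate size, block-size, and cache-hint calls provide today.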
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • lohit vijayarenu (JIRA) at Feb 12, 2008 at 1:41 am
    [ https://issues.apache.org/jira/browse/HADOOP-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567908#action_12567908 ]

    lohit vijayarenu commented on HADOOP-2027:
    ------------------------------------------

    I ran sort (twice) on 100 nodes on trunk+this patch. It took 28.4 and 27.3 minutes. Mukund mentioned it took 29.04 min on trunk.
  • Hadoop QA (JIRA) at Feb 12, 2008 at 6:28 am
    [ https://issues.apache.org/jira/browse/HADOOP-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567980#action_12567980 ]

    Hadoop QA commented on HADOOP-2027:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12375144/HADOOP-2027-1.patch
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 21 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac -1. The applied patch generated 619 javac compiler warnings (more than the trunk's current 608 warnings).

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs -1. The patch appears to introduce 3 new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1781/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1781/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1781/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1781/console

    This message is automatically generated.
  • Owen O'Malley (JIRA) at Feb 12, 2008 at 6:43 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568276#action_12568276 ]

    Owen O'Malley commented on HADOOP-2027:
    ---------------------------------------

    You should:
    1. Not use strings of '*' around your javadoc.
    2. Fill in the javadoc of public methods in BlockLocation.
    3. I'd prefer using String[] in BlockLocation, since the API uses String rather than Text.
    4. FileSystem.getFileBlockLocations should just pass the desired values into the constructor rather than setting them all one by one; the same goes for DFSClient.
    5. In FileInputFormat, continuation lines should be indented to align with the opening parenthesis.
    6. Fix the calls to the now deprecated methods.

    Thanks! I'm looking forward to this patch.
  • Hadoop QA (JIRA) at Feb 13, 2008 at 10:28 am
    [ https://issues.apache.org/jira/browse/HADOOP-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568498#action_12568498 ]

    Hadoop QA commented on HADOOP-2027:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12375464/HADOOP-2027-2.patch
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 30 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac -1. The applied patch generated 608 javac compiler warnings (more than the trunk's current 604 warnings).

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs -1. The patch appears to introduce 3 new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1788/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1788/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1788/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1788/console

    This message is automatically generated.
  • Hadoop QA (JIRA) at Feb 13, 2008 at 9:07 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568715#action_12568715 ]

    Hadoop QA commented on HADOOP-2027:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12375522/HADOOP-2027-5.patch
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 33 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs -1. The patch appears to cause Findbugs to fail.

    core tests -1. The patch failed core unit tests.

    contrib tests -1. The patch failed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1790/testReport/
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1790/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1790/console

    This message is automatically generated.
  • Hadoop QA (JIRA) at Feb 13, 2008 at 11:11 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568746#action_12568746 ]

    Hadoop QA commented on HADOOP-2027:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12375528/HADOOP-2027-6.patch
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 33 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac -1. The applied patch generated 605 javac compiler warnings (more than the trunk's current 603 warnings).

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1791/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1791/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1791/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1791/console

    This message is automatically generated.
  • lohit vijayarenu (JIRA) at Feb 13, 2008 at 11:17 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568747#action_12568747 ]

    lohit vijayarenu commented on HADOOP-2027:
    ------------------------------------------

    The javac warnings were expected due to other deprecated APIs.
  • lohit vijayarenu (JIRA) at Feb 14, 2008 at 12:15 am
    [ https://issues.apache.org/jira/browse/HADOOP-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568760#action_12568760 ]

    lohit vijayarenu commented on HADOOP-2027:
    ------------------------------------------

    Found the 2 additional warnings; both come from PhasedFileSystem.java:
    [javac] /zonestorage/hudson/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/src/java/org/apache/hadoop/mapred/PhasedFileSystem.java:300: warning: [deprecation] getFileCacheHints(org.apache.hadoop.fs.Path,long,long) in org.apache.hadoop.fs.FilterFileSystem has been deprecated
    [javac] /zonestorage/hudson/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/src/java/org/apache/hadoop/mapred/PhasedFileSystem.java:300: warning: [deprecation] getFileCacheHints(org.apache.hadoop.fs.Path,long,long) in org.apache.hadoop.fs.FileSystem has been deprecated
  • Hadoop QA (JIRA) at Feb 28, 2008 at 10:58 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573504#action_12573504 ]

    Hadoop QA commented on HADOOP-2027:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12376764/HADOOP-2027-9.patch
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 33 new or modified tests.

    patch -1. The patch command could not apply the patch.

    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1866/console

    This message is automatically generated.
  • Hadoop QA (JIRA) at Feb 29, 2008 at 8:51 am
    [ https://issues.apache.org/jira/browse/HADOOP-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573637#action_12573637 ]

    Hadoop QA commented on HADOOP-2027:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12376793/HADOOP-2027-10.patch
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 33 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac -1. The applied patch generated 617 javac compiler warnings (more than the trunk's current 614 warnings).

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests -1. The patch failed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1874/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1874/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1874/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1874/console

    This message is automatically generated.
    Attachments: HADOOP-2027-1.patch, HADOOP-2027-10.patch, HADOOP-2027-2.patch, HADOOP-2027-3.patch, HADOOP-2027-4.patch, HADOOP-2027-5.patch, HADOOP-2027-6.patch, HADOOP-2027-7.patch, HADOOP-2027-8.patch, HADOOP-2027-9.patch


