FileSystem should return location information with byte ranges
--------------------------------------------------------------

Key: HADOOP-2187
URL: https://issues.apache.org/jira/browse/HADOOP-2187
Project: Hadoop
Issue Type: Improvement
Components: fs
Reporter: Owen O'Malley
Fix For: 0.16.0


The FileSystem interface should provide location information with byte ranges rather than a String[][] of locations. I suggest that we deprecate FileSystem.getFileCacheHints and replace it with:
{code}
abstract public class FileSystem {
  ...
  public static class BlockInformation implements Writable {
    public BlockInformation(long start, String[] locations) {...}
    public String[] getHosts() {...}
    public long getStartingOffset() {...}
  }
  BlockInformation[] getFileLocations(Path f, long start, long length) { ... }
}
{code}
This will allow us to fix the FileInputFormat in map/reduce to make just one call per file to the name node instead of one per block.
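
To illustrate the one-call-per-file idea, here is a minimal sketch of how split computation might consume the proposed return value. It is not the actual FileInputFormat code: BlockInformation below is a simplified stand-in for the proposed class (Writable omitted), and Split, SplitPlanner and planSplits are hypothetical names introduced only for this example. Because the proposed class carries a starting offset but no length, the sketch assumes the blocks arrive sorted by offset and infers each length from the next block's start.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the proposed FileSystem.BlockInformation
// (assumption: the real class would also implement Writable).
final class BlockInformation {
    private final long start;
    private final String[] hosts;

    BlockInformation(long start, String[] hosts) {
        this.start = start;
        this.hosts = hosts;
    }

    long getStartingOffset() { return start; }
    String[] getHosts() { return hosts; }
}

// Hypothetical split record: a byte range plus the hosts that store it.
final class Split {
    final long offset;
    final long length;
    final String[] hosts;

    Split(long offset, long length, String[] hosts) {
        this.offset = offset;
        this.length = length;
        this.hosts = hosts;
    }
}

final class SplitPlanner {
    // Builds one split per block from the result of a single location lookup.
    // Assumes blocks are sorted by starting offset and end at fileLength.
    static List<Split> planSplits(BlockInformation[] blocks, long fileLength) {
        List<Split> splits = new ArrayList<Split>();
        for (int i = 0; i < blocks.length; i++) {
            long begin = blocks[i].getStartingOffset();
            long end = (i + 1 < blocks.length)
                    ? blocks[i + 1].getStartingOffset()
                    : fileLength;
            splits.add(new Split(begin, end - begin, blocks[i].getHosts()));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Stands in for what one getFileLocations(f, 0, 300) call might return.
        BlockInformation[] blocks = {
            new BlockInformation(0L, new String[] {"host1", "host2"}),
            new BlockInformation(128L, new String[] {"host2", "host3"}),
            new BlockInformation(256L, new String[] {"host1"})
        };
        for (Split s : planSplits(blocks, 300L)) {
            System.out.println(s.offset + "+" + s.length + " on " + s.hosts[0]);
        }
    }
}
```

The point of the sketch is that every split for a file is derived from one array of block records, so the name node is contacted once per file rather than once per block.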

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Doug Cutting (JIRA) at Nov 12, 2007 at 8:35 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541943 ]

    Doug Cutting commented on HADOOP-2187:
    --------------------------------------

    +1 This sounds like a good change.

    I might instead call the class BlockLocations, and the method getBlockLocations.

    When we deprecate the existing method, ideally we can upgrade all existing implementations with a single back-compatibility implementation in the base class.

    Also, should we refer to hosts by hostname or IP here?
  • Konstantin Shvachko (JIRA) at Nov 14, 2007 at 11:11 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542626 ]

    Konstantin Shvachko commented on HADOOP-2187:
    ---------------------------------------------

    I agree, especially since HDFS already calls getBlockLocations(String src, long offset, long length) inside getHints(). See HADOOP-894.
    The return class is called LocatedBlocks. It could be generalized into an interface if that makes sense for other file systems.
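
The single back-compatibility implementation suggested in the first comment could look roughly like the sketch below. It is only an illustration under stated assumptions: FileSystemSketch is a hypothetical, trimmed-down base class; paths are plain strings rather than Path; getBlockSize is an assumed helper; and the String[][] hints are assumed to hold one row of hosts per fixed-size block, starting with the block that contains the requested offset.

```java
// Simplified stand-in for the proposed return type (Writable omitted).
final class BlockInformation {
    private final long start;
    private final String[] hosts;

    BlockInformation(long start, String[] hosts) {
        this.start = start;
        this.hosts = hosts;
    }

    long getStartingOffset() { return start; }
    String[] getHosts() { return hosts; }
}

// Hypothetical base class containing only the methods discussed here.
abstract class FileSystemSketch {
    // Stand-in for the existing hints call: one row of hosts per block
    // overlapping the requested range (assumption).
    abstract String[][] getFileCacheHints(String path, long start, long length);

    // Assumed helper returning the file's fixed block size in bytes.
    abstract long getBlockSize(String path);

    // Default implementation on the base class, so existing subclasses
    // that only provide getFileCacheHints keep working unchanged.
    BlockInformation[] getFileLocations(String path, long start, long length) {
        String[][] hints = getFileCacheHints(path, start, length);
        long blockSize = getBlockSize(path);
        // Round the requested offset down to the start of its block.
        long firstBlockStart = (start / blockSize) * blockSize;
        BlockInformation[] result = new BlockInformation[hints.length];
        for (int i = 0; i < hints.length; i++) {
            result[i] = new BlockInformation(firstBlockStart + i * blockSize, hints[i]);
        }
        return result;
    }
}
```

A file system that can report real block boundaries, as HDFS can through its block location call, would override getFileLocations directly instead of relying on the fixed-block-size assumption.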


Discussion Overview
group: common-dev
categories: hadoop
posted: Nov 10, '07 at 5:24p
active: Nov 14, '07 at 11:11p
posts: 3
users: 1
website: hadoop.apache.org...
irc: #hadoop
