Konstantin Shvachko commented on HADOOP-2187:
---------------------------------------------
I agree, especially since HDFS already calls getBlockLocations(String src, long offset, long length) inside getHints(). See HADOOP-894.
The return class is called LocatedBlocks. It could be generalized into an interface if that makes sense for other file systems.
FileSystem should return location information with byte ranges
--------------------------------------------------------------
Key: HADOOP-2187
URL: https://issues.apache.org/jira/browse/HADOOP-2187
Project: Hadoop
Issue Type: Improvement
Components: fs
Reporter: Owen O'Malley
Fix For: 0.16.0
The FileSystem interface should provide location information with byte ranges rather than a String[][] of locations. I suggest that we deprecate FileSystem.getFileCacheHints and replace it with:
{code}
abstract public class FileSystem {
  ...
  public static class BlockInformation implements Writable {
    public BlockInformation(long start, String[] locations) {...}
    public String[] getHosts() {...}
    public long getStartingOffset() {...}
  }
  BlockInformation[] getFileLocations(Path f, long start, long length) { ... }
}
{code}
This will allow us to fix FileInputFormat in map/reduce so that it makes one call to the name node per file instead of one per block.
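To illustrate how a caller might use the proposed API, here is a rough, self-contained sketch. It is not Hadoop code: BlockLocationSketch, the in-memory getFileLocations stand-in (which takes a block size and a per-block host table instead of a Path), and the hostsForOffset helper are all hypothetical, and Writable is left out so the example compiles on its own. The idea is that the caller fetches the block list for the whole file once and then resolves hosts for each split offset locally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BlockLocationSketch {
    /** Simplified stand-in for the proposed BlockInformation (Writable omitted). */
    public static final class BlockInformation {
        private final long start;
        private final String[] hosts;

        public BlockInformation(long start, String[] locations) {
            this.start = start;
            this.hosts = locations;
        }

        public String[] getHosts() { return hosts; }

        public long getStartingOffset() { return start; }
    }

    /**
     * Hypothetical in-memory replacement for getFileLocations(Path, long, long):
     * block i covers bytes [i * blockSize, (i + 1) * blockSize) and is stored on
     * hostsPerBlock[i]. Returns every block overlapping [start, start + length).
     */
    public static BlockInformation[] getFileLocations(
            long blockSize, String[][] hostsPerBlock, long start, long length) {
        List<BlockInformation> result = new ArrayList<>();
        long end = start + length;
        for (int i = 0; i < hostsPerBlock.length; i++) {
            long blockStart = i * blockSize;
            long blockEnd = blockStart + blockSize;
            if (blockStart < end && blockEnd > start) {
                result.add(new BlockInformation(blockStart, hostsPerBlock[i]));
            }
        }
        return result.toArray(new BlockInformation[0]);
    }

    /**
     * Finds the hosts of the block containing the given offset, assuming the
     * array is sorted by starting offset. Returns an empty array if the offset
     * lies before the first block.
     */
    public static String[] hostsForOffset(BlockInformation[] blocks, long offset) {
        String[] hosts = new String[0];
        for (BlockInformation block : blocks) {
            if (block.getStartingOffset() > offset) {
                break;
            }
            hosts = block.getHosts();
        }
        return hosts;
    }

    public static void main(String[] args) {
        String[][] hostsPerBlock = {{"a", "b"}, {"b", "c"}, {"c", "a"}};
        // One lookup for the whole 192-byte file (three 64-byte blocks).
        BlockInformation[] blocks = getFileLocations(64L, hostsPerBlock, 0L, 192L);
        // Each split start is then resolved locally, with no further lookups.
        for (long splitStart = 0; splitStart < 192; splitStart += 64) {
            System.out.println(splitStart + " -> "
                    + Arrays.toString(hostsForOffset(blocks, splitStart)));
        }
    }
}
```

With byte ranges in the result, the per-split work becomes a local scan over the returned array, which is what removes the per-block round trip described above.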