Owen O'Malley commented on HADOOP-2027:
---------------------------------------
Note that we also need Map/Reduce to use the new method so that it makes only one call per file to get block sizes. This would require FileSplit to have a new constructor that takes an array of locations rather than computing them on demand. The locations do NOT need to be serialized in the read/write fields methods. FileInputFormat should use a single call to getFileLocations rather than the current getSize, getBlockSize, and getFileCacheHints calls (down in FileSplit).
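A minimal sketch of what such a FileSplit constructor might look like, with the locations supplied up front and deliberately excluded from write()/readFields(). All names and types here are illustrative (a plain String stands in for Path, and the Writable interface is not implemented), not the actual Hadoop API:

```java
import java.io.*;

// Hypothetical sketch: a split that accepts precomputed locations in its
// constructor instead of calling back to the FileSystem on demand.
class FileSplitSketch {
    private String file;    // path, simplified to a String here
    private long start;
    private long length;
    private String[] hosts; // locations: held in memory, never serialized

    FileSplitSketch(String file, long start, long length, String[] hosts) {
        this.file = file;
        this.start = start;
        this.length = length;
        this.hosts = hosts; // supplied up front, no extra call per split
    }

    String[] getLocations() {
        // Returns the precomputed hosts; empty after deserialization.
        return hosts == null ? new String[0] : hosts;
    }

    // Serialization omits the hosts: they are only needed for scheduling
    // on the submitting side, so there is no reason to ship them to tasks.
    void write(DataOutput out) throws IOException {
        out.writeUTF(file);
        out.writeLong(start);
        out.writeLong(length);
    }

    void readFields(DataInput in) throws IOException {
        file = in.readUTF();
        start = in.readLong();
        length = in.readLong();
        hosts = null; // intentionally not restored
    }
}
```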
FileSystem should provide byte ranges for file locations
--------------------------------------------------------
Key: HADOOP-2027
URL: https://issues.apache.org/jira/browse/HADOOP-2027
Project: Hadoop Core
Issue Type: Bug
Components: fs
Reporter: Owen O'Malley
Assignee: lohit vijayarenu
FileSystem's getFileCacheHints should be replaced with something more useful. I'd suggest replacing getFileCacheHints with a new method:
{code}
BlockLocation[] getFileLocations(Path file, long offset, long range) throws IOException;
{code}
and adding
{code}
class BlockLocation implements Writable {
  String[] getHosts();
  long getOffset();
  long getLength();
}
{code}
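To make the intent of the (offset, range) parameters concrete, here is a self-contained sketch of the proposed BlockLocation as a plain class (the real one would implement Writable) plus a helper that filters a single getFileLocations-style result down to the blocks overlapping a byte range. Class and method names beyond the proposal above are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposed BlockLocation; Writable omitted for brevity.
class BlockLocation {
    private final String[] hosts;
    private final long offset;
    private final long length;

    BlockLocation(String[] hosts, long offset, long length) {
        this.hosts = hosts;
        this.offset = offset;
        this.length = length;
    }

    String[] getHosts() { return hosts; }
    long getOffset()    { return offset; }
    long getLength()    { return length; }
}

public class BlockRangeDemo {
    // Return the locations whose block overlaps [offset, offset + range).
    static List<BlockLocation> locationsInRange(BlockLocation[] all,
                                                long offset, long range) {
        List<BlockLocation> hits = new ArrayList<>();
        for (BlockLocation b : all) {
            if (b.getOffset() < offset + range
                    && b.getOffset() + b.getLength() > offset) {
                hits.add(b);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Two 64 MB blocks of a hypothetical file.
        BlockLocation[] blocks = {
            new BlockLocation(new String[]{"hostA", "hostB"}, 0L, 67108864L),
            new BlockLocation(new String[]{"hostC", "hostD"}, 67108864L, 67108864L),
        };
        // A range straddling the block boundary overlaps both blocks.
        System.out.println(locationsInRange(blocks, 67000000L, 2000000L).size()); // prints 2
    }
}
```

The point of the byte-range form is that one call returns everything a caller like FileInputFormat needs, instead of separate getSize, getBlockSize, and getFileCacheHints calls.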