Posted Apr 3, 2011 at 6:03 am
On Sun, Apr 3, 2011 at 6:49 AM, maha wrote:
My job is for a similarity search application, but my aim for now is to measure the I/O overhead when my mapper.map() opens a sequence file and reads it record by record with:
I want to make sure that "next" here is I/O efficient. Otherwise, I will need to write it myself so that it reads a block at a time and then parses the block in my program using the "sync" hints.
You could have a look at the source code of the SequenceFile.Reader
class; it should clear up the doubts you're having.
What parameter is used for the buffer size?
Records are not loaded into memory. They are read using the
key/value size information off the buffered input stream.
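
To illustrate the general idea of reading records by their size fields off a buffered stream, here is a minimal sketch that uses only java.io. It is not Hadoop's code and does not reproduce the real SequenceFile on-disk format: the class name LengthPrefixedDemo, its two methods, and the record layout (key length, value length, key bytes, value bytes) are assumptions made up for this example. The point is only that each call pulls one record's worth of bytes through a buffer of a chosen size, rather than loading the whole file.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Hadoop and not the SequenceFile format.
final class LengthPrefixedDemo {

    // Writes each record as: key length (int), value length (int), key bytes, value bytes.
    static byte[] writeRecords(String[][] records) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String[] record : records) {
                byte[] key = record[0].getBytes(StandardCharsets.UTF_8);
                byte[] value = record[1].getBytes(StandardCharsets.UTF_8);
                out.writeInt(key.length);
                out.writeInt(value.length);
                out.write(key);
                out.write(value);
            }
        }
        return bytes.toByteArray();
    }

    // Reads records one at a time through a buffer of bufferSize bytes.
    // Only the buffer and the current record are held while reading;
    // the result list exists just so the demo can show what was read.
    static List<String> readRecords(byte[] data, int bufferSize) throws IOException {
        List<String> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new ByteArrayInputStream(data), bufferSize))) {
            while (true) {
                int keyLength;
                try {
                    keyLength = in.readInt();   // size field tells us how much to read next
                } catch (EOFException endOfData) {
                    break;                      // no more records
                }
                int valueLength = in.readInt();
                byte[] key = new byte[keyLength];
                byte[] value = new byte[valueLength];
                in.readFully(key);
                in.readFully(value);
                result.add(new String(key, StandardCharsets.UTF_8) + "="
                        + new String(value, StandardCharsets.UTF_8));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writeRecords(new String[][] {{"k1", "v1"}, {"k2", "value2"}});
        // A deliberately tiny buffer, to show that records may span buffer refills.
        for (String record : readRecords(data, 8)) {
            System.out.println(record);
        }
    }
}
```

In this sketch the buffer size only changes how often the underlying stream is asked for more bytes; the records returned are the same for any positive size.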
You can specify a buffer size while constructing a Reader object for
SequenceFiles, or the "io.file.buffer.size" value is used as a