I have 2 questions:
1) Is a SequenceFile more efficient than a text file as input? I think text files are processed by TextInputFormat into SequenceFiles inside Hadoop. If so, would SequenceFiles (i.e. binary input files) be more efficient?
2) If I decide to use SequenceFiles as the input format, do I need to stick to the header format defined in http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.html?