Hi,
I am using HBase as the input source for my MapReduce jobs.
I recently found out that TableInputFormat automatically splits the input
table so that each region of the table is assigned to a single map task.
What I want instead is to split the input table so that a user-specified
number of rows is assigned to each Mapper.
For example, if I set a certain parameter to 100, then each Mapper would get
100 rows from the input table.
Is there an existing way to do this kind of splitting?
Or do I have to modify the getSplits() of
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase?
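To make the question concrete, here is a rough sketch of the splitting I have in mind, written in plain Java with no HBase calls at all. The names RowCountSplitter, Range, and splitByRowCount are placeholders I made up, and the String row keys are a simplification (HBase row keys are byte arrays). I am assuming the sorted row keys are already available, and that a real version would turn each range into an input split inside getSplits(); I have not checked whether that is the right approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: groups sorted row keys into
// ranges that each cover at most rowsPerSplit rows.
public class RowCountSplitter {

    // Half-open key range [startRow, endRow).
    // endRow == null means "up to the end of the table".
    public static final class Range {
        public final String startRow;
        public final String endRow;

        Range(String startRow, String endRow) {
            this.startRow = startRow;
            this.endRow = endRow;
        }

        @Override
        public String toString() {
            return "[" + startRow + ", " + (endRow == null ? "end" : endRow) + ")";
        }
    }

    // sortedRowKeys must be in ascending row-key order (the order a scan returns).
    public static List<Range> splitByRowCount(List<String> sortedRowKeys, int rowsPerSplit) {
        if (rowsPerSplit <= 0) {
            throw new IllegalArgumentException("rowsPerSplit must be positive");
        }
        List<Range> ranges = new ArrayList<>();
        for (int i = 0; i < sortedRowKeys.size(); i += rowsPerSplit) {
            int next = i + rowsPerSplit;
            // The exclusive end of this range is the first key of the next range.
            String end = next < sortedRowKeys.size() ? sortedRowKeys.get(next) : null;
            ranges.add(new Range(sortedRowKeys.get(i), end));
        }
        return ranges;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("row1", "row2", "row3", "row4", "row5");
        // With rowsPerSplit = 2, five rows yield three ranges; the last holds one row.
        for (Range r : splitByRowCount(keys, 2)) {
            System.out.println(r);
        }
    }
}
```

With the parameter set to 100, each range would cover 100 rows, except possibly the last one. The obvious cost is that finding the boundary keys needs a pass over the table's row keys before the job starts.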
Any answer or opinion will be much appreciated!!
Ed