Hi,

I am using HBase as the input source for my MapReduce jobs.

I recently found out that TableInputFormat automatically splits the input
table so that each region of the table is assigned to a single map task.

What I want instead is to split the input table so that each Mapper is
assigned a user-specified number of rows.

For example, if I set a certain parameter to 100, then each Mapper would get
100 rows from the input table.

Is there a built-in way to do this?
Or do I have to modify getSplits() in
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase?

Any answer or opinion will be much appreciated!

Ed
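
If overriding getSplits() turns out to be the route, the boundary calculation
could look roughly like the sketch below. It is plain Java with no HBase
dependency, and everything in it is an assumption made for illustration: the
class name RowCountSplits, the Range holder, and the use of String row keys
(HBase itself works with byte[] keys). It also assumes the sorted row keys
have already been collected, for example by a key-only scan, which costs an
extra pass over the table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: groups sorted row keys into
// key ranges that each cover at most rowsPerSplit rows.
final class RowCountSplits {

    // Half-open key range [startRow, endRow). An empty endRow stands for
    // "up to the end of the table", mirroring the HBase scan convention.
    static final class Range {
        final String startRow;
        final String endRow;

        Range(String startRow, String endRow) {
            this.startRow = startRow;
            this.endRow = endRow;
        }

        @Override
        public String toString() {
            return "[" + startRow + ", " + endRow + ")";
        }
    }

    // sortedRowKeys must be in the same order the table stores them.
    static List<Range> split(List<String> sortedRowKeys, int rowsPerSplit) {
        if (rowsPerSplit <= 0) {
            throw new IllegalArgumentException("rowsPerSplit must be positive");
        }
        List<Range> ranges = new ArrayList<>();
        for (int i = 0; i < sortedRowKeys.size(); i += rowsPerSplit) {
            String start = sortedRowKeys.get(i);
            int next = i + rowsPerSplit;
            // The next chunk's first key is this chunk's exclusive end;
            // the last chunk is left open-ended.
            String end = next < sortedRowKeys.size() ? sortedRowKeys.get(next) : "";
            ranges.add(new Range(start, end));
        }
        return ranges;
    }
}
```

Calling RowCountSplits.split(keys, 100) would give one range per 100 rows.
Inside an overridden getSplits(), each range could then be turned into one
input split, so that each Mapper scans only its own range. Note that a range
built this way may cross a region boundary, in which case that Mapper loses
some data locality compared with the default one-split-per-region behaviour.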

Discussion Overview
group: common-user @
categories: hadoop
posted: Jun 5, '11 at 11:04a
posted by: Edward Choi