Hello guys,

I have a problem with table split generation for a MapReduce job running
on an HBase table. By default, each table split corresponds to one region
and has a startRow, an endRow and a regionLocation.
What happens if I want to create a split that contains a region plus some
rows from the next one? (I have a user whose data spans 2 regions, and I
want to process all of that user's rows in the order in which they are
stored in HBase, which is why I want them to be in the same split for
MapReduce.)

So, can I create a TableSplit like that? What happens if the 2 regions are
on different region servers (the split has only one regionLocation field)?

Best Regards,
--
Lucian

  • Lucian Iordache at Jul 4, 2011 at 3:37 pm
    Hi,

    I now understand that the regionLocation is only used by the JobTracker to
    decide on which region server's node the task should run for the best data
    locality. So in my case it should not be a problem to use the location
    of the region that holds most of the split's data.

    So the problem is solved!
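
For anyone hitting the same issue, here is a minimal sketch of the boundary adjustment, not the actual HBase code. It assumes row keys of the form `<userId>|<suffix>` with ASCII characters only, and it uses a hypothetical `Split` class as a plain-Java stand-in for TableSplit; `alignToUsers`, `endOfUser` and the `|` delimiter are likewise illustrative names and choices. Each region boundary is pushed forward to just past the last possible row of the user it cuts through, and each split keeps the location of the region that holds most of its rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UserAlignedSplits {
    // Assumed delimiter between the user id and the rest of the row key.
    static final char DELIM = '|';

    /** Hypothetical stand-in for TableSplit: rows in [startRow, endRow) plus a locality hint. */
    static final class Split {
        final String startRow;
        final String endRow;   // "" means "until the end of the table"
        final String location; // only a scheduling hint, not a constraint
        Split(String startRow, String endRow, String location) {
            this.startRow = startRow;
            this.endRow = endRow;
            this.location = location;
        }
        @Override public String toString() {
            return "[" + startRow + ", " + endRow + ") @ " + location;
        }
    }

    /** Smallest key that sorts after every row of the user owning boundaryKey. */
    static String endOfUser(String boundaryKey) {
        int i = boundaryKey.indexOf(DELIM);
        if (i < 0) return boundaryKey; // no user prefix: leave the boundary as it is
        return boundaryKey.substring(0, i) + (char) (DELIM + 1);
    }

    /** Shifts region boundaries so that no user's rows are divided between two splits. */
    static List<Split> alignToUsers(List<Split> regions) {
        List<Split> out = new ArrayList<>();
        String start = regions.isEmpty() ? "" : regions.get(0).startRow;
        for (int i = 0; i < regions.size(); i++) {
            Split r = regions.get(i);
            boolean last = (i == regions.size() - 1);
            String end = last ? r.endRow : endOfUser(r.endRow);
            // A user spanning three or more regions can swallow a whole region;
            // skip the region whose remaining range would be empty.
            if (!end.isEmpty() && start.compareTo(end) >= 0) continue;
            // Keep the original region's location: most of the rows still live there.
            out.add(new Split(start, end, r.location));
            start = end;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Split> regions = Arrays.asList(
            new Split("", "user42|0005", "rs1"),
            new Split("user42|0005", "user77|0001", "rs2"),
            new Split("user77|0001", "", "rs3"));
        for (Split s : alignToUsers(regions)) {
            System.out.println(s);
        }
    }
}
```

In a real job this logic would presumably go into a custom input format that overrides getSplits and builds TableSplit objects from the adjusted byte[] boundaries. The task for a shifted split then reads a few rows remotely from the neighbouring region, which costs some locality but is otherwise harmless.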

    Regards,
    Lucian

Discussion Overview
group: mapreduce-user @
category: hadoop
posted: Jul 4, '11 at 2:18p
active: Jul 4, '11 at 3:37p
posts: 2
users: 1
website: hadoop.apache.org...
irc: #hadoop