Hi,
I have a basic question: how does partitioning work?

Here is a scenario I created to frame my question.

i) A partition function is defined to partition map output based on the
alphabetical order of the key, i.e. a partition for keys starting with 'a',
a partition for keys starting with 'b', ... a partition for keys starting
with 'z'. Does this mean each map may have at most 26 partitions?

ii) What input will a reducer get? Will a reducer get the first partition
(the one for keys starting with 'a') from all the maps as its input? Does
that mean we will need 26 reduce tasks?

Any input, documents, or examples on this are appreciated. I am a bit
confused by this.

Thanks in advance


  • David Rosenstrauch at Jan 18, 2011 at 8:33 pm

    On 01/18/2011 03:09 PM, Mapred Learn wrote:
    > [snip]

    You should probably read the Yahoo tutorial to brush up on the topic
    before asking on the list.

    http://developer.yahoo.com/hadoop/tutorial/module5.html#partitioning

    If you still don't understand after that, and you post a specific
    question (i.e., not "how does partitioning work") I'm sure someone will
    be able to answer.

    DR
  • Steve Lewis at Jan 18, 2011 at 8:33 pm
    1) You need not have 26 reducers, but you do want 26 partitions. You might
    send each key to
    int reducer = character % Math.min(26, nreducers); // this ensures that
    all items with A go to a single reducer, but with fewer than 26 reducers
    some reducers will get other characters as well.
    The scheme is inefficient, as some letters are much more common than
    others, so some reducers will get a lot more data. You also cannot make
    use of more than 26 reducers with this scheme.
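
To make the one-liner above concrete, here is a minimal sketch in plain Java with no Hadoop dependency. The class name `FirstLetterPartitioning` and the method `partitionFor` are made up for illustration. Sending empty keys and keys that do not start with a-z to partition 0 is an assumption, since the thread does not say what to do with them. In a real job the same arithmetic would sit inside a subclass of Hadoop's `Partitioner`, in its `getPartition(key, value, numPartitions)` method.

```java
// Hypothetical helper for illustration; not part of Hadoop.
final class FirstLetterPartitioning {

    private FirstLetterPartitioning() {}

    /**
     * Maps a key to a partition index in [0, min(26, numReducers)).
     * Keys with the same first letter (case-insensitive) always get the
     * same index, so they all end up at the same reducer.
     */
    static int partitionFor(String key, int numReducers) {
        if (numReducers < 1) {
            throw new IllegalArgumentException("numReducers must be >= 1");
        }
        int partitions = Math.min(26, numReducers);
        if (key == null || key.isEmpty()) {
            return 0; // assumption: empty keys go to partition 0
        }
        char first = Character.toLowerCase(key.charAt(0));
        if (first < 'a' || first > 'z') {
            return 0; // assumption: non-letter keys go to partition 0
        }
        // 'a' -> 0, 'b' -> 1, ... 'z' -> 25, then wrapped onto the
        // available reducers when there are fewer than 26 of them.
        return (first - 'a') % partitions;
    }
}
```

With 26 reducers each letter gets its own partition. With 4 reducers, 'b' (index 1) and 'z' (index 25, and 25 % 4 = 1) land on the same reducer. With 100 reducers, the result never exceeds 25, so reducers 26 to 99 receive nothing.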

    It is a good idea to think carefully about why you want all items with A
    to go to a single reducer.
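
On what a reducer actually receives: in Hadoop MapReduce, reducer i is fed partition i from every map task, merged together. Below is a toy simulation of that routing, again in plain Java without Hadoop. `ShuffleSketch`, `partitionFor` and `shuffle` are hypothetical names, and keys are assumed to be non-empty and to start with an ASCII letter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the shuffle for illustration only; not Hadoop code.
final class ShuffleSketch {

    private ShuffleSketch() {}

    // First-letter partitioning; assumes the key starts with an ASCII letter.
    static int partitionFor(String key, int numReducers) {
        int letterIndex = Character.toLowerCase(key.charAt(0)) - 'a';
        return Math.floorMod(letterIndex, Math.min(26, numReducers));
    }

    /**
     * Takes the output keys of several map tasks and returns, for each
     * reducer index that receives data, all keys routed to it.
     */
    static Map<Integer, List<String>> shuffle(List<List<String>> mapOutputs,
                                              int numReducers) {
        Map<Integer, List<String>> reducerInputs = new TreeMap<>();
        for (List<String> oneMapOutput : mapOutputs) {
            for (String key : oneMapOutput) {
                int reducer = partitionFor(key, numReducers);
                reducerInputs
                        .computeIfAbsent(reducer, r -> new ArrayList<>())
                        .add(key);
            }
        }
        // Hadoop sorts each reducer's input by key; sorting here mimics that.
        for (List<String> keys : reducerInputs.values()) {
            Collections.sort(keys);
        }
        return reducerInputs;
    }
}
```

For two maps emitting `apple, bear` and `ant, zebra`, 26 reducers give reducer 0 the keys `ant, apple` (the 'a' partition of both maps), reducer 1 `bear`, and reducer 25 `zebra`. The other reducers get no input. With only 2 reducers, reducer 1 receives both `bear` and `zebra`.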

    --
    Steven M. Lewis PhD
    4221 105th Ave Ne
    Kirkland, WA 98033
    206-384-1340 (cell)
    Institute for Systems Biology
    Seattle WA

Discussion Overview
group: mapreduce-user @
categories: hadoop
posted: Jan 18, '11 at 8:25p
active: Jan 18, '11 at 8:33p
posts: 3
users: 3
website: hadoop.apache.org...
irc: #hadoop
