Best Idea to deal with following situation
Hello everyone,

I have a job whose result has only 5 keys, but each key has a long list of
values (in the 100,000s).
What is the best way to deal with this? I suspect a few of my reducers get
overloaded because two or more keys go to the same reducer, which leaves
those reducers with a lot of work to do.

So what is the best way out of this situation?

Pankil


  • Chandraprakash Bhagtani at Sep 29, 2009 at 6:56 am
    You can write your own custom partitioner instead of using the default hash partitioner.

    --
    Thanks & Regards,
    Chandra Prakash Bhagtani,
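
To illustrate the custom partitioner suggestion, here is a minimal plain-Java sketch of the decision such a partitioner would make. It is not Hadoop code: in a real job, logic like this would presumably live inside an override of `Partitioner.getPartition(key, value, numPartitions)`, and the job would need at least as many reduce tasks as keys. The class name `ExplicitKeyPartitioner`, the `knownKeys` list, and the key names `k1`..`k5` are hypothetical, and `String` stands in for whatever key type the job uses. The `hashPartition` method mirrors how hash partitioning picks a reducer, which is why two of five keys can land on the same one.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a plain-Java stand-in for the body of a custom partitioner.
class ExplicitKeyPartitioner {
    // Hypothetical fixed assignment: one slot per known key, in list order.
    private final Map<String, Integer> slotByKey = new LinkedHashMap<>();

    ExplicitKeyPartitioner(List<String> knownKeys) {
        for (String key : knownKeys) {
            slotByKey.putIfAbsent(key, slotByKey.size());
        }
    }

    // Hash-style partitioning: distinct keys may collide on one reducer.
    static int hashPartition(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Explicit partitioning: each known key gets its own reducer when
    // numPartitions >= number of known keys; unknown keys fall back to hashing.
    int getPartition(String key, int numPartitions) {
        Integer slot = slotByKey.get(key);
        if (slot == null) {
            return hashPartition(key, numPartitions);
        }
        return slot % numPartitions;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("k1", "k2", "k3", "k4", "k5");
        ExplicitKeyPartitioner partitioner = new ExplicitKeyPartitioner(keys);
        for (String key : keys) {
            System.out.println(key
                + " hash=" + hashPartition(key, keys.size())
                + " explicit=" + partitioner.getPartition(key, keys.size()));
        }
    }
}
```

With five reducers, the explicit mapping sends each of the five keys to a different reducer. It does not help if a single key's value list is by itself too large for one reducer.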
  • Amogh Vasekar at Sep 29, 2009 at 10:57 am
    Along with the partitioner, try plugging in a combiner. It would provide significant performance gains by reducing the amount of data sent to the reducers. I'm not sure which algorithm you use, but you might have to tweak it a little to make a combiner possible.

    Thanks,
    Amogh
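
As a rough illustration of the combiner idea, the plain-Java sketch below simulates two map tasks whose output is pre-aggregated locally before being handed to the reduce step. It is not Hadoop code, and it assumes the per-key computation is a sum, which the original post does not state. A combiner is only safe when the operation is associative and commutative, which is presumably the kind of tweak to the algorithm mentioned above. The names `CombinerSketch`, `sumByKey`, and `rec` are hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: simulates map-side combining with plain collections.
class CombinerSketch {
    // Sums values per key. Because addition is associative and commutative,
    // the same function can serve as both the combine step and the reduce step.
    static Map<String, Long> sumByKey(List<Map.Entry<String, Long>> records) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> record : records) {
            totals.merge(record.getKey(), record.getValue(), Long::sum);
        }
        return totals;
    }

    // Small helper to build a (key, value) record.
    static Map.Entry<String, Long> rec(String key, long value) {
        return new AbstractMap.SimpleEntry<String, Long>(key, value);
    }

    public static void main(String[] args) {
        // Hypothetical output of two separate map tasks.
        List<Map.Entry<String, Long>> mapTask1 = Arrays.asList(rec("a", 1), rec("a", 2), rec("b", 5));
        List<Map.Entry<String, Long>> mapTask2 = Arrays.asList(rec("a", 4), rec("b", 1));

        // Combine step: each map task collapses its own output to one record per key.
        List<Map.Entry<String, Long>> shuffled = new ArrayList<>();
        shuffled.addAll(sumByKey(mapTask1).entrySet());
        shuffled.addAll(sumByKey(mapTask2).entrySet());

        // Reduce step: merge the partial sums from all map tasks.
        System.out.println("records sent to reducers: " + shuffled.size());
        System.out.println("final totals: " + sumByKey(shuffled));
    }
}
```

With only 5 keys, each map task would send at most 5 combined records to the reducers instead of its full output, so the long per-key value lists shrink before the shuffle.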


Discussion Overview
group: common-user @
categories: hadoop
posted: Sep 26, '09 at 12:49a
active: Sep 29, '09 at 10:57a
posts: 3
users: 3
website: hadoop.apache.org...
irc: #hadoop