Hello,

In my application I need to further reduce the keys output by the original
reducer.

I was reading about ChainReducer and ChainMapper, but it looks like they
only support:
one or more mappers -> reducer -> zero or more mappers

I need something like:
one or more mappers -> reducer -> reducer

Please help me figure out the best way to achieve this. Currently, the only
option seems to be to write another MapReduce application and run it
separately after the first one. In this second application, the mapper
would be a dummy that does nothing, and the reducer would further combine
the outputs of the first run.

Any other comments, such as that this is not good programming practice, are
welcome, so that I know if I am heading in the wrong direction.
--
View this message in context: http://www.nabble.com/Can-I-have-multiple-reducers--tp26018722p26018722.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.

  • Aaron Kimball at Oct 23, 2009 at 2:55 am
    If you need another shuffle after your first reduce pass, then you need a
    second MapReduce job to run after the first one. Just use an IdentityMapper.

    This is a reasonably common situation.
    - Aaron
    (In reply to Forhadoop's message of Thu, Oct 22, 2009 at 4:17 PM)
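
Aaron's suggestion can be illustrated with a small in-memory simulation in plain Python. This is only a sketch of the data flow, not Hadoop code: `run_job`, `identity_mapper`, and the word-count example are hypothetical names chosen for illustration. The first job's reducer emits a new key (the count), so a second job with an identity mapper is used to shuffle and reduce on that new key.

```python
from collections import defaultdict


def run_job(records, mapper, reducer):
    """Simulate one MapReduce job in memory: map, shuffle by key, reduce."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)  # shuffle: group values by key
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Job 1 (hypothetical example): count how often each word occurs.
def word_mapper(_offset, line):
    for word in line.split():
        yield word, 1


def count_reducer(word, ones):
    # Emits the count as the NEW key, which needs another shuffle to group on.
    yield sum(ones), word


# Job 2: the mapper passes records through unchanged; only the second
# shuffle and reduce do real work.
def identity_mapper(key, value):
    yield key, value


def group_reducer(count, words):
    yield count, sorted(words)


lines = [(0, "a b a"), (1, "b c")]
job1_output = run_job(lines, word_mapper, count_reducer)
job2_output = run_job(job1_output, identity_mapper, group_reducer)

print(job1_output)  # (count, word) pairs from the first reduce
print(job2_output)  # words grouped by count after the second reduce
```

The second call to `run_job` reads the first job's output as its input, which mirrors running a second MapReduce job after the first one finishes.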
  • Amandeep Khurana at Oct 23, 2009 at 2:59 am
    If you haven't already done so, you can also explore using combiners.
    I'm not sure that will solve your problem, though, since a combiner
    won't aggregate all the (k, v) pairs for a given key k in one place...

    --


    Amandeep Khurana
    Computer Science Graduate Student
    University of California, Santa Cruz
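
Amandeep's caveat about combiners can be seen in a similar plain-Python sketch. Again this is a simulation with hypothetical names (`map_task`, `combine`, `shuffle_and_reduce`), not the Hadoop API. A combiner only merges values within one map task's output, so the same key can still appear in several partial results, and a shuffle and reduce step is still needed to bring them together.

```python
from collections import defaultdict


def combine(pairs):
    """Sum values per key within ONE map task's output (local aggregation)."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


def map_task(lines):
    """Hypothetical map task: emit (word, 1) pairs, then combine locally."""
    pairs = [(word, 1) for line in lines for word in line.split()]
    return combine(pairs)


def shuffle_and_reduce(task_outputs):
    """Group the partial sums from all map tasks by key and add them up."""
    groups = defaultdict(list)
    for task_output in task_outputs:
        for key, value in task_output:
            groups[key].append(value)
    return sorted((key, sum(values)) for key, values in groups.items())


# Two input splits, each handled by its own map task.
partials = [map_task(["a b a"]), map_task(["a c"])]
totals = shuffle_and_reduce(partials)

print(partials)  # key "a" shows up in both partial results
print(totals)    # only the reduce step sees all values for "a"
```

Here the combiner shrinks each map task's output, but the final per-key totals only exist after the reduce step, which is why a combiner does not replace a second reduce pass.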
  • Amogh Vasekar at Oct 23, 2009 at 5:20 pm
    Hi,
    On what parameters does the output key of your (first) reducer depend?

    Amogh

