FAQ
I've 2 questions:
1) how to raise the number of reducers?
2) why are there only 2 bucket files per partition even though I
specified 32 buckets?


I've set the following and don't see an increase in the number of reducers.
set hive.exec.reducers.max=32;
set mapred.reduce.tasks=32;
Could this be because the jobs are too small?

I have a feeling that this is why I have only 2 bucket
files in each partition, in spite of specifying 32 buckets.

-Ajo.


  • Edward Capriolo at Jan 19, 2011 at 4:05 pm

    I have never tried it, but you should use:

    set hive.enforce.bucketing = true;

    The number of reducers must equal the number of buckets. This is
    described in the language manual.

    http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL/BucketedTables
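
    A minimal sketch of how that setting is typically used when loading a
    bucketed table. The table names (events_raw, events_bucketed), the
    columns, and the partition value are hypothetical and not taken from
    this thread; only hive.enforce.bucketing and the 32-bucket count come
    from the discussion.

    ```sql
    -- Hypothetical table: 32 buckets per partition, bucketed on user_id.
    CREATE TABLE events_bucketed (user_id BIGINT, payload STRING)
    PARTITIONED BY (ds STRING)
    CLUSTERED BY (user_id) INTO 32 BUCKETS;

    -- Ask Hive to derive the reducer count from the table's bucket count.
    SET hive.enforce.bucketing = true;

    -- Populate one partition from a hypothetical unbucketed source table.
    INSERT OVERWRITE TABLE events_bucketed PARTITION (ds = '2011-01-19')
    SELECT user_id, payload
    FROM events_raw
    WHERE ds = '2011-01-19';
    ```

    With the setting enabled, the insert is expected to run one reducer per
    bucket, so each partition should end up with 32 files. Whether it does
    in a given setup is worth checking, since the poster above had not
    tried it.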
  • Ajo Fod at Jan 19, 2011 at 5:01 pm
    The wiki probably needs to be fixed.
    To get 32 buckets, I need to set the following flags:
    set hive.merge.mapfiles = false;
    set mapred.map.tasks=32;
    The mapred.reduce.tasks setting is irrelevant here.

    Ideally, the query mechanism would set this automatically!

    Cheers,
    -Ajo
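
    As a rough sketch, the flags above could be applied in one session and
    the result checked by listing the partition directory. The table names,
    the partition value, and the warehouse path are assumptions for
    illustration (the path shown is only the usual default location); the
    target table is assumed to have been declared with
    CLUSTERED BY ... INTO 32 BUCKETS.

    ```sql
    -- Settings reported in this thread as producing 32 bucket files.
    SET hive.merge.mapfiles = false;
    SET mapred.map.tasks = 32;

    -- Hypothetical source and target tables.
    INSERT OVERWRITE TABLE events_bucketed PARTITION (ds = '2011-01-19')
    SELECT user_id, payload
    FROM events_raw
    WHERE ds = '2011-01-19';

    -- List the files written for the partition (path is an assumption).
    dfs -ls /user/hive/warehouse/events_bucketed/ds=2011-01-19;
    ```

    If the listing shows fewer files than the declared bucket count, the
    data was not bucketed as declared.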
  • Edward Capriolo at Jan 19, 2011 at 6:34 pm

    Feel free to update the wiki with a note that merging map files and
    map-only jobs may bucket incorrectly.

Discussion Overview
group: user @
categories: hive, hadoop
posted: Jan 19, '11 at 3:49p
active: Jan 19, '11 at 6:34p
posts: 4
users: 2
website: hive.apache.org