Grokbase Groups Hive user June 2010
I would like to control the maximum number of reducers a Hive query has
access to. I have seen cases of Hive using up to 999 reducers, which seems
inefficient (starting and stopping individual reducers), and I'd also like
to cap the resources Hive uses on the cluster. (I'm also investigating the fair
scheduler, which hopefully works well?)

I haven't seen any conclusive settings in the documentation, so what options
are there to throttle Hive (using 0.4 at the moment)? hive.exec.reducers.max
is mentioned in a JIRA item, but not in the Hive documentation. Does it work?

Thanks.


  • Edward Capriolo at Jun 28, 2010 at 8:12 pm

    On Mon, Jun 28, 2010 at 9:07 AM, Scott Whitecross wrote:
    I would like to control the maximum number of reducers a Hive query has
    access to.  I have seen cases of Hive using up to 999 reducers, which seems
    inefficient (starting and stopping individual reducers), and I'd also like
    to cap the resources Hive uses on the cluster.  (Investigating the fair
    scheduler as well, which hopefully works well?)
    I haven't seen any conclusive settings in the documentation, so what options
    are there to throttle Hive (using 0.4 at the moment)?  hive.exec.reducers.max
    is mentioned in a JIRA item, but not in the Hive documentation.  Does it work?
    Thanks.

    Scott,

    Where did you find Hive 4, in someone's attic? Just kidding. It is not
    that old in terms of years, but it is old in terms of releases.
    Upgrade to the latest release, 5.1; 6.0 is nearing release as well, so you
    may want to wait or run trunk.
    You can see all the available configuration options in hive-default.xml.

    Here are the relevant values (some or most of these features may not be in Hive 4):

    <property>
    <name>hive.exec.reducers.bytes.per.reducer</name>
    <value>1000000000</value>
    <description>size per reducer. The default is 1G, i.e. if the input
    size is 10G, it will use 10 reducers.</description>
    </property>

    <property>
    <name>hive.exec.reducers.max</name>
    <value>999</value>
    <description>max number of reducers will be used. If the one
    specified in the configuration parameter mapred.reduce.tasks is
    negative, hive will use this one as the max number of reducers when
    automatically determine number of reducers.</description>
    </property>
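
As a rough illustration of how the two reducer settings above interact, here is a minimal Python sketch of the arithmetic the descriptions imply. It is not Hive's actual code: the function name `estimate_reducers` is hypothetical, and both the rounding up and the floor of one reducer are assumptions not stated in the descriptions.

```python
def estimate_reducers(input_bytes,
                      bytes_per_reducer=1_000_000_000,  # hive.exec.reducers.bytes.per.reducer
                      max_reducers=999,                 # hive.exec.reducers.max
                      mapred_reduce_tasks=-1):          # mapred.reduce.tasks
    """Hypothetical helper mirroring the property descriptions above."""
    # A non-negative mapred.reduce.tasks is an explicit request, so the
    # automatic estimate (and the cap) is not used.
    if mapred_reduce_tasks >= 0:
        return mapred_reduce_tasks
    # Integer ceiling division; rounding up is an assumption of this sketch.
    wanted = (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    # At least one reducer (assumption), and never more than the cap.
    return min(max_reducers, max(1, wanted))


if __name__ == "__main__":
    print(estimate_reducers(10_000_000_000))                      # 10 GB of input
    print(estimate_reducers(5_000_000_000_000))                   # 5 TB, default cap
    print(estimate_reducers(5_000_000_000_000, max_reducers=50))  # 5 TB, lowered cap
```

Under these assumptions, lowering hive.exec.reducers.max is what bounds the reducer count for large inputs, while raising hive.exec.reducers.bytes.per.reducer lowers it for inputs of any size.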



    <property>
    <name>hive.merge.size.per.task</name>
    <value>256000000</value>
    <description>Size of merged files at the end of the job</description>
    </property>

    <property>
    <name>hive.merge.smallfiles.avgsize</name>
    <value>16000000</value>
    <description>When the average output file size of a job is less than
    this number, Hive will start an additional map-reduce job to merge the
    output files into bigger files. This is only done for map-only jobs
    if hive.merge.mapfiles is true, and for map-reduce jobs if
    hive.merge.mapredfiles is true.</description>
    </property>


    <property>
    <name>hive.merge.mapfiles</name>
    <value>true</value>
    <description>Merge small files at the end of a map-only job</description>
    </property>

    <property>
    <name>hive.merge.mapredfiles</name>
    <value>false</value>
    <description>Merge small files at the end of a map-reduce job</description>
    </property>
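
The merge settings above can be read as a simple decision rule. The Python sketch below restates that rule using only what the descriptions say. The function name `should_merge` and its parameters are hypothetical, and treating an empty output as "no merge" is an assumption.

```python
def should_merge(output_file_sizes, is_map_only,
                 merge_mapfiles=True,            # hive.merge.mapfiles
                 merge_mapredfiles=False,        # hive.merge.mapredfiles
                 avg_size_threshold=16_000_000): # small-files average-size threshold
    """Hypothetical restatement of the merge rule in the descriptions above."""
    if not output_file_sizes:
        return False  # nothing to merge (assumption of this sketch)
    # Map-only jobs are governed by one flag, map-reduce jobs by the other.
    enabled = merge_mapfiles if is_map_only else merge_mapredfiles
    if not enabled:
        return False
    # An extra merge job is started only when the average output file is small.
    average = sum(output_file_sizes) / len(output_file_sizes)
    return average < avg_size_threshold


if __name__ == "__main__":
    small_files = [1_000_000] * 20
    print(should_merge(small_files, is_map_only=True))
    print(should_merge(small_files, is_map_only=False))
```

With the defaults listed above, the second call reports no merge, because hive.merge.mapredfiles is false, so a map-reduce job with many small output files is left as-is.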



    Other performance-related settings:

    <property>
    <name>hive.exec.compress.intermediate</name>
    <value>false</value>
    <description> This controls whether intermediate files produced by
    hive between multiple map-reduce jobs are compressed. The compression
    codec and other options are determined from hadoop config variables
    mapred.output.compress* </description>
    </property>


    <property>
    <name>hive.exec.parallel</name>
    <value>false</value>
    <description>Whether to execute jobs in parallel</description>
    </property>

    Regards,
    Edward

Discussion Overview
group: user @
categories: hive, hadoop
posted: Jun 28, '10 at 1:08p
active: Jun 28, '10 at 8:12p
posts: 2
users: 2
website: hive.apache.org
