Hive user mailing list, July 2010
We are working with a trunk build of Hive, hive-0.6.0-957988.

Dynamic partitions are working for us in testing; we tested with about
100 dynamic partitions.

For our production run we have about 1000 distinct offer_id values.

HQL="set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partition.pernode=200000;
set hive.exec.max.dynamic.partitions=20000;
insert overwrite table bco PARTITION (dt=20100721,offer)
SELECT * FROM (
SELECT browser_id, country, dma_code,
offer_id from baseline_raw where
gen_date=20100721 and blt='bco' DISTRIBUTE BY offer_id
) X;
"

2010-07-22 22:36:21,499 Stage-1 map = 100%, reduce = 50%[Fatal Error]
Operator FS_6 (id=6): Number of dynamic partitions exceeded
hive.exec.max.dynamic.partitions.pernode.. Killing the job.

As you can see, we always die in the reducer with the same exception.
Is it possible that hive.exec.max.dynamic.partitions is not being used
in the reducer code?

Thanks,
Edward


  • Ning Zhang at Jul 26, 2010 at 4:07 pm
    The fatal error was thrown because the number of dynamic partitions exceeded hive.exec.max.dynamic.partitions.pernode, which was still at its default of 100. There is a typo (a missing 's' in 'partitions') in the tutorial (sorry about that), and the query repeats it: the set command writes hive.exec.max.dynamic.partition.pernode, so the intended limit was never actually raised. If you correct the typo in your query, it should work; the corrected settings are sketched just below.
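
    For concreteness, a corrected version of the set commands from the original query (same values as the post; only the misspelled limit parameter changes, and note that hive.exec.dynamic.partition.mode really is singular):

    set hive.exec.dynamic.partition.mode=nonstrict;
    set hive.exec.max.dynamic.partitions=20000;
    set hive.exec.max.dynamic.partitions.pernode=200000;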

    One thing to watch out for is that if you increase the parameter to a large value, it can cause unexpected HDFS errors. The reason is that for each dynamic partition we need to open at least one file, and as long as a file is open, one connection is held to one of the HDFS data nodes. There is a limit on the number of simultaneous connections any data node will accept (configurable, but the default is 256), so with on the order of 1000 dynamic partitions the writers can easily push a data node past that limit. You might want to increase that HDFS parameter as well.

    Ning
  • Edward Capriolo at Jul 26, 2010 at 7:40 pm

    Ning,

    Thank you for the advice. You were right on both counts. First, we
    had not pluralized the parameter name correctly. Second, once we got
    the name right, our datanodes began failing with "xceiverCount 258
    exceeds the limit of concurrent xcievers 256".

    For people who end up following in my "large number of dynamic
    partitions" footsteps, the property you need to set on all the
    datanodes is:

    <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
    </property>
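
    (This goes in hdfs-site.xml on each datanode, and the datanodes have
    to be restarted to pick it up. The property name preserves HDFS's
    historical misspelling of "xceivers".)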


    Thanks again, Ning!
    Dynamic partitions are a very, very exciting feature!

    Edward
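
A minimal end-to-end sketch of the pattern from this thread, for reference. The table layout is inferred from the SELECT list and is illustrative only; depending on the Hive build you may also need "set hive.exec.dynamic.partition=true;":

create table bco (browser_id string, country string, dma_code string)
partitioned by (dt string, offer string);

set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions=20000;
set hive.exec.max.dynamic.partitions.pernode=200000;

-- The dynamic partition column (offer) is filled positionally from the
-- last column of the SELECT list (offer_id). DISTRIBUTE BY offer_id
-- sends each offer to a single reducer, so each partition's file is
-- opened by one writer instead of by every task.
insert overwrite table bco partition (dt='20100721', offer)
select browser_id, country, dma_code, offer_id
from baseline_raw
where gen_date=20100721 and blt='bco'
distribute by offer_id;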
