I'm currently running a hive build from trunk, revision number 911889. I've built a UDTF called map_explode which just emits the key and value of each entry in a map as a row in the result table. The table I'm running it against looks like:

hive> describe mytable;
product string from deserializer
...
interactions map<string,int> from deserializer

If I use map_explode in the select clause, I get the expected results:

hive> select map_explode(interactions) as (key, value) from mytable where day = '2010-02-18' and hour = 1 limit 10;
...
OK
invite_impression 1
invite_impression 1
invite_impression 1
invite_impression 1
rollout 12
invite_impression 1
invite_impression 1
invite_impression 1
rollout 4
invite_impression 1
Time taken: 22.11 seconds

However, if I try to use LATERAL VIEW to relate the exploded values back to the parent table, like so:

hive> select product, key, sum(value) from mytable LATERAL VIEW map_explode(interactions) interacts as key, value where day = '2010-02-18' and hour = 1 group by product, key;

I get the following error:

FAILED: Unknown exception: null

Looking in hive.log, I see the following stack trace:

2010-02-19 14:15:17,215 ERROR ql.Driver (SessionState.java:printError(255)) - FAILED: Unknown exception: null
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory$ColumnExprProcessor.process(ExprWalkerProcFactory.java:87)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:129)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:103)
at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds(ExprWalkerProcFactory.java:273)
at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.mergeWithChildrenPred(OpProcFactory.java:317)
at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.process(OpProcFactory.java:258)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:129)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:103)
at org.apache.hadoop.hive.ql.ppd.PredicatePushDown.transform(PredicatePushDown.java:103)
at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:74)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5758)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:125)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:304)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:377)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:303)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

I peeked at ExprWalkerProcFactory, but couldn't readily see what was causing the problem. Any ideas?

Jason


  • Yongqiang He at Feb 19, 2010 at 10:50 pm
    Hi Jason,

    This is a known bug, see https://issues.apache.org/jira/browse/HIVE-1056

    As a workaround, you can disable predicate pushdown (ppd) with "set hive.optimize.ppd=false;"

    Thanks
    Yongqiang
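
    A minimal sketch of that workaround as a CLI session, assuming the same table, partition columns, and map_explode UDTF described in the original post. The setting applies only to the current session. The final line restoring it is an optional addition, not part of the reply above; with pushdown disabled, filters may be applied later in the plan, so other queries could run slower.

    ```sql
    -- Turn off predicate pushdown for this session (workaround for HIVE-1056).
    set hive.optimize.ppd=false;

    -- Re-run the LATERAL VIEW query that previously failed with the NullPointerException.
    select product, key, sum(value)
    from mytable LATERAL VIEW map_explode(interactions) interacts as key, value
    where day = '2010-02-18' and hour = 1
    group by product, key;

    -- Optional: turn predicate pushdown back on for later queries in this session.
    set hive.optimize.ppd=true;
    ```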
  • Zheng Shao at Feb 19, 2010 at 10:58 pm
    Jason,

    Do you want to open a JIRA and contribute your map_explode function to Hive?
    That will be greatly appreciated.


    Zheng


Discussion Overview
group: user @
categories: hive, hadoop
posted: Feb 19, '10 at 10:23p
active: Feb 19, '10 at 10:58p
posts: 3
users: 3
website: hive.apache.org
