Grokbase Groups Hive user April 2011
Hello,

I have been trying to optimize one of my longer-running queries using a
MAPJOIN hint. The query is fairly complex: it joins my base table (1+
billion rows) with multiple metadata tables, each of which is relatively
small.

I already use a STREAMTABLE hint for my large table and have provided a
MAPJOIN hint for each metadata table.

Everything runs smoothly until the last step: each map/reduce step now
finishes in under 10 minutes, down from 4+ hours before I added the hints.
The last step, however, does not complete even a single map task and fails
with the following exception:

2011-04-07 19:09:49,254 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.lang.RuntimeException
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:188)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:218)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:386)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:598)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:81)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:386)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:598)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:347)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:171)
    ... 4 more

I know this looks more like a Hadoop-side exception, but since the jars
that run the job are auto-generated, I don't know where best to start
debugging this issue. Any pointers?

Also, when adding multiple hints, do I just comma-separate them, or is
there something else I need to take care of? For example:

SELECT /*+ STREAMTABLE(t1), MAPJOIN(t2), MAPJOIN(t3), MAPJOIN(t4) */ .....

Thanks,

Viral

Discussion Overview
group: user @
categories: hive, hadoop
posted: Apr 8, '11 at 3:07a
active: Apr 8, '11 at 3:07a
posts: 1
users: 1 (Viral Bajaria: 1 post)
website: hive.apache.org