Grokbase Groups Hive user July 2010
Hi Folks

This issue occurs on Hive 0.4 and 0.5. I wanted to wait on opening a JIRA
ticket until I ran it by the community first.

I'm testing Hive 0.5 on Apache Hadoop 0.20.2, which is running on IBM
Java 6 (32-bit x86, SR8), available here:
https://www.ibm.com/developerworks/java/jdk/linux/download.html

To recreate this I'm using the pokes table loaded with data from the
examples directory, per the tutorial, and I run the following in the Hive
CLI (bin/hive): select count(1) from pokes;

This works just fine on Sun/Oracle Java 6, but when I change the
Hadoop-env to point to IBM Java 6, it fails in the Map with the following
exception:

Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector incompatible with org.apache.hadoop.hive.serde2.objectinspector.primitive.LongObjectInspector
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCount$GenericUDAFCountEvaluator.merge(GenericUDAFCount.java:104)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:113)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:451)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:591)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:500)
    ... 14 more

Note that the line number in GenericUDAFCount here is off by 4 because of
a couple of LOG.info calls I added for debugging purposes. The net of it
is that it fails when it attempts the following cast in the merge
method:
(LongObjectInspector)inputOI
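
To make the failure mode concrete, here is a minimal sketch of that cast going wrong. It does not use Hive at all: the interfaces and classes below are hypothetical stand-ins that only mimic the shape of the inspector types named in the stack trace, and the values passed in are arbitrary. The point is just that a cast to a long inspector compiles fine but throws at run time when the object handed in is really an int inspector.

```java
// Stand-in types only: these mimic the shape of Hive's object inspectors
// but are NOT the real Hive classes.
class CastDemo {
    interface ObjectInspector {}

    interface LongObjectInspector extends ObjectInspector {
        long get(Object o);
    }

    interface IntObjectInspector extends ObjectInspector {
        int get(Object o);
    }

    static class WritableLongInspector implements LongObjectInspector {
        public long get(Object o) { return (Long) o; }
    }

    static class WritableIntInspector implements IntObjectInspector {
        public int get(Object o) { return (Integer) o; }
    }

    // Mirrors the cast quoted above: merge assumes its input describes
    // a partial count, i.e. a long.
    static String merge(ObjectInspector inputOI, Object partial) {
        try {
            long count = ((LongObjectInspector) inputOI).get(partial);
            return "merged " + count;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        // Input really is a long partial count: the cast succeeds.
        System.out.println(merge(new WritableLongInspector(), 500L));
        // Input is an int column inspector: the cast fails at run time.
        System.out.println(merge(new WritableIntInspector(), 238));
    }
}
```

The compiler accepts the cast because any ObjectInspector could in principle also implement LongObjectInspector, so the mismatch only shows up when the code runs.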

This is where it gets weird. In Sun Java, this method gets called in the
Reducer. In IBM Java, it gets called in the Mapper. If I use EXPLAIN in
the Hive CLI, the execution plans are identical regardless of which JRE is
being used in Hadoop. In Sun Java, the type for inputOI is a BigInt which
is being derived off of a single column schema called _col0_ in the
reducer (likely the output tuple of the count result) and casts to a Long
with no problem. In IBM Java, this call is happening in the Map and
inputOI is being derived off of what appears to be the first column of the
pokes table schema, which is an int and therefore fails when being
cast to a Long. It appears the cast is merely symptomatic of a difference
in how the plan is actually executed, even though EXPLAIN shows none.
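
One way to picture the symptom described above is a toy model of a count aggregator that either iterates over raw rows or merges partial counts, depending on the mode it is running in. The four mode names match Hive's GenericUDAFEvaluator.Mode enum; everything else (the CountModel class, the feed helper, the row values) is hypothetical and written only for illustration. This is a sketch of the observed behaviour, not a diagnosis of why the mapper ends up on the merge path under IBM Java.

```java
// Toy model, not Hive code: a count aggregator whose behaviour depends on mode.
class CountModel {
    // Same names as Hive's GenericUDAFEvaluator.Mode; the rest is illustrative.
    enum Mode { PARTIAL1, PARTIAL2, FINAL, COMPLETE }

    private final Mode mode;
    private long count;

    CountModel(Mode mode) { this.mode = mode; }

    void aggregate(Object input) {
        if (mode == Mode.PARTIAL1 || mode == Mode.COMPLETE) {
            // iterate: the input is a raw row, so just count it.
            count++;
        } else {
            // merge: the input must already be a partial count (a Long).
            count += (Long) input;
        }
    }

    long result() { return count; }

    // Hypothetical helper: run one aggregator over some inputs and report
    // either the count or the exception that stopped it.
    static String feed(Mode mode, Object... inputs) {
        CountModel model = new CountModel(mode);
        try {
            for (Object input : inputs) {
                model.aggregate(input);
            }
            return "count=" + model.result();
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        // Expected flow: the map side iterates over int rows...
        System.out.println(feed(Mode.PARTIAL1, 238, 86, 311));
        // ...and the reduce side merges the long partial count.
        System.out.println(feed(Mode.FINAL, 3L));
        // Symptom in this thread: the merge path sees a raw int column.
        System.out.println(feed(Mode.FINAL, 238));
    }
}
```

In this model the third call fails for the same reason as the reported exception: the merge path is handed an int where it expects a long partial result.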

Debugging from this point really requires someone who understands Hive
execution plans better than I do. Is there anyone who can help with this
issue? It is really easy to replicate: download the IBM JDK, modify your
hadoop env to point to the extracted dir of the IBM JDK, and run a select
count on any table.

Regards
Steve Watt

  • Aaron McCurry at Jul 16, 2010 at 11:57 pm
    I ran into this same problem on the IBM JVM. I didn't spend a lot of time
    trying to fix it because we got new hardware where I could run the Sun JVM.
    Sorry.

    Aaron


Posted Jul 16, '10 at 4:18p
Active Jul 16, '10 at 11:57p