Hive user mailing list, March 2011
Hi,

I am trying to connect the Hive shell running on my laptop to a remote
Hadoop/HBase cluster and test out the HBase/Hive integration. I managed to
connect and create the table in HBase from the remote Hive shell. I am also
passing the auxpath parameter to the shell (specifying the Hive/HBase
integration jars). In addition, I have copied these files to HDFS as well
(I am using the user name hadoop, so the jars are stored in HDFS under
/user/hadoop).
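
For reference, an auxpath invocation of this kind looks roughly like the
following (the jar paths, version numbers, and ZooKeeper host here are just
for illustration and will differ per install):

hive --auxpath /opt/hive/lib/hive-hbase-handler-0.7.0.jar,/opt/hive/lib/hbase-0.89.0.jar,/opt/hive/lib/zookeeper-3.3.1.jar \
  -hiveconf hbase.zookeeper.quorum=zk.example.com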

However, when I fire a query on the HBase table - select * from h1 where
key=12; - the MapReduce job launches but the map task fails with the
following error:

----

java.io.IOException: Cannot create an instance of InputSplit class =
org.apache.hadoop.hive.hbase.HBaseSplit:org.apache.hadoop.hive.hbase.HBaseSplit
at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:143)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:333)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)

----

This indicates that the map task is unable to locate the Hive/HBase storage
handler classes it needs at runtime, even though they were specified in the
auxpath and uploaded to HDFS.

Any ideas/pointers/debug options on what I might be doing wrong? Any help is
much appreciated.

P.S. The exploded jars do get copied under the taskTracker directory on the
cluster node.

Thanks


  • Edward Capriolo at Mar 16, 2011 at 5:00 pm

    On Wed, Mar 16, 2011 at 12:51 PM, Abhijit Sharma wrote:
    [snip]
    I have seen this error. It is oddness between the Hadoop, Hive, and
    MapReduce classpaths.

    This is what I do (wildcarded names, since exact jar names vary by
    release):

    mkdir $HIVE_HOME/auxlib
    # copy the hbase jars and the hive-hbase handler jar into auxlib
    cp $HBASE_HOME/hbase-*.jar $HIVE_HOME/auxlib/
    cp $HIVE_HOME/lib/hive-hbase-handler-*.jar $HIVE_HOME/auxlib/

    auxlib gets pushed out by the distributed cache on each job, so you do
    not need to use ADD JAR XXXX;

    But that is not enough! DOH! Planning the job and getting the splits
    happen before the map tasks are launched.

    For this I drop all the HBase libs into $HADOOP_HOME/lib, but only on
    the machine that is launching the job.

    You can fiddle around with HADOOP_CLASSPATH and achieve similar results.
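
    A rough sketch of that route (jar paths and versions here are just for
    illustration):

    export HADOOP_CLASSPATH=$HIVE_HOME/lib/hive-hbase-handler-0.7.0.jar:$HBASE_HOME/hbase-0.89.0.jar:$HBASE_HOME/lib/zookeeper-3.3.1.jar
    hive -e 'select * from h1 where key=12;'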

    Good luck.
  • Abhijit Sharma at Mar 16, 2011 at 5:22 pm
    Thanks a ton - that worked like a charm. I have been struggling with this
    the whole day! I did not need to specify auxlib or auxpath - just putting
    the 3 Hive/HBase jars in HADOOP_HOME/lib on the remote job server worked
    fine. Btw, if I use ADD JAR from Hive, will that obviate the need to put
    the jars in HADOOP_HOME/lib?
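
    Concretely, the fix was along these lines on the remote job server
    (wildcarded names, since the exact jar versions vary by release):

    cp hive-hbase-handler-*.jar hbase-*.jar zookeeper-*.jar $HADOOP_HOME/lib/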

    I guess this is not the ideal scenario - but at least I can proceed.

    Regards
    Abhijit
    On 16 March 2011 22:29, Edward Capriolo wrote:

    [snip]


    --
    Regards,
    Abhijit
  • Edward Capriolo at Mar 16, 2011 at 5:29 pm

    On Wed, Mar 16, 2011 at 1:21 PM, Abhijit Sharma wrote:
    [snip]
    Btw, if I use ADD JAR from Hive, will that obviate the need to put the
    jars in HADOOP_HOME/lib?
    [snip]
    Correct. There are many ways you can make it work. It is mostly a
    matter of preference. I prefer auxlib so I can avoid running ADD JAR.
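
    For comparison, the ADD JAR route looks roughly like this, run once per
    session (the jar paths are illustrative):

    hive> ADD JAR /opt/hive/lib/hive-hbase-handler-0.7.0.jar;
    hive> ADD JAR /opt/hbase/hbase-0.89.0.jar;
    hive> ADD JAR /opt/hbase/lib/zookeeper-3.3.1.jar;
    hive> select * from h1 where key=12;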
