Hi all,

We are going to hold the second Hive User Group Meeting at 7 PM on
Thursday, 3/18/2010.

The agenda will be:

* Hive Tutorial: 20 min
* Hive User Case Study: 20 min
* New Features and API: 25 min
  - JDBC/ODBC and CTAS
  - UDF/UDAF/UDTF
  - Create View/HBaseInputFormat
  - Hive Join Strategy
  - SerDe

The audience is beginner to intermediate Hive users/developers.

*** The details are here: http://www.facebook.com/event.php?eid=319237846974 ***
*** Please RSVP so we can schedule logistics accordingly. ***

--
Yours,
Zheng


  • Massoud Mazar at Feb 26, 2010 at 9:59 pm
    Is it possible to run release-0.5.0-rc0 on top of hadoop 0.22.0 (trunk)?
  • Zheng Shao at Feb 27, 2010 at 9:24 am
    Hi Mazar,

    We have not tried Hive on Hadoop higher than 0.20 yet.

    However, Hive has a shim infrastructure that makes it easy to port
    to new Hadoop versions.
    Please see the shim directory inside Hive.

    Zheng
  • Massoud Mazar at Mar 1, 2010 at 9:09 pm
    Zheng,

    Thanks for answering.
    I've decided to give it (Hive 0.5.0 on Hadoop 0.22) a try. I'm a developer, but not a Java developer, so with some initial help I can spend time working on this.
    Just to start, I modified ShimLoader.java and copied the HADOOP_SHIM_CLASSES and JETTY_SHIM_CLASSES entries from 0.20 to 0.22 to see where it breaks.

    I built Hive 0.5.0, deployed it to a running Hadoop 0.22, and ran "show tables;" in Hive. I got this:

    Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.security.UserGroupInformation: method <init>()V not found
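    at org.apache.hadoop.security.UnixUserGroupInformation.<init>(UnixUserGroupInformation.java:69)
    at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:271)
    at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:300)
    at org.apache.hadoop.hive.ql.Driver.<init>(Driver.java:243)
    at org.apache.hadoop.hive.ql.processors.CommandProcessorFactory.get(CommandProcessorFactory.java:40)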
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:116)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:187)

    Now, when I look at the UserGroupInformation class in hadoop 0.22 source code, it does not have a parameter-less constructor, but documentation at http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/security/UserGroupInformation.html shows such a constructor.

    My question is: can this be fixed with shims, or is it a problem with Hadoop?
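
For readers unfamiliar with the loader, the experiment described in this post (reusing the 0.20 shim entries for 0.22) can be sketched roughly as follows. This is a simplified, self-contained illustration and not the actual ShimLoader source: the class name ShimLoaderSketch, the shim class name in the map, and the majorVersion and shimClassFor helpers are assumptions made for the example, and the JETTY_SHIM_CLASSES map is omitted.

```java
import java.util.HashMap;
import java.util.Map;

public class ShimLoaderSketch {
    // Maps a Hadoop major version to the name of the shim class used for it.
    // The class name below is illustrative, not taken from the Hive source.
    static final Map<String, String> HADOOP_SHIM_CLASSES = new HashMap<>();

    static {
        HADOOP_SHIM_CLASSES.put("0.20", "example.shims.Hadoop20Shims");
        // The experiment: point 0.22 at the 0.20 shims to see where it breaks.
        HADOOP_SHIM_CLASSES.put("0.22", HADOOP_SHIM_CLASSES.get("0.20"));
    }

    // Reduces a full version string such as "0.22.0-SNAPSHOT" to "0.22".
    public static String majorVersion(String fullVersion) {
        String[] parts = fullVersion.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unexpected version: " + fullVersion);
        }
        return parts[0] + "." + parts[1];
    }

    // Looks up the shim class name registered for the given Hadoop version.
    public static String shimClassFor(String fullVersion) {
        String major = majorVersion(fullVersion);
        String shimClass = HADOOP_SHIM_CLASSES.get(major);
        if (shimClass == null) {
            throw new IllegalStateException("No shim registered for Hadoop " + major);
        }
        return shimClass;
    }

    public static void main(String[] args) {
        System.out.println(shimClassFor("0.22.0-SNAPSHOT"));
    }
}
```

Registering the mapping only gets the shim class loaded. Any Hive code that calls a changed Hadoop API directly, as Driver does in the stack trace above, still fails at run time.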

  • Zheng Shao at Mar 2, 2010 at 12:07 am
    Hi Massoud,

    Great work!

    Yes, this is exactly what shims are for. When we see an API change across
    Hadoop versions, we add a new function to the shims interface and
    implement it in each of the shims.

    For this one, you probably want to wrap the logic in Driver.java into
    a single shim interface function, and implement that function in all
    shim versions.

    Does that make sense?

    Zheng
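
The approach suggested in this reply might look roughly like the sketch below. It is self-contained and does not use the real Hadoop or Hive classes: ShimSketch, HadoopShims as written here, getCurrentUserName, and the two implementation classes are hypothetical names, and the method bodies are placeholders standing in for the version-specific UserGroupInformation calls.

```java
public class ShimSketch {
    /** One function per API that differs across Hadoop versions (hypothetical). */
    public interface HadoopShims {
        String getCurrentUserName();
    }

    /** Would wrap the login logic currently in Driver.java, written against the 0.20 API. */
    public static class Hadoop20Shims implements HadoopShims {
        @Override
        public String getCurrentUserName() {
            // Placeholder for the 0.20-style call.
            return "user-via-0.20-api";
        }
    }

    /** Same function, implemented against whatever the 0.22 API provides. */
    public static class Hadoop22Shims implements HadoopShims {
        @Override
        public String getCurrentUserName() {
            // Placeholder for the 0.22-style call.
            return "user-via-0.22-api";
        }
    }

    /** Picks the implementation for the Hadoop major version in use. */
    public static HadoopShims shimsFor(String majorVersion) {
        switch (majorVersion) {
            case "0.20":
                return new Hadoop20Shims();
            case "0.22":
                return new Hadoop22Shims();
            default:
                throw new IllegalArgumentException("No shim for Hadoop " + majorVersion);
        }
    }

    public static void main(String[] args) {
        // Calling code depends only on the interface, never on a specific version.
        HadoopShims shims = shimsFor("0.22");
        System.out.println(shims.getCurrentUserName());
    }
}
```

With this shape, the caller (Driver in the thread) no longer references the changed class directly, so a Hadoop API change is confined to one shim implementation.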
  • Massoud Mazar at Mar 3, 2010 at 8:51 pm
    I just installed Hive 0.5.0 and HWI does not work. I could not find the .war file:

    [hadoop@centos1 hive]$ bin/hive --service hwi
    ls: /hadoop/hive/lib/hive-hwi-*.war: No such file or directory
    10/03/03 15:42:39 INFO hwi.HWIServer: HWI is starting up
    10/03/03 15:42:39 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
    10/03/03 15:42:39 INFO mortbay.log: jetty-6.1.14
    10/03/03 15:42:39 INFO mortbay.log: Started SocketConnector@192.168.1.22:9999
  • Edward Capriolo at Mar 3, 2010 at 9:07 pm

    I feel like a broken record on this :)

    Luckily, I think we have this licked for the final time. You must be
    working with 0.5.0-rc0.

    https://issues.apache.org/jira/browse/HIVE-1183

    In any case, you can find the war file in your build directory.
    Move it to hive/lib

    Add this to your hive-site.xml if need be:
    <property>
    <name>hive.hwi.war.file</name>
    <value>lib/hive-hwi-0.5.0.war</value>
    </property>

    Edward
  • Zheng Shao at Mar 1, 2010 at 7:58 pm
    We also created a Meetup group in case you prefer to register on meetup.com

    http://www.meetup.com/Hive-User-Group-Meeting/calendar/12741356/

    We are hosting a Hive User Group Meeting, open to all current and
    potential Hadoop/Hive users.

    Agenda:
    * Hive Tutorial (Carl Steinbach, Cloudera): 20 min
    * Hive User Case Study (Eva Tse, Netflix): 20 min
    * New Features and API (Hive team, Facebook): 25 min
      - JDBC/ODBC and CTAS (Create Table As Select)
      - UDF/UDAF/UDTF (user-defined functions)
      - Create View/HBaseInputFormat (Hive and HBase integration)
      - Hive Join Strategy (how Hive does the join)
      - SerDe (Hive's serialization/deserialization framework)


    Hive is a scalable data warehouse infrastructure built on top of
    Hadoop. It provides tools for easy data ETL, a mechanism to put
    structure on the data, and the capability to query and analyze
    large data sets stored in Hadoop files. Hive defines a simple SQL-like
    query language, called HiveQL, that enables users familiar with SQL to
    query the data. At the same time, this language allows
    programmers who are familiar with MapReduce to plug in
    their custom mappers and reducers to perform more sophisticated
    analysis.

    The largest current deployment of Hive is the silver cluster at
    Facebook, which consists of 1100 nodes, each with 8 CPU cores and
    twelve 1 TB disks. The total capacity is 8800 CPU cores with 13 PB of raw
    storage space. More than 4 TB of compressed data (20+ TB uncompressed)
    is loaded into Hive every day.


    If you'd like to network with fellow Hive/Hadoop users online, feel
    free to find them here:
    http://www.facebook.com/event.php?eid=319237846974



    Zheng
  • Zheng Shao at Mar 15, 2010 at 9:00 pm
    Just a reminder that we have the Hive User Group Meeting this Thursday at Facebook.

    Please register on
    http://www.meetup.com/Hive-User-Group-Meeting/calendar/12741356/ if
    you plan to come.

    Zheng
