Grokbase Groups Hive user March 2010
I am trying out Hive, using Cloudera's EC2 distribution (Hadoop
0.18.3, Hive 0.4.1, I believe).

I'm trying to run the query below, which causes every map task to
fail with an NPE before making any progress:

java.lang.NullPointerException
at org.apache.hadoop.hive.serde2.lazy.LazyStruct.uncheckedGetField(LazyStruct.java:205)
at org.apache.hadoop.hive.serde2.lazy.LazyStruct.getField(LazyStruct.java:182)
at org.apache.hadoop.hive.serde2.objectinspector.LazySimpleStructObjectInspector.getStructFieldData(LazySimpleStructObjectInspector.java:141)
at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.evaluate(ExprNodeColumnEvaluator.java:53)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:74)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:332)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:49)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:332)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:175)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:71)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)


The query:
-- Get the node's max price and corresponding year/day/hour/month
select isone.node_id, isone.day, isone.hour, isone.lmp
from (select max(lmp) as mlmp, node_id
from isone_lmp
where isone_lmp.node_id = 400
group by node_id) maxlmp
join isone_lmp isone on ( isone.node_id = maxlmp.node_id
and isone.lmp=maxlmp.mlmp );

The table:
CREATE TABLE isone_lmp (
node_id int,
day string,
hour int,
minute int,
energy float,
congestion float,
loss float,
lmp float
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

The data looks like the following:
396,20090120,00,00,62.77,0,.78,63.55
397,20090120,00,00,62.77,0,.65,63.42
398,20090120,00,00,62.77,0,.65,63.42
399,20090120,00,00,62.77,0,.65,63.42
400,20090120,00,00,62.77,0,.65,63.42
401,20090120,00,00,62.77,0,-1.02,61.75
405,20090120,00,00,62.77,0,.21,62.98
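
Nothing in this thread establishes that malformed input caused the failure, but a quick local check that rows like these match the eight-column table definition can help rule it out. The following is a minimal sketch in Python; the `parse_row` helper and the truncated third sample line are hypothetical and not part of the original data.

```python
# Column names and types mirror the CREATE TABLE statement for isone_lmp.
SCHEMA = [
    ("node_id", int),
    ("day", str),
    ("hour", int),
    ("minute", int),
    ("energy", float),
    ("congestion", float),
    ("loss", float),
    ("lmp", float),
]


def parse_row(line):
    """Return a dict for a well-formed line, or None if the field count
    or any field's type does not match SCHEMA. (Hypothetical helper.)"""
    fields = line.rstrip("\n").split(",")
    if len(fields) != len(SCHEMA):
        return None
    row = {}
    for (name, cast), raw in zip(SCHEMA, fields):
        try:
            row[name] = cast(raw)
        except ValueError:
            return None
    return row


sample = [
    "396,20090120,00,00,62.77,0,.78,63.55",
    "401,20090120,00,00,62.77,0,-1.02,61.75",
    "405,20090120,00,00",  # deliberately truncated; invented for this sketch
]

# 1-based line numbers of rows that do not fit the schema.
bad_lines = [n for n, line in enumerate(sample, start=1) if parse_row(line) is None]
print(bad_lines)
```

On a real 15GB data set the same function could be applied line by line to a file or a sample of it, collecting only the offending line numbers.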

It's about 15GB of data total; I can do a simple "select count(1) from
isone_lmp;" which executes as expected. Any thoughts? I've been able
to execute the same query on a smaller subset of data (2M rows as
opposed to 500M) on a non-distributed setup locally.

Thanks.
-Tom
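
To sanity-check the query logic on a handful of rows without a cluster, the same statement can be run against an in-memory SQLite database. This is only a sketch under that assumption: SQLite is a stand-in, not Hive, so it says nothing about the NPE itself, and the column types are SQLite's nearest equivalents of the Hive types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE isone_lmp (
        node_id INTEGER, day TEXT, hour INTEGER, minute INTEGER,
        energy REAL, congestion REAL, loss REAL, lmp REAL
    )
""")

# The sample rows quoted in the post.
sample = [
    "396,20090120,00,00,62.77,0,.78,63.55",
    "397,20090120,00,00,62.77,0,.65,63.42",
    "398,20090120,00,00,62.77,0,.65,63.42",
    "399,20090120,00,00,62.77,0,.65,63.42",
    "400,20090120,00,00,62.77,0,.65,63.42",
    "401,20090120,00,00,62.77,0,-1.02,61.75",
    "405,20090120,00,00,62.77,0,.21,62.98",
]
rows = []
for line in sample:
    f = line.split(",")
    rows.append((int(f[0]), f[1], int(f[2]), int(f[3]),
                 float(f[4]), float(f[5]), float(f[6]), float(f[7])))
conn.executemany("INSERT INTO isone_lmp VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Same shape as the query in the post: find the max lmp for node 400,
# then join back to recover the day and hour at which it occurred.
query = """
    SELECT isone.node_id, isone.day, isone.hour, isone.lmp
    FROM (SELECT max(lmp) AS mlmp, node_id
          FROM isone_lmp
          WHERE isone_lmp.node_id = 400
          GROUP BY node_id) maxlmp
    JOIN isone_lmp isone
      ON (isone.node_id = maxlmp.node_id AND isone.lmp = maxlmp.mlmp)
"""
result = conn.execute(query).fetchall()
print(result)
```

With only one row for node 400 in the sample, the join returns that single row; on the full data it would return every row that ties for the maximum.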


  • Zheng Shao at Mar 5, 2010 at 10:05 am
    Do you want to try the Hive 0.5.0 release or Hive trunk?
    We should have provided better error messages here:
    https://issues.apache.org/jira/browse/HIVE-1216

    Zheng
  • Tom Nichols at Mar 15, 2010 at 3:43 pm
    Just a follow-up here -- when I upgraded to Hive 0.5 everything
    worked... Thanks again for the help.
  • Sanjay Sharma at Mar 5, 2010 at 2:09 pm
    Hi,
    I am trying to get CREATE TABLE ... AS SELECT working in Hive 0.5.0 but am still getting a {mismatched input 'AS' expecting EOF} error.

    The JIRA HIVE-31 patch seems to be present in Hive 0.5.0, so it might be something to do with the syntax.

    Any suggestions on what the correct syntax is, or whether it is supposed to work in 0.5?


    Regards,
    Sanjay

    Impetus Technologies is participating at the CTIA Wireless 2010 from 23rd to 25th March 2010. Meet Impetus in Las Vegas to experience our mobile and wireless domain expertise. Click http://impetus.com/events to know more.

    Follow our updates on www.twitter.com/impetuscalling.

    NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
  • Sonal Goyal at Mar 5, 2010 at 2:42 pm
    Sanjay,

    I use the following:

    create table products_bought (....);
    insert overwrite table products_bought select ... from tableMaster;
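
The two-step pattern above (declare the table, then fill it from a query) can be tried locally with SQLite as a stand-in. This is a rough sketch, not Hive: SQLite has no INSERT OVERWRITE, so a DELETE followed by INSERT ... SELECT approximates the replace-contents behaviour, and the table names, columns, and filter are hypothetical because the original statements elide them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table and columns, invented for this sketch.
conn.execute("CREATE TABLE master_demo (product_id INTEGER, qty INTEGER)")
conn.executemany("INSERT INTO master_demo VALUES (?, ?)",
                 [(1, 2), (2, 0), (3, 5)])

# Step 1: declare the target table with an explicit schema.
conn.execute("CREATE TABLE bought_demo (product_id INTEGER, qty INTEGER)")
conn.execute("INSERT INTO bought_demo VALUES (99, 1)")  # stale row to be replaced

# Step 2: replace the target's contents with the query result.
# DELETE + INSERT ... SELECT stands in for Hive's INSERT OVERWRITE TABLE.
with conn:
    conn.execute("DELETE FROM bought_demo")
    conn.execute(
        "INSERT INTO bought_demo "
        "SELECT product_id, qty FROM master_demo WHERE qty > 0"
    )

bought = conn.execute(
    "SELECT product_id, qty FROM bought_demo ORDER BY product_id"
).fetchall()
print(bought)
```

The stale row is gone after step 2, which is the property that distinguishes an overwrite from a plain append.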


    Thanks and Regards,
    Sonal


  • Ning Zhang at Mar 5, 2010 at 4:37 pm
    Can you post your query? It should work like

    create table T as select a, b+1 b1, c*2 c2 from S where ...


    You don't need to specify the schema of T because it is derived from the select clause. T's column names are the same as the alias names in the select clause.
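
The alias-to-column-name behaviour described here can be observed locally with SQLite, which also supports CREATE TABLE ... AS SELECT. This is a sketch using SQLite as a stand-in, so it illustrates the idea rather than confirming Hive's behaviour; the contents of S and the `a > 1` predicate are assumptions, since the original statement leaves the WHERE clause elided.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical contents for the source table S.
conn.execute("CREATE TABLE S (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany("INSERT INTO S VALUES (?, ?, ?)",
                 [(1, 10, 100), (2, 20, 200)])

# No schema is given for T; it is derived from the select list.
# The predicate a > 1 is invented for this sketch.
conn.execute("CREATE TABLE T AS SELECT a, b + 1 b1, c * 2 c2 FROM S WHERE a > 1")

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(T)")]
rows = conn.execute("SELECT * FROM T").fetchall()
print(columns, rows)
```

The derived columns take the alias names `b1` and `c2`, while the unaliased column keeps its source name `a`.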

    Thanks,
    Ning
  • Sanjay Sharma at Mar 9, 2010 at 5:15 am
    Thanks Ning.
    It works now. I had to restart the Hadoop cluster to get it running, though, and could not understand why.

    Regards,
    Sanjay Sharma
    Impetus
    t: +91-120-4363300 Extn 2761

