Grokbase Groups Hive user July 2010
FAQ
Hi,

I have data with complex types (map, struct, array of maps) stored as
a text file. I am able to successfully create an external table based on
this data and further build a lateral view on it:



hive -e 'select rownum, bag_item from complex_text LATERAL VIEW
explode(bagofmap) explodedTable AS bag_item ;'

1 {"k1":"v1","k2":"v2"}

1 {"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}

2 {"a1":"b1","a2":"b2"}

2 {"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}



Here is how I created this table:

CREATE external TABLE if not exists complex_text (

mymap map<string, string>,

mytuple struct<num:int,str:string,dbl:double>,

bagofmap array<map<string,string>>,

rownum int

)

row format DELIMITED FIELDS TERMINATED BY '\001' COLLECTION ITEMS
TERMINATED BY '\002'

MAP KEYS TERMINATED BY '\003'



stored as textfile

location '/user/pradeepk/complex_text';



The data contents are (^C stands for ctrl-C .i.e '\003'):

mymapk1^Cmymapv1^Bmymapk2^Cmymapv2^A1^Bhello^B2.5^Ak1^Dv1^Ck2^Dv2^Bk3^Dv
3^Ck4^Dv4^Ck5^Dv5^Ck6^Dv6^A1

mymapk3^Cmymapv3^Bmymapk4^Cmymapv4^A2^Bbye^B3.5^Aa1^Db1^Ca2^Db2^Ba3^Db3^
Ca4^Db4^Ca5^Db5^Ca6^Db6^A2



Now I created a table using RCFile for storage based on the above table
as follows:

create table complex_rcfile

stored as RCFile

location '/user/pradeepk/complex_rcfile'

as select

mymap,

mytuple,

bagofmap,

rownum

from

complex_text;



The same query against this table gives incorrect results (nulls for the
rownum column):

hive -e 'select rownum, bag_item from complex_rcfile LATERAL VIEW
explode(bagofmap) explodedTable AS bag_item ;'

NULL {"a1":"b1","a2":"b2"}

NULL {"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}

NULL {"k1":"v1","k2":"v2"}

NULL {"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}



I have a feeling the delimiters are not being correctly interpreted in
RCFile format. Strangely a non lateral view query works fine:

hive -e 'select rownum, bagofmap from complex_rcfile;'

2
[{"a1":"b1","a2":"b2"},{"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}]

1
[{"k1":"v1","k2":"v2"},{"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}]



Any pointers?



Thanks,

Pradeep

Search Discussions

  • Pradeep Kamath at Jul 8, 2010 at 10:57 pm
    Any pointers?



    Thanks,

    Pradeep



    ________________________________

    From: Pradeep Kamath
    Sent: Wednesday, July 07, 2010 4:47 PM
    To: hive-user@hadoop.apache.org
    Subject: Complex types, lateral view and RCFile



    Hi,

    I have data with complex types (map, struct, array of maps) stored as
    a text file. I am able to successfully create an external table based on
    this data and further build a lateral view on it:



    hive -e 'select rownum, bag_item from complex_text LATERAL VIEW
    explode(bagofmap) explodedTable AS bag_item ;'

    1 {"k1":"v1","k2":"v2"}

    1 {"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}

    2 {"a1":"b1","a2":"b2"}

    2 {"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}



    Here is how I created this table:

    CREATE external TABLE if not exists complex_text (

    mymap map<string, string>,

    mytuple struct<num:int,str:string,dbl:double>,

    bagofmap array<map<string,string>>,

    rownum int

    )

    row format DELIMITED FIELDS TERMINATED BY '\001' COLLECTION ITEMS
    TERMINATED BY '\002'

    MAP KEYS TERMINATED BY '\003'



    stored as textfile

    location '/user/pradeepk/complex_text';



    The data contents are (^C stands for ctrl-C .i.e '\003'):

    mymapk1^Cmymapv1^Bmymapk2^Cmymapv2^A1^Bhello^B2.5^Ak1^Dv1^Ck2^Dv2^Bk3^Dv
    3^Ck4^Dv4^Ck5^Dv5^Ck6^Dv6^A1

    mymapk3^Cmymapv3^Bmymapk4^Cmymapv4^A2^Bbye^B3.5^Aa1^Db1^Ca2^Db2^Ba3^Db3^
    Ca4^Db4^Ca5^Db5^Ca6^Db6^A2



    Now I created a table using RCFile for storage based on the above table
    as follows:

    create table complex_rcfile

    stored as RCFile

    location '/user/pradeepk/complex_rcfile'

    as select

    mymap,

    mytuple,

    bagofmap,

    rownum

    from

    complex_text;



    The same query against this table gives incorrect results (nulls for the
    rownum column):

    hive -e 'select rownum, bag_item from complex_rcfile LATERAL VIEW
    explode(bagofmap) explodedTable AS bag_item ;'

    NULL {"a1":"b1","a2":"b2"}

    NULL {"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}

    NULL {"k1":"v1","k2":"v2"}

    NULL {"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}



    I have a feeling the delimiters are not being correctly interpreted in
    RCFile format. Strangely a non lateral view query works fine:

    hive -e 'select rownum, bagofmap from complex_rcfile;'

    2
    [{"a1":"b1","a2":"b2"},{"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}]

    1
    [{"k1":"v1","k2":"v2"},{"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}]



    Any pointers?



    Thanks,

    Pradeep
  • Paul Yang at Jul 8, 2010 at 11:14 pm
    This seems like an issue with the column pruner - can you try 'set hive.optimize.cp=false' and then re-run the query?

    From: Pradeep Kamath
    Sent: Thursday, July 08, 2010 3:57 PM
    To: hive-user@hadoop.apache.org; heyongqiangict@gmail.com
    Subject: RE: Complex types, lateral view and RCFile

    Any pointers?

    Thanks,
    Pradeep

    ________________________________
    From: Pradeep Kamath
    Sent: Wednesday, July 07, 2010 4:47 PM
    To: hive-user@hadoop.apache.org
    Subject: Complex types, lateral view and RCFile

    Hi,
    I have data with complex types (map, struct, array of maps) stored as a text file. I am able to successfully create an external table based on this data and further build a lateral view on it:

    hive -e 'select rownum, bag_item from complex_text LATERAL VIEW explode(bagofmap) explodedTable AS bag_item ;'
    1 {"k1":"v1","k2":"v2"}
    1 {"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}
    2 {"a1":"b1","a2":"b2"}
    2 {"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}

    Here is how I created this table:
    CREATE external TABLE if not exists complex_text (
    mymap map<string, string>,
    mytuple struct<num:int,str:string,dbl:double>,
    bagofmap array<map<string,string>>,
    rownum int
    )
    row format DELIMITED FIELDS TERMINATED BY '\001' COLLECTION ITEMS TERMINATED BY '\002'
    MAP KEYS TERMINATED BY '\003'

    stored as textfile
    location '/user/pradeepk/complex_text';

    The data contents are (^C stands for ctrl-C .i.e '\003'):
    mymapk1^Cmymapv1^Bmymapk2^Cmymapv2^A1^Bhello^B2.5^Ak1^Dv1^Ck2^Dv2^Bk3^Dv3^Ck4^Dv4^Ck5^Dv5^Ck6^Dv6^A1
    mymapk3^Cmymapv3^Bmymapk4^Cmymapv4^A2^Bbye^B3.5^Aa1^Db1^Ca2^Db2^Ba3^Db3^Ca4^Db4^Ca5^Db5^Ca6^Db6^A2

    Now I created a table using RCFile for storage based on the above table as follows:
    create table complex_rcfile
    stored as RCFile
    location '/user/pradeepk/complex_rcfile'
    as select
    mymap,
    mytuple,
    bagofmap,
    rownum
    from
    complex_text;

    The same query against this table gives incorrect results (nulls for the rownum column):
    hive -e 'select rownum, bag_item from complex_rcfile LATERAL VIEW explode(bagofmap) explodedTable AS bag_item ;'
    NULL {"a1":"b1","a2":"b2"}
    NULL {"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}
    NULL {"k1":"v1","k2":"v2"}
    NULL {"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}

    I have a feeling the delimiters are not being correctly interpreted in RCFile format. Strangely a non lateral view query works fine:
    hive -e 'select rownum, bagofmap from complex_rcfile;'
    2 [{"a1":"b1","a2":"b2"},{"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}]
    1 [{"k1":"v1","k2":"v2"},{"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}]

    Any pointers?

    Thanks,
    Pradeep
  • Pradeep Kamath at Jul 9, 2010 at 12:26 am
    Yes indeed! - it did work fine with that setting - I wonder why it shows
    with RCFile and not with text though?



    ________________________________

    From: Paul Yang
    Sent: Thursday, July 08, 2010 4:14 PM
    To: hive-user@hadoop.apache.org; heyongqiangict@gmail.com
    Subject: RE: Complex types, lateral view and RCFile



    This seems like an issue with the column pruner - can you try 'set
    hive.optimize.cp=false' and then re-run the query?



    From: Pradeep Kamath
    Sent: Thursday, July 08, 2010 3:57 PM
    To: hive-user@hadoop.apache.org; heyongqiangict@gmail.com
    Subject: RE: Complex types, lateral view and RCFile



    Any pointers?



    Thanks,

    Pradeep



    ________________________________

    From: Pradeep Kamath
    Sent: Wednesday, July 07, 2010 4:47 PM
    To: hive-user@hadoop.apache.org
    Subject: Complex types, lateral view and RCFile



    Hi,

    I have data with complex types (map, struct, array of maps) stored as
    a text file. I am able to successfully create an external table based on
    this data and further build a lateral view on it:



    hive -e 'select rownum, bag_item from complex_text LATERAL VIEW
    explode(bagofmap) explodedTable AS bag_item ;'

    1 {"k1":"v1","k2":"v2"}

    1 {"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}

    2 {"a1":"b1","a2":"b2"}

    2 {"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}



    Here is how I created this table:

    CREATE external TABLE if not exists complex_text (

    mymap map<string, string>,

    mytuple struct<num:int,str:string,dbl:double>,

    bagofmap array<map<string,string>>,

    rownum int

    )

    row format DELIMITED FIELDS TERMINATED BY '\001' COLLECTION ITEMS
    TERMINATED BY '\002'

    MAP KEYS TERMINATED BY '\003'



    stored as textfile

    location '/user/pradeepk/complex_text';



    The data contents are (^C stands for ctrl-C .i.e '\003'):

    mymapk1^Cmymapv1^Bmymapk2^Cmymapv2^A1^Bhello^B2.5^Ak1^Dv1^Ck2^Dv2^Bk3^Dv
    3^Ck4^Dv4^Ck5^Dv5^Ck6^Dv6^A1

    mymapk3^Cmymapv3^Bmymapk4^Cmymapv4^A2^Bbye^B3.5^Aa1^Db1^Ca2^Db2^Ba3^Db3^
    Ca4^Db4^Ca5^Db5^Ca6^Db6^A2



    Now I created a table using RCFile for storage based on the above table
    as follows:

    create table complex_rcfile

    stored as RCFile

    location '/user/pradeepk/complex_rcfile'

    as select

    mymap,

    mytuple,

    bagofmap,

    rownum

    from

    complex_text;



    The same query against this table gives incorrect results (nulls for the
    rownum column):

    hive -e 'select rownum, bag_item from complex_rcfile LATERAL VIEW
    explode(bagofmap) explodedTable AS bag_item ;'

    NULL {"a1":"b1","a2":"b2"}

    NULL {"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}

    NULL {"k1":"v1","k2":"v2"}

    NULL {"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}



    I have a feeling the delimiters are not being correctly interpreted in
    RCFile format. Strangely a non lateral view query works fine:

    hive -e 'select rownum, bagofmap from complex_rcfile;'

    2
    [{"a1":"b1","a2":"b2"},{"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}]

    1
    [{"k1":"v1","k2":"v2"},{"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}]



    Any pointers?



    Thanks,

    Pradeep
  • Yongqiang he at Jul 9, 2010 at 2:59 am
    forgot hive-user

    ---------- Forwarded message ----------
    From: yongqiang he <heyongqiangict@gmail.com>
    Date: Thu, Jul 8, 2010 at 7:53 PM
    Subject: Re: Complex types, lateral view and RCFile
    To: Pradeep Kamath <pradeepk@yahoo-inc.com>


    The reason is that RCFile use the column prunning results more
    aggressively (to determine which columns needed to be read). For other
    formats, all columns will be needed.

    Paul, i think you have a jira for this issue?
    On Thu, Jul 8, 2010 at 5:23 PM, Pradeep Kamath wrote:
    Yes indeed! – it did work fine with that setting – I wonder why it shows
    with RCFile and not with text though?



    ________________________________

    From: Paul Yang
    Sent: Thursday, July 08, 2010 4:14 PM

    To: hive-user@hadoop.apache.org; heyongqiangict@gmail.com
    Subject: RE: Complex types, lateral view and RCFile



    This seems like an issue with the column pruner – can you try ‘set
    hive.optimize.cp=false’ and then re-run the query?



    From: Pradeep Kamath
    Sent: Thursday, July 08, 2010 3:57 PM
    To: hive-user@hadoop.apache.org; heyongqiangict@gmail.com
    Subject: RE: Complex types, lateral view and RCFile



    Any pointers?



    Thanks,

    Pradeep



    ________________________________

    From: Pradeep Kamath
    Sent: Wednesday, July 07, 2010 4:47 PM
    To: hive-user@hadoop.apache.org
    Subject: Complex types, lateral view and RCFile



    Hi,

    I have data with complex types (map, struct, array of maps) stored as a
    text file. I am able to successfully create an external table based on this
    data and further build a lateral view on it:



    hive -e 'select rownum, bag_item from complex_text LATERAL VIEW
    explode(bagofmap) explodedTable AS bag_item ;'

    1       {"k1":"v1","k2":"v2"}

    1       {"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}

    2       {"a1":"b1","a2":"b2"}

    2       {"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}



    Here is how I created this table:

    CREATE external TABLE if not exists complex_text (

    mymap map<string, string>,

    mytuple struct<num:int,str:string,dbl:double>,

    bagofmap array<map<string,string>>,

    rownum int

    )

    row format DELIMITED FIELDS TERMINATED BY  '\001' COLLECTION ITEMS
    TERMINATED BY '\002'

    MAP KEYS TERMINATED BY '\003'



    stored as textfile

    location '/user/pradeepk/complex_text';



    The data contents are (^C stands for ctrl-C .i.e ‘\003’):

    mymapk1^Cmymapv1^Bmymapk2^Cmymapv2^A1^Bhello^B2.5^Ak1^Dv1^Ck2^Dv2^Bk3^Dv3^Ck4^Dv4^Ck5^Dv5^Ck6^Dv6^A1

    mymapk3^Cmymapv3^Bmymapk4^Cmymapv4^A2^Bbye^B3.5^Aa1^Db1^Ca2^Db2^Ba3^Db3^Ca4^Db4^Ca5^Db5^Ca6^Db6^A2



    Now I created a table using RCFile for storage based on the above table as
    follows:

    create table complex_rcfile

    stored as RCFile

    location '/user/pradeepk/complex_rcfile'

    as select

    mymap,

    mytuple,

    bagofmap,

    rownum

    from

    complex_text;



    The same query against this table gives incorrect results (nulls for the
    rownum column):

    hive -e 'select rownum, bag_item from complex_rcfile LATERAL VIEW
    explode(bagofmap) explodedTable AS bag_item ;'

    NULL    {"a1":"b1","a2":"b2"}

    NULL    {"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}

    NULL    {"k1":"v1","k2":"v2"}

    NULL    {"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}



    I have a feeling the delimiters are not being correctly interpreted in
    RCFile format. Strangely a non lateral view query works fine:

    hive -e 'select rownum, bagofmap from complex_rcfile;’

    2       [{"a1":"b1","a2":"b2"},{"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}]

    1       [{"k1":"v1","k2":"v2"},{"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}]



    Any pointers?



    Thanks,

    Pradeep
  • Yongqiang he at Jul 9, 2010 at 3:41 am
    It seem hive-1418 did not solve the problem completely. will open a
    jira for this.
    On Thu, Jul 8, 2010 at 5:23 PM, Pradeep Kamath wrote:
    Yes indeed! – it did work fine with that setting – I wonder why it shows
    with RCFile and not with text though?



    ________________________________

    From: Paul Yang
    Sent: Thursday, July 08, 2010 4:14 PM

    To: hive-user@hadoop.apache.org; heyongqiangict@gmail.com
    Subject: RE: Complex types, lateral view and RCFile



    This seems like an issue with the column pruner – can you try ‘set
    hive.optimize.cp=false’ and then re-run the query?



    From: Pradeep Kamath
    Sent: Thursday, July 08, 2010 3:57 PM
    To: hive-user@hadoop.apache.org; heyongqiangict@gmail.com
    Subject: RE: Complex types, lateral view and RCFile



    Any pointers?



    Thanks,

    Pradeep



    ________________________________

    From: Pradeep Kamath
    Sent: Wednesday, July 07, 2010 4:47 PM
    To: hive-user@hadoop.apache.org
    Subject: Complex types, lateral view and RCFile



    Hi,

    I have data with complex types (map, struct, array of maps) stored as a
    text file. I am able to successfully create an external table based on this
    data and further build a lateral view on it:



    hive -e 'select rownum, bag_item from complex_text LATERAL VIEW
    explode(bagofmap) explodedTable AS bag_item ;'

    1       {"k1":"v1","k2":"v2"}

    1       {"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}

    2       {"a1":"b1","a2":"b2"}

    2       {"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}



    Here is how I created this table:

    CREATE external TABLE if not exists complex_text (

    mymap map<string, string>,

    mytuple struct<num:int,str:string,dbl:double>,

    bagofmap array<map<string,string>>,

    rownum int

    )

    row format DELIMITED FIELDS TERMINATED BY  '\001' COLLECTION ITEMS
    TERMINATED BY '\002'

    MAP KEYS TERMINATED BY '\003'



    stored as textfile

    location '/user/pradeepk/complex_text';



    The data contents are (^C stands for ctrl-C .i.e ‘\003’):

    mymapk1^Cmymapv1^Bmymapk2^Cmymapv2^A1^Bhello^B2.5^Ak1^Dv1^Ck2^Dv2^Bk3^Dv3^Ck4^Dv4^Ck5^Dv5^Ck6^Dv6^A1

    mymapk3^Cmymapv3^Bmymapk4^Cmymapv4^A2^Bbye^B3.5^Aa1^Db1^Ca2^Db2^Ba3^Db3^Ca4^Db4^Ca5^Db5^Ca6^Db6^A2



    Now I created a table using RCFile for storage based on the above table as
    follows:

    create table complex_rcfile

    stored as RCFile

    location '/user/pradeepk/complex_rcfile'

    as select

    mymap,

    mytuple,

    bagofmap,

    rownum

    from

    complex_text;



    The same query against this table gives incorrect results (nulls for the
    rownum column):

    hive -e 'select rownum, bag_item from complex_rcfile LATERAL VIEW
    explode(bagofmap) explodedTable AS bag_item ;'

    NULL    {"a1":"b1","a2":"b2"}

    NULL    {"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}

    NULL    {"k1":"v1","k2":"v2"}

    NULL    {"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}



    I have a feeling the delimiters are not being correctly interpreted in
    RCFile format. Strangely a non lateral view query works fine:

    hive -e 'select rownum, bagofmap from complex_rcfile;’

    2       [{"a1":"b1","a2":"b2"},{"a3":"b3","a4":"b4","a5":"b5","a6":"b6"}]

    1       [{"k1":"v1","k2":"v2"},{"k3":"v3","k4":"v4","k5":"v5","k6":"v6"}]



    Any pointers?



    Thanks,

    Pradeep

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupuser @
categorieshive, hadoop
postedJul 7, '10 at 11:49p
activeJul 9, '10 at 3:41a
posts6
users3
websitehive.apache.org

People

Translate

site design / logo © 2021 Grokbase