Hi all

I am trying to load the Zebra file generated by BasicTableOutputFormat in
MapReduce code. The code is similar to
org.apache.hadoop.zebra.mapred.TableMapReduceExample, but it throws the
following exception while TableInputFormat splits the data:

Exception in thread "main" java.io.IOException: BasicTable.Reader
constructor failed : Missing Meta File of t_table/CG0/.meta
at org.apache.hadoop.zebra.io.BasicTable$Reader.<init>(BasicTable.java:328)
at org.apache.hadoop.zebra.io.BasicTable$Reader.<init>(BasicTable.java:287)
at org.apache.hadoop.zebra.mapred.TableInputFormat.getSplits(TableInputFormat.java:883)

The directory generated from BasicTableOutputFormat contains the following
files (without .meta)

/table/.btschema
/table/CG0
/table/CG0/.schema
/table/CG0/part-0
/table/CG1
/table/CG1/.schema
/table/CG1/part-0
/table/_temporary
/table/_temporary/CG0
/table/_temporary/CG1

The same error occurs if I store and then load the data through the Pig
interface (missing .meta file).

How can I convert the raw data into Zebra format and then load it in Pig
or an MR program? Any suggestions would be appreciated!

-
Regards
Yuting
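
To make the symptom above concrete: the exception says the reader looks for a .meta file inside a column group directory (CG0), and the listing has none. The following is a minimal sketch of that check, not part of Zebra. The class and method names are hypothetical, and it assumes column group directories are named CG0, CG1, ... directly under the table root, as in the listing. It works on a plain list of path strings, so it needs no Hadoop classes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper (not a Zebra API): reports column group
// directories in a table listing that have no .meta entry.
final class ZebraMetaCheck {
    // Assumption: column groups are named CG<number>, as in the listing.
    private static final Pattern CG_DIR = Pattern.compile("CG\\d+");

    static List<String> columnGroupsMissingMeta(String tableRoot, List<String> listing) {
        Set<String> entries = new HashSet<>(listing);
        List<String> missing = new ArrayList<>();
        for (String path : listing) {
            int slash = path.lastIndexOf('/');
            if (slash < 0) {
                continue; // not a path we can split into parent and name
            }
            String parent = path.substring(0, slash);
            String name = path.substring(slash + 1);
            // Only direct children of the table root count; this skips
            // entries such as /table/_temporary/CG0.
            boolean isColumnGroup = parent.equals(tableRoot) && CG_DIR.matcher(name).matches();
            if (isColumnGroup && !entries.contains(path + "/.meta")) {
                missing.add(path);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // The listing from the post above.
        List<String> listing = List.of(
                "/table/.btschema",
                "/table/CG0",
                "/table/CG0/.schema",
                "/table/CG0/part-0",
                "/table/CG1",
                "/table/CG1/.schema",
                "/table/CG1/part-0",
                "/table/_temporary",
                "/table/_temporary/CG0",
                "/table/_temporary/CG1");
        // Prints [/table/CG0, /table/CG1]
        System.out.println(columnGroupsMissingMeta("/table", listing));
    }
}
```

For the listing in this post, both column groups are reported, which matches the "Missing Meta File" error raised for CG0.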

  • Chao Wang at Jul 6, 2010 at 5:13 am
    Make sure you call "BasicTableOutputFormat.close()" at the end of your
    m/r job. It will create the .meta files for Zebra tables.

    Chao


    -----Original Message-----
    From: Yuting Lin
    Sent: Monday, July 05, 2010 9:03 PM
    To: pig-user@hadoop.apache.org
    Subject: zebra TableInputFormat errors: Missing Meta File .meta

  • Yuting Lin at Jul 6, 2010 at 5:30 am
    Thanks Chao

    It works in the m/r code. Thanks.

    Regards
    Yuting
  • Brian Adams at Jul 6, 2010 at 6:13 pm
    joined = JOIN webordered BY ngram FULL OUTER, smsordered BY ngram;
    These are basically (ngram,count) joined to (ngram,count).
    When dog is in webordered but not in smsordered, I still want to
    put 0 in the count column so I can eventually do a sum column.

    I get something like this.

    dog,500,dog,10000
    cat,500,(nothing),(nothing)
    (nothing),(nothing),mouse,100

    If I wanted to add a sum column like
    FOREACH joined GENERATE $0,$1,$2,$3,($1+$3) AS thesum;

    thesum will be blank for the cat and mouse rows above.

    How do I work around this?

    I was trying conditionals like so:
    FOREACH joined GENERATE $0 AS smsgram, ($0=='null'?(int)0:$1) AS smscount,
        $2 AS webgram, ($2=='null'?(int)0:$3) AS webcount, ($1+$3) AS sumcount;

    Thanks ahead of time guys and gals.


Discussion Overview
group: user @
categories: pig, hadoop
posted: Jul 6, '10 at 4:04a
active: Jul 6, '10 at 6:13p
posts: 4
users: 3
website: pig.apache.org
