Grokbase Groups Pig user June 2011
I'm currently trying to write a Pig script to output a feature index. Is
there a built-in function that takes a tuple of unknown length and emits
one output row for each item in the tuple?

Example code:

raw = LOAD 'hbase://mytable' USING HBaseStorage('data:json') AS json:chararray;
genmap = FOREACH raw GENERATE com.mozilla.pig.eval.json.JsonMap(json) AS json_map:map[];
words = FOREACH genmap GENERATE FLATTEN(com.mozilla.pig.eval.text.Normalize(json_map#'text')) AS word_tuple;
dump words;

(the,quick,brown,fox,jumped,over,the,lazy,dog)

I want to get:

the
quick
brown
fox
jumped
over
lazy
dog

Thanks,

-Xavier


  • Thejas M Nair at Jun 2, 2011 at 6:53 pm
    one_word_per_line = FOREACH words GENERATE FLATTEN(TOBAG(*));

    -Thejas

  • Xavier Stevens at Jun 2, 2011 at 6:58 pm
    Awesome! I was trying to FLATTEN(*) without the TOBAG.

    Thanks Thejas.
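
    On why the TOBAG matters: FLATTEN applied to a tuple promotes its
    fields to columns of the same record, whereas FLATTEN applied to a bag
    emits one record per element of the bag. TOBAG(*) wraps every field of
    the current record in a bag, which is why the combination yields one
    word per line. The sketch below assumes the `words` relation from the
    question. The alias `feature_index` is illustrative, and the DISTINCT
    step is an added assumption, based on the desired output listing "the"
    only once.

    ```pig
    -- Assumes `words` is the relation from the original post, where each
    -- record holds the normalized words as top-level fields, e.g.
    -- (the,quick,brown,fox,jumped,over,the,lazy,dog).

    -- TOBAG(*) collects every field of the record into a bag of
    -- single-field tuples; FLATTEN on that bag emits one record per element.
    one_word_per_line = FOREACH words GENERATE FLATTEN(TOBAG(*));

    -- Assumption: a feature index wants each word once, so drop duplicates.
    feature_index = DISTINCT one_word_per_line;

    DUMP feature_index;
    ```

    DISTINCT makes no promise about output order, so if the index needs to
    be sorted, something like ORDER feature_index BY $0 would be one way
    to do it.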
