Hi all,

I have a simple schema that I want to store as JSON. So I've written a
simple JsonStorage class but it requires that the tuple's first field is a
map. The problem is in converting a regular tuple into a map:

DESCRIBE thing;
thing: {id: chararray,field1: chararray,field2: chararray}
What the map/JSON should look like:
{ 'id': 'id0', 'foo': 'valueFromField1', 'bar': 'valueFromField2' }

So this should work but seems to be invalid syntax:
jsonStore = FOREACH thing GENERATE
[ 'id'#id, 'foo'#field1, 'bar'#field2 ] AS json:map[];

ERROR 1000: Error during parsing. Encountered " "[" "[ "" at line 150,
column 23.
Was expecting one of:
"flatten" ...
"(" ...
"-" ...
"(" ...
"(" ...
"(" ...
"(" ...
"(" ...

The only way I have this syntax working is if I use only constants in the
map:
jsonStore = FOREACH thing GENERATE
[ 'id'#'const', 'foo'#'const', 'bar'#'const' ] AS json:map[];

Is it possible to do what I'm thinking?

Thanks,

Josh


  • Dmitriy Ryaboy at Nov 26, 2010 at 10:39 pm
    I don't think we've considered building out Maps in Pig this way. You can of
    course run your data through a UDF that would take a tuple whose first
    argument is a list of key names, and invoke it like so:

    jsonStore = FOREACH thing GENERATE
    toMap('id foo bar', *) AS json:map[];
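
A minimal sketch of the key/value pairing such a UDF might do, written in plain Java with no Pig dependencies so that it runs on its own. The class name ToMapSketch and the method signature are hypothetical, not part of Pig. A real UDF would presumably extend Pig's EvalFunc and read the key names and field values from its input Tuple, which is left out here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ToMapSketch {

    /**
     * Pairs whitespace-separated key names with field values, in order.
     * Hypothetical helper: stands in for the body of a toMap UDF.
     */
    public static Map<String, Object> toMap(String keyNames, List<?> values) {
        String[] keys = keyNames.trim().split("\\s+");
        if (keys.length != values.size()) {
            throw new IllegalArgumentException(
                "Expected " + keys.length + " values but got " + values.size());
        }
        // LinkedHashMap keeps the keys in the order they were given.
        Map<String, Object> result = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            result.put(keys[i], values.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Mirrors toMap('id foo bar', *) applied to one row of "thing".
        Map<String, Object> json = toMap(
            "id foo bar",
            Arrays.asList("id0", "valueFromField1", "valueFromField2"));
        System.out.println(json);
        // prints {id=id0, foo=valueFromField1, bar=valueFromField2}
    }
}
```

Passing the key names as a single string keeps the call site short, at the cost of catching a key/field count mismatch only at run time.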

    -D
  • Josh Devins at Nov 27, 2010 at 9:26 am
    Thanks Dmitriy, I just needed a sanity check! I've essentially done
    the same thing you describe, creating a UDF to do the conversion, but
    of course it would be nice not to have to do that. I assume that other
    people (like you and the other Twitter folks) are working with JSON in
    Pig by reading it in as input in the first place and never building
    it in Pig as you go?

    I think building Maps would be a nice language feature so I'll log it
    as an issue.

    Cheers,

    Josh


  • Dmitriy Ryaboy at Nov 27, 2010 at 10:37 pm
    Yep, we mostly deal with JSON only as an input format, and the first thing
    we do is flatten it out.
    Working with maps is cumbersome in Pig due to the casting issues, so I
    prefer to avoid that when possible.
    -D

Discussion Overview
group: user @
categories: pig, hadoop
posted: Nov 25, '10 at 7:53p
active: Nov 27, '10 at 10:37p
posts: 4
users: 2
website: pig.apache.org
