Grokbase Groups Pig user January 2011
I have written a generalized LoadFunc for nested JSON, but I have hit a wall.

I'm not sure how to access the nested data once it is in Pig, for something like the
following:

Original JSON:

{"body":[{"token":"foo2","hash":"-33333333333"},{"token":"bar2","hash":"-22222222222"}],"pmessgid":"559830","subject":[{"token":"fooo","hash":"111111"},{"token":"bar","hash":"999999"}],"userid":"77274","messageid":"559837","threadid":"104997"}


Dump of tuple.toString() to System.out from my LoadFunc (the tuple is generated
by a custom load func: a recursive JSON-walking mechanism that produces nested
maps and tuples):

([body#([token#foo2,hash#-33333333333],[token#bar2,hash#-22222222222]),subject#([token#fooo,hash#111111],[token#bar,hash#999999]),userid#77274,messageid#559837,threadid#104997,pmessgid#559830])


So far so good, I can produce the right data structure in code, and when I
dump it via the toString() it looks good!
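
For readers following along, the kind of recursive walk described above might look roughly like the sketch below. It is not the actual JSONLoader code: the class name `JsonWalkSketch` and its methods are hypothetical, plain `java.util` collections stand in for Pig's map and tuple types, and the input is assumed to have already been parsed by some JSON library into nested `Map`/`List` values. The `render` method only imitates the `[key#value]` and `(a,b)` notation seen in the dump.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the loader's recursive walk; not the real JSONLoader.
class JsonWalkSketch {

    // Recursively convert an already-parsed JSON value:
    //   object -> Map<String,Object>
    //   array  -> List<Object> (stand-in for a Pig tuple)
    //   scalar -> String
    static Object walk(Object node) {
        if (node instanceof Map) {
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                out.put(String.valueOf(e.getKey()), walk(e.getValue()));
            }
            return out;
        }
        if (node instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) node) {
                out.add(walk(item));
            }
            return out;
        }
        return String.valueOf(node);
    }

    // Render in the notation of the dump: maps as [key#value,...], tuples as (a,b,...).
    static String render(Object value) {
        StringBuilder sb = new StringBuilder();
        if (value instanceof Map) {
            sb.append('[');
            boolean first = true;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                if (!first) sb.append(',');
                sb.append(e.getKey()).append('#').append(render(e.getValue()));
                first = false;
            }
            sb.append(']');
        } else if (value instanceof List) {
            sb.append('(');
            boolean first = true;
            for (Object item : (List<?>) value) {
                if (!first) sb.append(',');
                sb.append(render(item));
                first = false;
            }
            sb.append(')');
        } else {
            sb.append(value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> token = new LinkedHashMap<>();
        token.put("token", "foo2");
        token.put("hash", "-33333333333");
        List<Object> body = new ArrayList<>();
        body.add(token);
        Map<String, Object> json = new LinkedHashMap<>();
        json.put("body", body);
        json.put("userid", 77274); // a JSON number; walk() turns it into a String
        System.out.println(render(walk(json)));
    }
}
```

The sketch uses `LinkedHashMap` so the key order is predictable. The dump above lists `pmessgid` last even though it comes earlier in the JSON, which suggests the real loader uses an unordered map.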

**** My problem ->

So here is the schema in the example above:
Map<String,Object>, where Object is either a tuple of Map<String,String>s
OR just a String.
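
To make that mixed-value schema concrete, here is a minimal Java sketch of how such a structure can be inspected on the Java side, where the value type is checked at runtime before it is used. The class `MixedValueSketch` and its helper methods are hypothetical names made up for illustration, and a `java.util.List` stands in for the Pig tuple. It says nothing about how to express the same access in a Pig script.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of Map<String,Object> whose values are either
// a String or a list (tuple) of Map<String,String>.
class MixedValueSketch {

    // Collect the "token" entries under a key whose value is a list of maps.
    // Returns an empty list if the value is a plain String or is missing.
    static List<String> tokens(Map<String, Object> json, String key) {
        List<String> out = new ArrayList<>();
        Object value = json.get(key);
        if (value instanceof List) {
            for (Object item : (List<?>) value) {
                if (item instanceof Map) {
                    Object token = ((Map<?, ?>) item).get("token");
                    if (token != null) out.add(token.toString());
                }
            }
        }
        return out;
    }

    // Return the value as a String only when it really is one, else null.
    static String scalar(Map<String, Object> json, String key) {
        Object value = json.get(key);
        return (value instanceof String) ? (String) value : null;
    }

    // Build one {token, hash} entry like those in the example JSON.
    static Map<String, String> entry(String token, String hash) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("token", token);
        m.put("hash", hash);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Object> json = new LinkedHashMap<>();
        json.put("body", Arrays.asList(
                entry("foo2", "-33333333333"),
                entry("bar2", "-22222222222")));
        json.put("userid", "77274");

        System.out.println(tokens(json, "body"));
        System.out.println(scalar(json, "userid"));
    }
}
```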


In my pig script, I can get this far:
A = LOAD '/jivepoc/jivecommunity/dbsqoop/usermessages-clean-features2' USING
com.proximal.pig.tools.JSONLoader() as (
json: map[]
);

If I don't qualify the map[] above, I can select an item from the map (say
'body'), and DESCRIBE reports:

certain_keys = FOREACH A GENERATE json#'body' AS b;
DESCRIBE certain_keys;
certain_keys: {b: bytearray}

Looks good: it is a bytearray if I don't further define what I have. But now
I'm stuck -> I need to load a much more detailed map[].
The problem is that the map[] (as pointed out above) can contain either a String
or a tuple of Map<String,String>s.

There is no typecasting, right? Am I missing something, or am I stuck?

Thanks
Lance



Additional info:

Code to do this:

Discussion Overview
group: user @
categories: pig, hadoop
posted: Jan 15, '11 at 12:13a
active: Jan 18, '11 at 8:00p
posts: 2
users: 2
website: pig.apache.org

2 users in discussion

Daniel Dai: 1 post
Lance Riedel: 1 post
