Grokbase Groups Pig user January 2011
Alex,

It's a hack (sort of), but here's how I always do it, since parsing JSON
in Java will put you in an insane asylum:

Write a map-only Wukong script that parses the JSON the way you want it.
See the example here:

http://thedatachef.blogspot.com/2011/01/processing-json-records-with-hadoop-and.html

Then use the STREAM operator to stream your raw records (load them as
chararrays first) through your Wukong script. It's not perfect, but it
gets the job done.
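A minimal sketch of that approach, using plain Ruby in place of a full Wukong script (the field names are hypothetical, substitute your own):

```ruby
require 'json'

# Map-only streaming sketch in the spirit of a Wukong script: read raw
# JSON records on stdin, emit tab-separated fields for Pig to pick up.
def map_record(line)
  record = JSON.parse(line)
  [record['field1'], record['field2']].join("\t")
rescue JSON::ParserError
  nil # skip malformed records rather than killing the whole job
end

if __FILE__ == $PROGRAM_NAME
  STDIN.each_line do |line|
    out = map_record(line)
    puts out if out
  end
end
```

On the Pig side, the wiring is then roughly: raw = LOAD 'data.json' AS (line:chararray); parsed = STREAM raw THROUGH `ruby json_mapper.rb`; (the script name here is made up).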

--jacob
@thedatachef

On Sat, 2011-01-29 at 12:12 +0000, Alex McLintock wrote:
I wonder if discussion of the Piggybank and other User Defined Functions is
best done here (since it is *using* Pig) or on the development list (because
it is enhancing Pig).

I'm trying to load some JSON into Pig using the PigJsonLoader.java UDF that
Kim Vogt posted back in September. (It isn't in Piggybank, AFAICS.)
https://gist.github.com/601331


The class works for me - mostly....


This works when the JSON is just a single level:

{"field1": "value1", "field2": "value2", "field3": "value3"}

But it doesn't seem to work when the JSON is nested, e.g. with a nested
object under a key:

{"field1": "value1", "field2": "value2", "field7": {"field4": "value4",
"field5": "value5", "field6": "value6"}, "field3": "value3"}

Has anyone got this working? I can't see how the existing code deals with
nesting: parseStringToTuple only creates a single Map, and there is no
recursion that I can see.
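For what it's worth, the nesting itself is the easy part once the record is handed to a scripting language, which is part of why streaming through a script (as suggested above) sidesteps the single-Map limitation. A sketch, with hypothetical key names:

```ruby
require 'json'

# Ruby's JSON parser handles arbitrary nesting out of the box: a nested
# object simply becomes a Hash inside the outer Hash, so no hand-written
# recursion is needed to get at inner fields.
record = JSON.parse('{"field1": "value1", "nested": {"field4": "value4"}}')
inner = record['nested']['field4'] # plain nested lookup
```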



Any suggestions?

Discussion Overview
group: user@
categories: pig, hadoop
posted: Jan 29, '11 at 12:13p
active: Jan 30, '11 at 10:24p
posts: 5
users: 3
website: pig.apache.org
