On 29 January 2011 13:43, Jacob Perkins wrote:
Write a map-only Wukong script that parses the JSON as you want it. See
the example here:
http://thedatachef.blogspot.com/2011/01/processing-json-records-with-hadoop-and.html
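For readers who would rather not follow the link, the general shape of a map-only streaming script can be sketched roughly as below. This is not the Wukong example from the post: it is a plain Python stand-in, and it assumes one JSON object per input line. The field names "id" and "text" are hypothetical placeholders for whatever your records actually contain.

```python
#!/usr/bin/env python
"""Map-only streaming sketch: JSON lines on stdin, tab-separated rows on stdout."""
import json
import sys


def parse_record(line, fields=("id", "text")):
    """Return one tab-separated row for a JSON line, or None if it is unusable.

    The default field names are hypothetical; substitute your own.
    """
    try:
        record = json.loads(line)
    except ValueError:
        # Malformed JSON: skip the record rather than fail the whole task.
        return None
    if not isinstance(record, dict):
        return None
    values = []
    for name in fields:
        value = record.get(name, "")
        # Replace tabs and newlines so each record stays on one output line.
        values.append(str(value).replace("\t", " ").replace("\n", " "))
    return "\t".join(values)


def main(stdin=sys.stdin, stdout=sys.stdout):
    # The process reads many records from stdin, one per line, until EOF.
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        row = parse_record(line)
        if row is not None:
            stdout.write(row + "\n")


if __name__ == "__main__":
    main()
```

A script of this kind can be tried locally with something like `cat records.json | python parse_json.py` (both file names are made up here) before wiring it into a streaming job.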
Thanks very much for helping me out. I haven't heard of Wukong before.
I am a bit concerned, though, about adding Ruby to my tool stack alongside
Pig. It seems like a step too far.
Presumably I would have to distribute Ruby and Wukong across all my job nodes,
in the same way as if I were writing Perl or C++ streaming programs.
With STREAMing, the script is launched once per file, right, not once per
record?
Alex