We are running Elephant Bird (EB) in production with Pig 0.11 against CDH3.
Hadoop 2 is a different story: a lot needs to change to make that work.
Raghu has a branch with the EB changes:
https://github.com/rangadi/elephant-bird/tree/hadoop-2.0-support

On Thu, Apr 4, 2013 at 6:39 PM, Ruslan Al-Fakikh wrote:

Hi guys,

As for elephant-bird, it seems that it is not compatible with Pig 0.10
(CDH4) :(
I am using this configuration:
pig -version
Apache Pig version 0.10.0-cdh4.1.1 (rexported)
hadoop version
Hadoop 2.0.0-cdh4.1.1
and getting just the same error as Tim explained:
java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected

I am running it with the following commands:
REGISTER elephant-bird-pig-3.0.2.jar;
inputData = LOAD 'sample_simple.json' USING
com.twitter.elephantbird.pig.load.JsonLoader() as (json:map[]);
DUMP inputData;

On Thu, Sep 27, 2012 at 8:48 AM, Dmitriy Ryaboy wrote:

Yep. It's just JsonLoader.
By default it works on top of whatever's returned by TextInputFormat, but
you can override that. As long as the input format returns a string that's
valid JSON, we are cool (so in theory you could write a
TwitterAPIInputFormat or something and get the JSON in Pig, not that I
would recommend that).

D

On Wed, Sep 26, 2012 at 9:34 PM, Russell Jurney <russell.jurney@gmail.com> wrote:
Does that work without lzo?

Russell Jurney http://datasyndrome.com
On Sep 26, 2012, at 9:00 PM, Dmitriy Ryaboy wrote:

Try asking Michael May on GitHub? This seems to be an issue with his loader.
The JsonLoader in ElephantBird should work in this case if you turn on
nested parsing
(https://github.com/kevinweil/elephant-bird/blob/master/pig/src/main/java/com/twitter/elephantbird/pig/load/JsonLoader.java).

D
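
A minimal, untested sketch of the nested-parsing suggestion above, for readers who want something to start from. It assumes the Elephant Bird JsonLoader accepts a '-nestedLoad' option string, that the jar name and input path (both hypothetical here) match your setup, and that any additional jars the loader depends on are registered as well. Leaf values may come back as strings, so cast them as needed.

```pig
-- Hypothetical jar name and input path; adjust to your installation.
REGISTER elephant-bird-pig-3.0.2.jar;

-- Assumption: '-nestedLoad' asks the loader to turn nested JSON objects
-- into Pig maps and JSON arrays into bags instead of leaving them flat.
raw = LOAD 'sample_nested.json'
      USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad')
      AS (json:map[]);

-- Pull out the nested "geo" object, then its "coordinates" array.
geo    = FOREACH raw GENERATE (map[]) json#'geo' AS geo;
coords = FOREACH geo GENERATE geo#'coordinates' AS coordinates;

DUMP coords;
```

The two-step dereference with an explicit (map[]) cast is a deliberate choice: it avoids relying on Pig inferring the type of a value inside an untyped map.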

On Wed, Sep 26, 2012 at 2:31 PM, Deepak Tiwari <dtiwari356@gmail.com> wrote:
My bad. I think I compiled it from
https://github.com/mmay/PigJsonLoader/blob/master/JsonLoader.java a long
time back in my piggybank area. It indeed didn't come with the original jar.
Regards,

Deepak

On Tue, Sep 25, 2012 at 8:14 AM, Bill Graham <billgraham@gmail.com> wrote:
I missed the part about Piggybank, but I'm confused because I don't see
that class in SVN:
http://svn.apache.org/viewvc/pig/branches/branch-0.10/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/
Either way, your error seems to be an issue with parsing the doubles.


On Mon, Sep 24, 2012 at 2:24 PM, Vivek Shrivastava <vivshrivastava@gmail.com> wrote:
Thanks for responding, Bill. However, I am using the JsonLoader that is in
the Piggybank with Pig-0.10.0.

It doesn't need any schema and converts JSON data to a map
(org.apache.pig.piggybank.storage.JsonLoader() as (json:map[])), and I
extract data from there using keys. I have processed a huge amount of data
without any problem, and no schema was required.

Regards,

Vivek

On Mon, Sep 24, 2012 at 2:03 PM, Bill Graham <billgraham@gmail.com> wrote:
This loader only works for data stored using JsonStorage. From the
javadocs:

A loader for data stored using JsonStorage
<http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/builtin/JsonStorage.html>.
This is not a generic JSON loader. It depends on the schema being stored
with the data when conceivably you could write a loader that determines
the schema from the JSON.

Was this data produced via JsonStorage? If not, you'll need to write a
custom loader.

On Mon, Sep 24, 2012 at 12:04 PM, Deepak Tiwari <dtiwari356@gmail.com> wrote:
Hi,

I am trying to parse this data using the Pig loader
org.apache.pig.piggybank.storage.JsonLoader:

{"geo":{"type":"Polygon","coordinates":[[[-91.3061478,-30.2688069],[-91.012471,-60.2688069],[-91.012471,-69.9306357],[-91.3061478,-29.9306357]]]},

I need to extract this array:
[[[-91.3061478,-30.2688069],[-91.012471,-60.2688069],[-91.012471,-69.9306357],[-91.3061478,-29.9306357]]]

I am getting the error below while accessing flatten(geo#'coordinates'). I
think that's a limitation of the parser ("only standard Pig type is
supported"), but I am wondering if someone has a workaround.

"java.lang.RuntimeException: Unexpected data type org.codehaus.jackson.node.DoubleNode found in stream. Note only standard Pig type is supported when you output from UDF/LoadFunc"


Thanks very much,

Deepak


--
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgraham@gmail.com going forward.*


Discussion Overview
group: user @
categories: pig, hadoop
posted: Sep 24, '12 at 7:04p
active: Apr 9, '13 at 10:53a
posts: 11
users: 6
website: pig.apache.org
