Grokbase Groups Pig user August 2010
Hi all,

I'm pretty new to Pig and Hadoop, so excuse me if this is trivial, but I
couldn't find anyone able to help me.
I'm trying to get Pig to read data from a Cassandra cluster, which I thought
would be trivial since Cassandra already provides the CassandraStorage class
[1]. The problem is that when I try to execute a simple script like this:

register /path/to/pig-0.7.0-core.jar;
register /path/to/libthrift-r917130.jar;
register /path/to/cassandra_loadfunc.jar;
rows = LOAD 'cassandra://Keyspace1/Standard1' USING org.apache.cassandra.hadoop.pig.CassandraStorage();
cols = FOREACH rows GENERATE flatten($1);
colnames = FOREACH cols GENERATE $0;
namegroups = GROUP colnames BY $0;
namecounts = FOREACH namegroups GENERATE COUNT($1), group;
orderednames = ORDER namecounts BY $0;
topnames = LIMIT orderednames 50;
dump topnames;

I just end up with a NoClassDefFoundError:

ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias topnames
    at org.apache.pig.PigServer.openIterator(PigServer.java:521)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:544)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
    at org.apache.pig.Main.main(Main.java:391)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias topnames
    at org.apache.pig.PigServer.store(PigServer.java:577)
    at org.apache.pig.PigServer.openIterator(PigServer.java:504)
    ... 6 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2117: Unexpected error when launching map reduce job.
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:209)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:308)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:835)
    at org.apache.pig.PigServer.store(PigServer.java:569)
    ... 7 more
Caused by: java.lang.RuntimeException: Could not resolve error that occured when launching map reduce job: java.lang.NoClassDefFoundError: org/apache/thrift/TBase
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$JobControlThreadExceptionHandler.uncaughtException(MapReduceLauncher.java:510)
    at java.lang.Thread.dispatchUncaughtException(Thread.java:1845)

(Sorry for posting the whole error message.)
I cannot think of a reason why. As far as I understand it, Pig takes
the jar files registered in the script, unpacks them, creates the execution
plan for the script itself, bundles everything into a single jar again, and
then submits it to HDFS, from where it is executed in Hadoop, right?
I also checked that the class in question actually is in the libthrift jar,
so what's going wrong?

Regards,
Chris

[1]
http://svn.apache.org/viewvc/cassandra/trunk/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java?revision=984904&view=markup
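
The check mentioned in the post ("the class in question actually is in the libthrift jar") can be scripted. What follows is only a minimal sketch, written in Python for illustration because a jar is an ordinary zip archive; the helper names `jar_contains_class` and `jars_providing` are hypothetical and not part of Pig, Thrift, or Cassandra. It confirms only that the class file is present in a jar on the local machine, not that the jar reaches the classpath of the Hadoop job.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar holds a .class entry for class_name.

    class_name may use dots (org.apache.thrift.TBase) or the slash form
    printed by NoClassDefFoundError (org/apache/thrift/TBase).
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


def jars_providing(jar_paths, class_name):
    """Return the subset of jar_paths that contain class_name."""
    return [path for path in jar_paths if jar_contains_class(path, class_name)]


# Example call, using the placeholder path from the script above:
#   jar_contains_class("/path/to/libthrift-r917130.jar", "org/apache/thrift/TBase")
```

Running `jars_providing` over every jar named in the `register` statements shows which of them, if any, ships the class named in the error.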


  • Bill Graham at Aug 14, 2010 at 12:44 am
    I've seen that exception in other cases where there is an unmet
    dependency on a superclass that is included in a separate (and not
    provided) jar. Check the thrift source to see if that's the case.
  • Christian Decker at Aug 17, 2010 at 3:25 pm
    I got a partial answer on the Cassandra mailing list:
    http://www.mail-archive.com/user@cassandra.apache.org/msg05216.html
    It still crashes, but now on the Hadoop side, so I'm not quite there yet
    but getting somewhere :-)
  • Ifeanyichukwu Osuji at Aug 16, 2010 at 5:34 pm
    Bill Graham wrote:
    I've seen that exception in other cases where there is an unmet
    dependency on a superclass that is included in a separate (and not
    provided) jar. Check the thrift source to see if that's the case.
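
Bill Graham's suggestion in the thread, an unmet dependency on a supertype that lives in a separate jar, can also be checked without reading the Thrift source. The sketch below is an assumption-laden illustration in Python, not something proposed in the thread: it parses the header of a compiled `.class` file, following the class file layout in the JVM specification, and reports the superclass and interfaces the class names. The function names `supertypes` and `supertypes_in_jar` are hypothetical.

```python
import struct
import zipfile

# Payload sizes in bytes of fixed-width constant-pool entries, keyed by tag.
_FIXED_SIZES = {3: 4, 4: 4, 5: 8, 6: 8, 7: 2, 8: 2, 9: 4, 10: 4, 11: 4,
                12: 4, 15: 3, 16: 2, 17: 4, 18: 4, 19: 2, 20: 2}


def supertypes(class_bytes):
    """Return (superclass, [interfaces]) named in a compiled .class file."""
    magic = struct.unpack_from(">I", class_bytes, 0)[0]
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    # Bytes 4-7 hold the minor and major version; the pool count follows.
    count = struct.unpack_from(">H", class_bytes, 8)[0]
    pos = 10
    utf8, class_refs = {}, {}
    index = 1  # constant-pool indices start at 1
    while index < count:
        tag = class_bytes[pos]
        pos += 1
        if tag == 1:  # CONSTANT_Utf8: u2 length, then that many bytes
            length = struct.unpack_from(">H", class_bytes, pos)[0]
            raw = class_bytes[pos + 2:pos + 2 + length]
            utf8[index] = raw.decode("utf-8", "replace")
            pos += 2 + length
        else:
            if tag == 7:  # CONSTANT_Class: index of the Utf8 holding its name
                class_refs[index] = struct.unpack_from(">H", class_bytes, pos)[0]
            pos += _FIXED_SIZES[tag]
            if tag in (5, 6):  # long and double occupy two pool slots
                index += 1
        index += 1
    # After the pool: access_flags, this_class, super_class, interfaces_count.
    _, _, super_idx, n_ifaces = struct.unpack_from(">HHHH", class_bytes, pos)
    iface_idxs = struct.unpack_from(">%dH" % n_ifaces, class_bytes, pos + 8)
    superclass = utf8[class_refs[super_idx]] if super_idx else None
    return superclass, [utf8[class_refs[i]] for i in iface_idxs]


def supertypes_in_jar(jar_path, class_name):
    """Read class_name out of a jar and report its supertypes."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return supertypes(jar.read(entry))
```

Each name returned is in the same slash form as the one in the `NoClassDefFoundError` message, so it can be looked up in the registered jars directly; any supertype that none of them provides is a candidate for the missing dependency.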

Discussion Overview
group: user @
categories: pig, hadoop
posted: Aug 13, '10 at 11:22a
active: Aug 17, '10 at 3:25p
posts: 5
users: 3
website: pig.apache.org
