Grokbase Groups Pig user July 2010
It seems the Pig 0.7.0 JAR contains Jetty classes. It's causing some
classloader problems for a webapp of mine that happens to include the
Pig JAR. Is there some reason why this has to be this way? If not they
should probably be removed.


-Xavier
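
One way to confirm which third-party packages are bundled into a JAR is to count its class files by package prefix. The sketch below is a minimal illustration and not part of the original thread; the function name `bundled_packages` and the `depth` parameter are hypothetical. It assumes the Jetty classes in question live under the `org.mortbay` prefix, which was the package used by Jetty releases of that period.

```python
import zipfile
from collections import Counter


def bundled_packages(jar_path, depth=2):
    """Count .class entries in a JAR, grouped by package prefix.

    `depth` is the number of leading package components to keep,
    so depth=2 groups org/mortbay/jetty/Server.class under "org.mortbay".
    """
    counts = Counter()
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not name.endswith(".class"):
                continue  # skip manifests, resources, directory entries
            parts = name.split("/")[:-1]  # drop the class file name itself
            prefix = ".".join(parts[:depth]) if parts else "(default package)"
            counts[prefix] += 1
    return counts
```

Calling `bundled_packages("pig.jar")` (the path is a placeholder) and looking for an `org.mortbay` key would show whether Jetty classes are present and roughly how many.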


  • Scott Carey at Jul 31, 2010 at 4:30 pm
    It has about 10x the jars necessary in it, including Hadoop and all its dependencies. I had to build a sanitized Pig jar; the build.xml has some targets with reduced output.

    ----- Reply message -----
    From: "Xavier Stevens" <xstevens@mozilla.com>
    Date: Fri, Jul 30, 2010 9:30 am
    Subject: Removing Jetty classes from Pig JAR
    To: "pig-user@hadoop.apache.org" <pig-user@hadoop.apache.org>

    It seems the Pig 0.7.0 JAR contains Jetty classes. It's causing some
    classloader problems for a webapp of mine that happens to include the
    Pig JAR. Is there some reason why this has to be this way? If not they
    should probably be removed.


    -Xavier
  • Ashutosh Chauhan at Jul 31, 2010 at 8:21 pm
    Xavier,

    There is an ant target, pig-withouthadoop, which generates
    pig-withouthadoop.jar. It contains the minimal classes needed to run Pig
    and has none of the dependencies in it. It's 5.4M compared to 13M for
    pig.jar. You may want to try that.

    The default target builds pig.jar since we don't want our new users to
    deal with classpath issues when they are just starting off, and thus we
    build a self-contained jar for them.

    Ashutosh
  • Xavier Stevens at Aug 2, 2010 at 3:33 pm
    That makes sense. Thanks Ashutosh!

  • Scott Carey at Aug 8, 2010 at 7:41 pm
    That ant target is still a problem.

    It may have removed most Hadoop jars, but it still has useless dependencies. Why is JUnit in there? Why is Jackson in there? I don't see why I need to push JUnit out to the cluster with each submitted job. I don't see where Pig is using JSON from Jackson.
    The latter makes it impossible to use Pig with Avro unless you order the classpath right or build a custom jar.


    Are hamcrest and jshell used?

    I get the jline and joda inclusions, but even then those should probably be external jars on the classpath from a lib directory.

    Setting Pig up with a proper maven POM or ivy configuration would be a big plus to those consuming Pig.

  • Ashutosh Chauhan at Aug 9, 2010 at 12:50 am
    Scott,

    You are mostly correct. None of those jars needs to be in
    pig-withouthadoop.jar. I see no reason why junit needs to be
    there. Jackson and Joda are piggybank dependencies and as such should
    be included in piggybank.jar, not in pig-withouthadoop.jar. No idea
    where hamcrest and jshell are getting included from. Looks like they
    should be removed as well. I think even jline can be removed since it's
    only required on the client side, where users will be either using pig.jar
    (which contains everything in any case) or setting up their own
    classpath to use pig-withouthadoop.jar. So it seems all the jars you
    pointed out can be removed from pig-withouthadoop, and that will lower
    the cost of distributing it to all tasktracker nodes.

    Let's open a JIRA and continue the discussion over there. Scott, would
    you mind opening one?

    Ashutosh
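
The thread mentions building a sanitized or custom JAR as a workaround. The build targets in Pig's build.xml are the route the posters actually describe. As a rough alternative for an already-built JAR, the sketch below copies a JAR while leaving out entries under chosen path prefixes. It is an illustration under stated assumptions, not the project's method: the function name `strip_packages` is hypothetical, and the prefixes in the usage note are examples only.

```python
import zipfile


def strip_packages(src_jar, dst_jar, excluded_prefixes):
    """Copy src_jar to dst_jar, skipping entries under any excluded prefix.

    Prefixes are JAR-internal paths such as "org/mortbay/".
    Returns the number of entries that were left out.
    """
    prefixes = tuple(excluded_prefixes)
    skipped = 0
    with zipfile.ZipFile(src_jar) as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename.startswith(prefixes):
                skipped += 1
                continue
            # Reuse the original ZipInfo so names and timestamps carry over.
            dst.writestr(info, src.read(info.filename))
    return skipped
```

For example, `strip_packages("pig.jar", "pig-sanitized.jar", ["org/mortbay/", "junit/", "org/junit/"])` would drop Jetty and JUnit classes, assuming those are the prefixes they were bundled under. Any stripped classes that are still needed at runtime then have to be supplied separately on the classpath.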

Discussion Overview
group: user @
categories: pig, hadoop
posted: Jul 30, '10 at 4:30p
active: Aug 9, '10 at 12:50a
posts: 6
users: 3
website: pig.apache.org
