Grokbase Groups Pig dev August 2010
clean up pig dependencies included in jar files
-----------------------------------------------

Key: PIG-1540
URL: https://issues.apache.org/jira/browse/PIG-1540
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Scott Carey


Pig's output jars are bloated and difficult to include in other projects. Building some jar targets for common use cases would be a big benefit to those consuming Pig. As a bonus, if these were ready for use in a Maven repository, then a follow-on ticket to enable Maven would be trivial.

More information in comments to follow.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Scott Carey (JIRA) at Aug 9, 2010 at 5:39 pm
    [ https://issues.apache.org/jira/browse/PIG-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896636#action_12896636 ]

    Scott Carey commented on PIG-1540:
    ----------------------------------
    From the email thread:
    {quote}
    Scott,

    You are mostly correct. None of those jars are required in
    pig-withouthadoop.jar. I see no reason why junit needs to be
    there. Jackson and Joda are piggybank dependencies and as such should
    be included in piggybank.jar, not in pig-withouthadoop.jar. No idea
    where hamcrest and jshell are getting included from. Looks like they
    should be removed as well. I think even jline can be removed since it's
    only required on the client side, where users will be either using pig.jar
    (which contains everything in any case) or setting up their own
    classpath to use pig-withouthadoop.jar. So, it seems all the jars you
    pointed out can be removed from pig-withouthadoop.jar, and that will lower
    the cost of distributing it to all tasktracker nodes.

    Let's open a JIRA and continue the discussion over there. Scott, would
    you mind opening one?

    Ashutosh

    On Sun, Aug 8, 2010 at 12:41, Scott Carey wrote:
    That ant target is still a problem.

    It may have removed most hadoop jars, but still has useless dependencies. Why is junit in there? Why is jackson in there? I don't see why I need to push JUnit out to the cluster with each submitted job. I don't see where Pig is using JSON from Jackson.
    The latter makes it impossible to use Pig with Avro unless you order the classpath right or build a custom jar.


    Are hamcrest and jshell used?

    I get the jline and joda inclusions, but even then those should probably be external jars on the classpath from a lib directory.

    Setting Pig up with a proper maven POM or ivy configuration would be a big plus to those consuming Pig.
    {quote}
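The classpath-ordering problem mentioned in the quoted email can be checked directly: when two jars bundle the same class, the JVM uses whichever appears first on the classpath. The sketch below is only an illustration and is not part of Pig. It prints the location each class is actually loaded from. The class name `ClassOrigin` and the default class names in `main` are assumptions chosen for the example; pass the classes you care about as arguments.

```java
import java.security.CodeSource;

public class ClassOrigin {
    /** Returns the jar or directory a class is loaded from, or a short note if unknown. */
    static String originOf(String className) {
        try {
            // initialize=false: locate the class without running its static initializers.
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK itself typically report no code source.
            return src == null ? "(bootstrap/JDK)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Default names are examples only (a Jackson 1.x class and a JUnit 3 class).
        String[] names = args.length > 0
                ? args
                : new String[] {"org.codehaus.jackson.JsonFactory", "junit.framework.TestCase"};
        for (String name : names) {
            System.out.println(name + " -> " + originOf(name));
        }
    }
}
```

Running it with the same classpath as the job shows whether a class such as Jackson's is coming from the Pig jar or from a separately supplied jar.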

  • Daniel Dai (JIRA) at Aug 9, 2010 at 5:58 pm
    [ https://issues.apache.org/jira/browse/PIG-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896644#action_12896644 ]

    Daniel Dai commented on PIG-1540:
    ---------------------------------

    Pig also produces build/pig-0.8.0-dev-core.jar, which includes only Pig classes. When Pig is ready to publish to Maven (PIG-1334), we will only publish pig-0.8.0-dev-core.jar.
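One way to see which packages a given jar actually bundles, and so whether it holds only Pig classes or also third-party ones such as junit or Jackson, is to count its `.class` entries by package prefix. The following is a minimal sketch using only the JDK's `java.util.jar` API. The class name `JarPackages`, the helper names, and the prefix depth of 3 are assumptions made for illustration, not anything from the Pig build.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class JarPackages {
    /** Reduces "org/apache/pig/Main.class" to "org.apache.pig", keeping at most depth segments. */
    static String packagePrefix(String entryName, int depth) {
        String[] parts = entryName.split("/");
        int n = Math.min(depth, parts.length - 1); // last part is the class file itself
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) sb.append('.');
            sb.append(parts[i]);
        }
        return sb.length() == 0 ? "(default package)" : sb.toString();
    }

    /** Counts the .class entries in the jar, grouped by package prefix. */
    static Map<String, Integer> countClasses(String jarPath, int depth) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    counts.merge(packagePrefix(name, depth), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java JarPackages <path-to-jar>");
            return;
        }
        // Prints one line per package prefix: class count, then prefix.
        countClasses(args[0], 3).forEach((pkg, n) -> System.out.println(n + "\t" + pkg));
    }
}
```

For a jar that contains only Pig classes, every prefix printed should start with `org.apache.pig`; any other prefix points at a bundled dependency.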
  • Scott Carey (JIRA) at Aug 9, 2010 at 6:23 pm
    [ https://issues.apache.org/jira/browse/PIG-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896653#action_12896653 ]

    Scott Carey commented on PIG-1540:
    ----------------------------------

    Here are some common use cases for pig's jar files:


    Build and test pig as a developer, stand-alone: pig.jar
    Use pig for pig development and debugging: pig.jar
    Run pig scripts in production, trimmed down dependencies: pig-withouthadoop.jar
    * Perhaps this should be trimmed down somewhat, with just pig, piggybank, and other 'basics' that would commonly need to be used when running a pig script.

    Include pig in a project (for example, a custom LoadFunc project): ????
    * Here is where a maven-compatible jar, javadoc-jar, and source-jar would be a blessing. PIG-1334 should address the javadoc and source jars as well.

