Scott Carey commented on PIG-1540:
From the email thread:
You are mostly correct. None of those jars are required to be in
pig-withouthadoop.jar. I see no reason why junit needs to be
there. Jackson and Joda are piggybank dependencies and as such should
be included in piggybank.jar, not in pig-withouthadoop.jar. No idea
where hamcrest and jshell are getting included from. Looks like they
should be removed as well. I think even jline can be removed since it's
only required on the client side, where users will either be using pig.jar
(which contains everything in any case) or setting up their own
classpath to use pig-withouthadoop.jar. So, it seems all the jars you
pointed out can be removed from pig-withouthadoop.jar, and that will lower
the cost of distributing it to all tasktracker nodes.
Let's open a jira and continue the discussion over there. Scott, would
you mind opening one?
On Sun, Aug 8, 2010 at 12:41, Scott Carey wrote:
That ant target is still a problem.
It may have removed most hadoop jars, but it still has useless dependencies. Why is junit in there? Why is jackson in there? I don't see why I need to push junit out to the cluster with each submitted job. I don't see where Pig is using JSON from Jackson.
The latter makes it impossible to use Pig with Avro unless you order the classpath right or build a custom jar.
Are hamcrest and jshell used?
I get the jline and joda inclusions, but even then those should probably be external jars on the classpath from a lib directory.
Setting Pig up with a proper maven POM or ivy configuration would be a big plus to those consuming Pig.
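One way to check which of the libraries questioned above actually end up inside a built jar is to scan its entries by package path. The following is a minimal sketch, not part of the Pig build: the function name, the prefix table, and the jar path in the usage note are all assumptions made for illustration. The prefixes are the usual package roots for these libraries around this time (for example org/codehaus/jackson for Jackson); jshell is left out because its package name is not given in the thread.

```python
import zipfile
from collections import Counter

# Assumed package roots for the libraries questioned in the thread.
# Adjust these to match what the jar actually contains.
SUSPECT_PREFIXES = {
    "junit": ("junit/", "org/junit/"),
    "hamcrest": ("org/hamcrest/",),
    "jackson": ("org/codehaus/jackson/",),
    "joda": ("org/joda/",),
    "jline": ("jline/",),
}


def bundled_libraries(jar_path, prefixes=SUSPECT_PREFIXES):
    """Count .class entries in jar_path under each suspect package root."""
    counts = Counter()
    with zipfile.ZipFile(jar_path) as jar:  # a jar is a zip archive
        for name in jar.namelist():
            if not name.endswith(".class"):
                continue  # skip manifests, resources, directories
            for library, roots in prefixes.items():
                # str.startswith accepts a tuple of alternatives
                if name.startswith(roots):
                    counts[library] += 1
    return dict(counts)
```

Calling bundled_libraries("pig-withouthadoop.jar") (path assumed) returns a dict from library name to class count, listing only the libraries that were found, so an empty dict means none of the suspect packages are bundled.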
clean up pig dependencies included in jar files
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Scott Carey
Pig's output jars are bloated and difficult to include in other projects. Building some jar targets for common use cases would be a big benefit to those consuming Pig. As a bonus, if these were ready for use in a Maven repository, then a follow-on ticket to enable Maven would be trivial.
More information in comments to follow.