Hi,
I have a headache using UDFs with many dependencies: adding them one by
one with the register command is painful and does not scale. I can build a
self-contained jar for a Hadoop job using Maven (a jar with a lib directory
containing all the jars to be used for class lookup), but that doesn't seem
to work for Pig. Pig just treats that jar as a regular jar and tries to
find classes directly inside it instead of inside the embedded jars.
Is there a way to make Pig do it the Hadoop way and look into the
self-contained big jar for class loading?
Thanks!

--
Regards,

Yong-gang Cao
Seattle,WA,98104
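
The behaviour described in the question matches how a plain JVM class loader treats a jar: entries are looked up directly in the archive, and a jar stored under lib/ is just another entry. The sketch below illustrates this with a standard URLClassLoader; it is not Pig's or Hadoop's actual code, and the class name NestedJarLookup and the entry names are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class NestedJarLookup {

    /** Returns the bytes of a jar that holds exactly one entry. */
    static byte[] jarBytes(String entryName, byte[] content) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(buffer)) {
            jar.putNextEntry(new JarEntry(entryName));
            jar.write(content);
            jar.closeEntry();
        }
        return buffer.toByteArray();
    }

    /** True if a plain URLClassLoader over jarFile can find the named entry. */
    static boolean visible(Path jarFile, String entryName) throws IOException {
        // A null parent keeps the application class path out of the lookup.
        try (URLClassLoader loader =
                 new URLClassLoader(new URL[] {jarFile.toUri().toURL()}, null)) {
            return loader.findResource(entryName) != null;
        }
    }

    /**
     * Builds an outer jar containing lib/inner.jar and reports
     * { outer sees lib/inner.jar, outer sees an entry stored inside inner.jar }.
     */
    static boolean[] probe() {
        try {
            byte[] inner = jarBytes("udf/marker.txt",
                                    "hello".getBytes(StandardCharsets.UTF_8));
            Path outer = Files.createTempFile("self-contained", ".jar");
            try {
                Files.write(outer, jarBytes("lib/inner.jar", inner));
                return new boolean[] {
                    visible(outer, "lib/inner.jar"),
                    visible(outer, "udf/marker.txt")
                };
            } finally {
                Files.deleteIfExists(outer);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] seen = probe();
        System.out.println("lib/inner.jar visible:  " + seen[0]);
        System.out.println("udf/marker.txt visible: " + seen[1]);
    }
}
```

The nested jar itself is visible as an entry, but nothing inside it is, which is why a jar-with-jars only works when something unpacks it and puts the lib/ jars on the class path.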


  • Kaluskar, Sanjay at Oct 22, 2010 at 12:54 am
    I wrestled with this issue too, and tried a few things, including
    creating a single top-level jar (one containing the jars, and one
    containing their expanded files). As you found out, the jar-with-jars
    approach doesn't work. The jar-with-expanded-jars approach can work if
    the dependencies you are packaging have no conflicting file names
    (including resources). It didn't work for me. The solution I have
    isn't very nice, but it works: a top-level jar that lists all the
    dependencies in its manifest (the Class-Path attribute). The Maven
    assembly plugin can automate this, which makes it extensible and less
    error-prone. Unfortunately, Pig will not add all the dependencies to
    the class path, so you will have to add them to the class path
    yourself by directly editing mapred-site.xml (using the distributed
    cache).

    -sanjay
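
To make the manifest approach above concrete, here is a minimal sketch of the Class-Path attribute built with java.util.jar. It is not the poster's build: the class name ClassPathManifest and the jar names are hypothetical. Class-Path entries are space-separated relative URLs resolved against the location of the jar that carries the manifest, so they point at jars sitting next to it, not at jars nested inside it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ClassPathManifest {

    /** Builds a manifest whose Class-Path lists the given jars, space-separated. */
    static Manifest withClassPath(List<String> dependencyJars) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version identifies the manifest format and is expected by jar tooling.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.CLASS_PATH, String.join(" ", dependencyJars));
        return manifest;
    }

    public static void main(String[] args) {
        // Hypothetical dependency names, for illustration only.
        Manifest manifest = withClassPath(Arrays.asList("lib/dep-a.jar", "lib/dep-b.jar"));
        System.out.println(manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH));
    }
}
```

A manifest built this way can be passed to the JarOutputStream constructor that takes a Manifest to produce the thin top-level jar.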

    -----Original Message-----
    From: Yong-gang Cao
    Sent: Friday, October 22, 2010 5:39 AM
    To: pig-user@hadoop.apache.org
    Subject: is it possible to use self-contained jar in pig?

  • Dave Wellman at Oct 22, 2010 at 3:28 pm
    I have had some success creating and registering one single jar with dependencies. Downside: the jar can be quite big. If you use Maven to build your jar files, just add the following to the plugins section of your pom.xml:


    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
      <executions>
        <execution>
          <id>make-assembly</id>
          <phase>package</phase>
          <goals>
            <goal>attached</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
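
The jar-with-dependencies descriptor merges the expanded contents of every dependency into one jar, which is where the conflicting file names mentioned earlier in the thread can bite. As a rough pre-check, the sketch below lists entry names that occur in more than one jar. It is an illustration only and not part of Maven or Pig; the class name JarConflicts is made up, and entries such as META-INF/MANIFEST.MF will normally show up in every jar and can be ignored.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class JarConflicts {

    /** Maps each file entry name found in more than one jar to the jars that contain it. */
    static Map<String, List<Path>> conflicts(List<Path> jars) {
        Map<String, List<Path>> owners = new TreeMap<>();
        for (Path jar : jars) {
            try (JarFile file = new JarFile(jar.toFile())) {
                Enumeration<JarEntry> entries = file.entries();
                while (entries.hasMoreElements()) {
                    JarEntry entry = entries.nextElement();
                    // Directories are shared by design, so only files count.
                    if (!entry.isDirectory()) {
                        owners.computeIfAbsent(entry.getName(), k -> new ArrayList<>())
                              .add(jar);
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        // Keep only the names that appear in at least two jars.
        owners.values().removeIf(paths -> paths.size() < 2);
        return owners;
    }

    public static void main(String[] args) {
        // Usage: java JarConflicts dep-a.jar dep-b.jar ...
        List<Path> jars = new ArrayList<>();
        for (String arg : args) {
            jars.add(Paths.get(arg));
        }
        conflicts(jars).forEach((name, paths) -> System.out.println(name + " -> " + paths));
    }
}
```

Running it over the dependency jars before building the assembly shows which classes or resources would overwrite each other in the merged jar.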


