I've run into a small issue with my cluster deployed with Parcels. I was
running a package install before and did not run into the issue.

It seems that the default library locations are not linked back ("reverse
linked") from the parcel folder, i.e. the parcel version of Sqoop does not
use the default lib folder.

Example:
The Sqoop documentation indicates that libraries (in my case, the SQL Server
JDBC drivers) should be put in /usr/lib/sqoop/lib/. This works fine with
the package installation, but with the parcel it only works if I put
the JDBC file in the parcel subfolder, in my case:
/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/sqoop/lib

Is this the expected behavior?

Should I put my third-party JARs elsewhere to avoid having to move them
around when I deploy a new parcel?

Thanks!


  • Philip Langdale at Mar 1, 2013 at 7:02 pm
    Hi Philippe,

    Ideally, you would keep your third party libraries in a separate stable
    location and set the classpath to include them, rather than dumping
    them into sqoop/lib, hadoop/lib, hive/lib, etc. I know that this is a very
    common practice but it doesn't work well with parcels, for reasons
    you've already recognised. Even if we set symlinks into /usr/lib, they
    would not help you when switching to a new parcel, as the symlinks
    would change but the files wouldn't move.

    I'm currently looking at the easiest way to use parcels themselves to
    achieve this - ideally you'd bundle your jars up into a parcel and let
    CM take care of managing the classpaths.

    --phil

  • James Hogarth at Mar 4, 2013 at 9:14 am

    > I'm currently looking at the easiest way to use parcels themselves to
    > achieve this - ideally you'd bundle your jars up into a parcel and let
    > CM take care of managing the classpaths.

    Hi Phil,

    I'm just encountering this myself now, post upgrade. I'm guessing
    such functionality would require a new CM release, and that we should
    just manage this manually for now?

    Cheers,

    James
  • Philip Langdale at Mar 4, 2013 at 9:04 pm
    Yeah. So, what are the specific components where you deploy third-party
    libs? Each one has a different way to handle them in a 'parcel
    transparent' way.

    For Sqoop, you would add the connector jar to $SQOOP_CLASSPATH before
    running it, and you could ensure that happens every time by adding the
    declaration to sqoop-env.sh.
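
The Sqoop suggestion above could be sketched as a few lines in sqoop-env.sh. This is a minimal sketch under stated assumptions: the directory /opt/thirdparty/jdbc, the THIRD_PARTY_LIB_DIR variable and the helper name add_jars_to_sqoop_classpath are hypothetical and not part of Sqoop or CDH. Whether a given Sqoop release keeps a preset SQOOP_CLASSPATH is worth verifying on your own cluster.

```shell
# Append every jar in a directory to SQOOP_CLASSPATH, keeping any value
# that is already set. The helper name is an illustrative assumption.
add_jars_to_sqoop_classpath() {
  dir="$1"
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue   # glob matched nothing: skip
    SQOOP_CLASSPATH="${SQOOP_CLASSPATH:+$SQOOP_CLASSPATH:}$jar"
  done
  export SQOOP_CLASSPATH
}

# Hypothetical stable location that lives outside any parcel directory,
# so it survives a parcel upgrade.
add_jars_to_sqoop_classpath "${THIRD_PARTY_LIB_DIR:-/opt/thirdparty/jdbc}"
```

Because the jars stay outside /opt/cloudera/parcels, activating a new parcel should not require copying them again.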

    Roughly speaking, this approach works for all components, but some of
    them have more comprehensive mechanisms, like -libjars for MR or ADD JAR
    for Hive.
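
For MapReduce jobs, the Hadoop generic option for shipping extra jars is -libjars, which takes a comma-separated list rather than a colon-separated classpath. The sketch below builds such a list from a directory. The helper name jars_csv, the path /opt/thirdparty/lib, and the job and driver names in the usage comment are hypothetical, and the option only takes effect when the job's driver parses generic options (for example through ToolRunner).

```shell
# Print every jar in a directory as one comma-separated line, the form
# that -libjars expects. Prints an empty line if no jars are found.
jars_csv() {
  dir="$1"
  csv=""
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue   # glob matched nothing: skip
    csv="${csv:+$csv,}$jar"
  done
  printf '%s\n' "$csv"
}

# Hypothetical usage (myjob.jar and com.example.MyDriver are placeholders):
#   hadoop jar myjob.jar com.example.MyDriver \
#     -libjars "$(jars_csv /opt/thirdparty/lib)" <job arguments>
```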

    My immediate term goal is to document these mechanisms so no one's lost as
    to what to do.

    --phil

  • Philippe Marseille at Mar 5, 2013 at 12:23 pm
    Nice,

    Looking forward to this documentation.

    Thanks.

Discussion Overview
groups: cm-users @
categories: hadoop
posted: Mar 1, '13 at 3:58p
active: Mar 5, '13 at 12:23p
posts: 5
users: 3
website: cloudera.com
irc: #hadoop
