Hello,

I installed hadoop-lzo (the fork from toddlipcon) on CDH4. Everything works
great if I use the old mapred API and DeprecatedLzoTextInputFormat. After
changing to the new API (mapreduce) and LzoTextInputFormat, I get this
error:

Exception in thread "main" java.lang.IncompatibleClassChangeError: Found
interface org.apache.hadoop.mapreduce.JobContext, but class was expected
         at
com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:62)
...

Any ideas what's wrong?
regards,
Henning


  • DrScott at Jul 27, 2012 at 9:01 am
Just a small correction: I use this fork:
    https://github.com/cloudera/hadoop-lzo.git

    Still looking for a solution...

    --
  • Joey Echeverria at Jul 27, 2012 at 3:56 pm
    CDH4 changed JobContext to be an interface. You'll have to modify the
    LZO input format to make it compatible with CDH4.
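The reason this surfaces at runtime rather than at compile time: the JVM emits different call instructions for class methods (invokevirtual) and interface methods (invokeinterface), so bytecode compiled against the old JobContext class fails against the new interface with IncompatibleClassChangeError. A minimal sketch of the problem and of one possible workaround, using hypothetical stand-ins rather than the real Hadoop classes (recompiling against the CDH4 jars, as done later in this thread, is the simpler fix):

```java
import java.lang.reflect.Method;

public class JobContextCompatDemo {
    // Hypothetical stand-in for org.apache.hadoop.mapreduce.JobContext:
    // formerly a concrete class, it became an interface in CDH4. Bytecode
    // compiled against one form fails against the other at runtime with
    // IncompatibleClassChangeError, because the compiler bakes in either
    // invokevirtual (class) or invokeinterface (interface).
    interface JobContextLike {
        String getConfiguration();
    }

    public static class TaskContext implements JobContextLike {
        public String getConfiguration() { return "job-conf"; }
    }

    // Reflection defers the class-vs-interface binding to runtime, so a
    // single compiled binary can work against both API shapes. This is the
    // kind of shim cross-version compatibility layers use.
    public static String getConfigurationCompat(Object context) throws Exception {
        Method m = context.getClass().getMethod("getConfiguration");
        return (String) m.invoke(context);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(getConfigurationCompat(new TaskContext()));
    }
}
```

This is only an illustration of the incompatibility mechanism, not the actual hadoop-lzo change; the input format can also simply be recompiled against the CDH4 jars.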

    -Joey
    --
    Joey Echeverria
    Principal Solutions Architect
    Cloudera, Inc.

    --
  • DrScott at Aug 1, 2012 at 2:51 pm
    Hi,
    On Friday, July 27, 2012 5:56:47 PM UTC+2, Joey Echeverria wrote:

    CDH4 changed JobContext to be an interface. You'll have to modify the
    LZO input format to make it compatible with CDH4.
Finally it seems that I managed to do it. At least my first tests are
successful. Here are the steps:

    1. get the source from https://github.com/cloudera/hadoop-lzo.git
2. in lib/, replace hadoop-core-0.20.2-cdh3u1.jar with
hadoop-2.0.0-mr1-cdh4.0.1-core.jar and hadoop-common-2.0.0-cdh4.0.1.jar
    3. apply the attached patch
    4. ant compile-native && ant jar
    5. distribute libraries

I only tested with the new mapreduce API.

I still wonder: am I the only one using CDH4 + LZO + the new API? Am I the only
one having problems with indexed LZO in that setup?
Maybe someone can apply the changes to
https://github.com/cloudera/hadoop-lzo.git or a forked repository?

    --
  • Todd Lipcon at Aug 1, 2012 at 11:09 pm
    Hi Henning,

    That patch looks good, but the tests need some updating as well to
    compile against CDH4.

    I started the same patch last week, but then got distracted by some
    other duties. Do you have time to get the test cases passing in the
    hadoop-lzo project against CDH4? I'll gladly commit it for you.

    Todd
    --
    Todd Lipcon
    Software Engineer, Cloudera

    --
  • Henning Moll at Aug 4, 2012 at 4:03 pm

On Wednesday, 1 August 2012, 16:08:44, Todd Lipcon wrote:
    That patch looks good, but the tests need some updating as well to
    compile against CDH4.
[...] Do you have time to get the test cases passing in the
    hadoop-lzo project against CDH4?
I can work on it next week. I'll keep you updated, even if I fail... ;-)

    Regards
    Henning

    --
  • DrScott at Aug 6, 2012 at 2:15 pm
    Hi Todd,
    On Thursday, August 2, 2012 1:08:44 AM UTC+2, Todd Lipcon wrote:

    Do you have time to get the test cases passing in the
    hadoop-lzo project against CDH4? I'll gladly commit it for you.
OK, here's a second patch, for the test classes. Again I needed to add some
jar files to the lib directory:

    commons-configuration-1.6.jar
    guava-11.0.2.jar
    commons-lang-2.5.jar
    hadoop-auth-2.0.0-cdh4.0.1.jar
    slf4j-api-1.6.1.jar

    All of them are included in my CDH4 installation. The result of the test
    run:

         [junit] Running com.hadoop.compression.lzo.TestLzoCodec
         [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0,217 sec
         [junit] Running com.hadoop.compression.lzo.TestLzoRandData
         [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 4,043 sec
         [junit] Running com.hadoop.compression.lzo.TestLzopInputStream
         [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0,075 sec
         [junit] Running com.hadoop.compression.lzo.TestLzopOutputStream
         [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0,522 sec
         [junit] Running com.hadoop.mapreduce.TestLzoTextInputFormat
         [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 4,309 sec

    BUILD SUCCESSFUL

    Hth. Regards,
    Henning

    --
  • Clintz at Aug 12, 2012 at 9:26 pm
Has this been committed? I too ran into this frustration with CDH4, and I
too am using https://github.com/toddlipcon/hadoop-lzo-packager .
    --
  • Karthik Kambatla at Aug 13, 2012 at 9:35 pm
    Hi clintz

    We are working on it, and hope to have it committed this week.

    Thanks
    Karthik
    --
  • Clintz at Aug 14, 2012 at 2:38 pm
Thank you. On a side note, would it be too much to ask for the install
directory to be configurable via a parameter passed into run.sh? The
reason is that CDH4 picks up natives in /usr/lib/hadoop/lib/native,
no longer /usr/lib/hadoop/native/{build.platform}. If this isn't
configurable, I have to modify template.spec to move the natives appropriately.
    --
  • Karthik Kambatla at Aug 14, 2012 at 4:52 pm
    Thanks for your feedback, Clintz. Will definitely try to incorporate it.

    Karthik
    --
  • Clintz at Aug 18, 2012 at 8:42 pm
Hi Karthik. Is there an update on this commit? Thanks for your help.
    --
  • Karthik Kambatla at Aug 19, 2012 at 3:56 am
    Hi Clint

    I was about to update you guys on the status. I had a chance to work on the
    patch for a bit, and it needs a few more things done before we can commit
    it.

    I am scheduled to work on it further next week, and will keep you updated
    on the progress.

    Thanks a lot for your patience.

    Karthik
    --
  • Karthik Kambatla at Aug 20, 2012 at 3:58 pm
    Thanks a lot for your patches, Henning.

    I have pushed your changes to get the project to compile and pass tests -
    https://github.com/kambatla/hadoop-lzo/tree/cdh4-wip. We will run a few
    sanity checks before committing it to the official tree.

    Thanks
    Karthik
    --
  • James Warren at Nov 4, 2012 at 1:34 am
I finally got around to upgrading to CDH4 and ran smack into this as well.
Have there been any updates in the last couple of months?

    cheers,
    -James
    --
  • Santhosh Srinivasan at Nov 5, 2012 at 7:35 pm
    James,

You can use the repository with Karthik's changes. It has been
tested and is pending a commit to the official Cloudera repository.

    Thanks,
    Santhosh
    --
  • Michael S at Dec 6, 2012 at 9:36 pm
Hi all,
I was able to get it working. Here is a record of the steps to help others.

Source repo
I used the repo
https://github.com/kambatla/hadoop-lzo

and built it with the packager from
https://github.com/toddlipcon/hadoop-lzo-packager

I had to revise run.sh to use the kambatla repo on GitHub:
    setup_github() {
         #GITHUB_ACCOUNT=${GITHUB_ACCOUNT:-cloudera}
         GITHUB_ACCOUNT=${GITHUB_ACCOUNT:-kambatla}

Then build it with:
./run.sh --no-deb

    Install
First, install the RPM:
  sudo rpm -iUhv /$PATH-TO-PACKAGER/hadoop-lzo-packager-master-cdh4/build/topdir/RPMS/x86_64/kambatla-hadoop-lzo-20121206162545.8aa0605-1.x86_64.rpm

Then copy the necessary files to the CDH4 directories:
      sudo cp -a /usr/lib/hadoop-0.20/lib/* /usr/lib/hadoop/lib
  sudo cp -a /usr/lib/hadoop/lib/native/Linux-amd64-64/* /usr/lib/hadoop/lib/native

Also apply the configuration steps as documented in the two repos.

I tested LZO compression by running a Pig job with compression enabled
in pig.properties:
    pig.tmpfilecompression=true
    pig.tmpfilecompression.codec=lzo

    It succeeded as expected.

    Good luck,

    Michael
    On Monday, November 5, 2012 2:35:57 PM UTC-5, Santhosh Srinivasan wrote:

    James,

    You can use the repository that Karthik has his changes on. It has been
    tested and is pending a commit to the official Cloudera repository.

    Thanks,
    Santhosh

    On Sat, Nov 3, 2012 at 6:34 PM, James Warren <jamesw...@gmail.com<javascript:>
    wrote:
    I finally got around to upgrading to CDH4 and ran smack into this as
    well. Have there been any updates to this in the last couple of months?

    cheers,
    -James

    On Monday, August 20, 2012 8:58:27 AM UTC-7, Karthik Kambatla wrote:

    Thanks a lot for your patches, Henning.

    I have pushed your changes to get the project to compile and pass tests
    - https://github.com/kambatla/**hadoop-lzo/tree/cdh4-wip<https://github.com/kambatla/hadoop-lzo/tree/cdh4-wip>.
    We will run a few sanity checks before committing it to the official tree.

    Thanks
    Karthik
    On Sat, Aug 18, 2012 at 8:56 PM, Karthik Kambatla wrote:

    Hi Clint

    I was about to update you guys on the status. I had a chance to work on
    the patch for a bit, and it needs a few more things done before we can
    commit it.

    I am scheduled to work on it further next week, and will keep you
    updated on the progress.

    Thanks a lot for your patience.

    Karthik
    On Sat, Aug 18, 2012 at 1:42 PM, clintz wrote:

    Hi Karthik. Is there an update on this commit? Thanks for your help

    On Tuesday, August 14, 2012 10:52:19 AM UTC-6, Karthik Kambatla wrote:

    Thanks for your feedback, Clintz. Will definitely try to incorporate
    it.

    Karthik
    On Tue, Aug 14, 2012 at 7:30 AM, clintz wrote:

    Thank you. On a side note, would it be too much to ask for the
    install directory to be configurable via a parameter passed into run.sh?
    The reason is that CDH4 picks up natives in /usr/lib/hadoop/lib/native,
    no longer /usr/lib/hadoop/native/{build.platform}. If
    this isn't configurable, I have to modify template.spec to move the natives
    appropriately.

    On Monday, August 13, 2012 3:34:57 PM UTC-6, Karthik Kambatla wrote:

    Hi clintz

    We are working on it, and hope to have it committed this week.

    Thanks
    Karthik

    On Sun, Aug 12, 2012 at 2:26 PM, clintz wrote:

    Has this been committed? I too fell into this frustration with
    CDH4. I too am using https://github.com/toddlipcon/hadoop-lzo-packager.
    On Monday, August 6, 2012 8:15:34 AM UTC-6, DrScott wrote:

    Hi Todd,
    On Thursday, August 2, 2012 1:08:44 AM UTC+2, Todd Lipcon wrote:

    Do you have time to get the test cases passing in the
    hadoop-lzo project against CDH4? I'll gladly commit it for you.
    Ok, here's a second patch for the test classes. Again I needed to
    add some jar files to the lib directory:

    commons-configuration-1.6.jar
    guava-11.0.2.jar
    commons-lang-2.5.jar
    hadoop-auth-2.0.0-cdh4.0.1.jar
    slf4j-api-1.6.1.jar

    All of them are included in my CDH4 installation. The result of
    the test run:

    [junit] Running com.hadoop.compression.lzo.TestLzoCodec
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0,217 sec
    [junit] Running com.hadoop.compression.lzo.TestLzoRandData
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 4,043 sec
    [junit] Running com.hadoop.compression.lzo.TestLzopInputStream
    [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0,075 sec
    [junit] Running com.hadoop.compression.lzo.TestLzopOutputStream
    [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0,522 sec
    [junit] Running com.hadoop.mapreduce.TestLzoTextInputFormat
    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 4,309 sec

    BUILD SUCCESSFUL

    Hth. Regards,
    Henning
  • David Koch at Dec 13, 2012 at 10:27 pm
    Hello,

    Thank you everyone in this thread for the information. I followed Michael's
    instructions, apart from the installation of the RPM. Is this necessary?

    How do I correctly generate *.lzo compressed files to be used by Hadoop? In
    my tests I used the Linux lzop command line utility, which I installed with
    a package manager, copied the resulting file to HDFS and then ran the
    hadoop-lzo indexer. Input is properly decompressed by Map Reduce jobs;
    however, splitting does not work and there is only one mapper instance even
    for larger files (>2 GB). What causes this?

    Some information:

    lzop --version
    lzop 1.02rc1
    LZO library 2.03
    Copyright (C) 1996-2005 Markus Franz Xaver Johannes Oberhumer

    I do compression and indexing on a Debian machine; however, the hadoop-lzo
    jar and native files were built on a CentOS machine. Could the lzop
    version be too old, or is lzop not supposed to be used at all?

    Thank you,

    /David


  • David Koch at Dec 15, 2012 at 4:19 pm
    Ok, answering my own question here. I was not using the proper input
    format. Once I set it to com.hadoop.mapreduce.LzoTextInputFormat it worked
    ok.

    To complement Michael's instructions: I also had to install the package
    containing the LZO header files on all tasktrackers (lzo-devel for
    RHEL/CentOS).

    /David
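    For anyone else seeing the single-mapper symptom: an LZO file can only be
    split where a companion .lzo.index file records the byte offsets of
    compressed blocks; without the index (or with an input format that ignores
    it) the whole file becomes one split. A conceptual Python sketch of that
    boundary-snapping idea (not hadoop-lzo's actual code; `lzo_splits` and the
    64 MB block spacing below are invented for illustration):

```python
import bisect

def lzo_splits(file_len, block_offsets, target_split):
    """Snap split boundaries to indexed LZO block offsets.

    block_offsets: sorted byte offsets of compressed-block starts, as an
    .lzo.index file would record them. With no index, the only safe split
    is the whole file -- i.e. a single mapper.
    """
    if not block_offsets:
        return [(0, file_len)]
    splits, start = [], 0
    while file_len - start > target_split:
        # First indexed block boundary at or past the desired cut point.
        i = bisect.bisect_left(block_offsets, start + target_split)
        if i == len(block_offsets):
            break
        cut = block_offsets[i]
        splits.append((start, cut))
        start = cut
    splits.append((start, file_len))
    return splits

# A 2 GB file with no index yields one split, regardless of size:
print(lzo_splits(2 << 30, [], 128 << 20))
# With block offsets every 64 MB, splits snap to those boundaries:
offsets = [n * (64 << 20) for n in range(1, 32)]
print(len(lzo_splits(2 << 30, offsets, 128 << 20)))
```

    The empty-index case reproduces what was observed: one (0, file_len)
    split and therefore one mapper, however large the file.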
  • Karthik Kambatla at Dec 17, 2012 at 10:01 pm
    Thanks Michael and David for listing the instructions to use LZO with CDH4.

    In addition to the steps Michael and David have proposed, it is also worth
    making sure the requirements listed in
    https://github.com/kambatla/hadoop-lzo/blob/master/UsingWithCDH4.txt are
    available.

    Thanks
    Karthik
  • Mike R. at Apr 1, 2013 at 3:00 pm
    I made a fork of Todd's hadoop-lzo-packager here:
    https://github.com/mroark1m/hadoop-lzo-packager
    It produces an RPM that is working (so far) locally for CDH 4.2.0 and
    MRv1. If you try it, check the README.CDH4 file.
  • Deepak Gattala at Apr 2, 2013 at 5:37 am
    Hi all,

    Can you please tell me how I can enable compression on my cluster? I am
    using CDH 4.2 and CM 4.5.

    Thanks
    Deepak Gattala
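    Once the hadoop-lzo jar and native library are installed as described
    above, enabling LZO typically means registering the codecs in
    core-site.xml (under Cloudera Manager, via the equivalent configuration
    snippet). A sketch, assuming the standard com.hadoop.compression.lzo
    class names from the repos above; exact values depend on the install:

```xml
<!-- core-site.xml: register the LZO codecs. Assumes the hadoop-lzo jar
     is on the classpath and the native library is in the native dir. -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```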


    On Mon, Apr 1, 2013 at 9:59 AM, Mike R. wrote:

    I made a fork of Todd's hadoop-lzo-packager here
    https://github.com/mroark1m/hadoop-lzo-packager
    that produces an RPM that is working (so-far) locally for CDH 4.2.0 and
    mrv1. If you try it, check the README.CDH4 file...

    On Thursday, December 6, 2012 3:36:26 PM UTC-6, Michael S wrote:

    Hi, all:
    I was able to do it. Here is a recording to help others.

    Source repo
    I used repo
    https://github.com/kambatla/**hadoop-lzo<https://github.com/kambatla/hadoop-lzo>

    Build it with packager by repo
    https://github.com/toddlipcon/**hadoop-lzo-packager<https://github.com/toddlipcon/hadoop-lzo-packager>

    I had to revise the file run.sh to use the kambatla repo in github as:
    setup_github() {
    #GITHUB_ACCOUNT=${GITHUB_**ACCOUNT:-cloudera}
    GITHUB_ACCOUNT=${GITHUB_**ACCOUNT:-kambatla}

    build it by
    ./run.sh --no-deb

    Install
    First install the rpm
    sudo rpm -iUhv /$PATH-TO-PACKAGER/hadoop-lzo-**
    packager-master-cdh4/build/**topdir/RPMS/x86_64/kambatla-**
    hadoop-lzo-20121206162545.**8aa0605-1.x86_64.rpm

    Then copy necessary files to the CDH4 directory
    sudo cp -a /usr/lib/hadoop-0.20/lib/* /usr/lib/hadoop/lib
    sudo cp -a /usr/lib/hadoop/lib/native/**Linux-amd64-64/*
    /usr/lib/hadoop/lib/native

    Of course the config stuff as documented in the two repos.

    I tested LZO compression by running a Pig job which requires compression
    set by pig.properties as:
    pig.tmpfilecompression=true
    pig.tmpfilecompression.codec=**lzo

    It succeeded as expected.

    Good luck,

    Michael
    On Monday, November 5, 2012 2:35:57 PM UTC-5, Santhosh Srinivasan wrote:

    James,

    You can use the repository that Karthik has his changes on. It has been
    tested and is pending a commit to the official Cloudera repository.

    Thanks,
    Santhosh
    On Sat, Nov 3, 2012 at 6:34 PM, James Warren wrote:

    I finally got around to upgrading to CDH4 and ran smack into this as
    well. Have there been any updates to this in the last couple of months?

    cheers,
    -James

    On Monday, August 20, 2012 8:58:27 AM UTC-7, Karthik Kambatla wrote:

    Thanks a lot for your patches, Henning.

    I have pushed your changes to get the project to compile and pass
    tests - https://github.com/kambatla/****hadoop-lzo/tree/cdh4-wip<https://github.com/kambatla/hadoop-lzo/tree/cdh4-wip>.
    We will run a few sanity checks before committing it to the official tree.

    Thanks
    Karthik
    On Sat, Aug 18, 2012 at 8:56 PM, Karthik Kambatla wrote:

    Hi Clint

    I was about to update you guys on the status. I had a chance to work
    on the patch for a bit, and it needs a few more things done before we can
    commit it.

    I am scheduled to work on it further next week, and will keep you
    updated on the progress.

    Thanks a lot for your patience.

    Karthik
    On Sat, Aug 18, 2012 at 1:42 PM, clintz wrote:

    Hi Karthik. Is there an update on this commit? Thanks for your help


    On Tuesday, August 14, 2012 10:52:19 AM UTC-6, Karthik Kambatla
    wrote:
    Thanks for your feedback, Clintz. Will definitely try to
    incorporate it.

    Karthik
    On Tue, Aug 14, 2012 at 7:30 AM, clintz wrote:

    Thank you. On a side note, would it be too much to ask if the
    install directory were configurable via a parameter passed into run.sh?
    The reason is that CDH4 picks up natives in /usr/lib/hadoop/lib/native,
    no longer /usr/lib/hadoop/native/{build.platform}. If
    this isn't configurable, I have to modify template.spec to move natives
    appropriately.


    On Monday, August 13, 2012 3:34:57 PM UTC-6, Karthik Kambatla
    wrote:
    Hi clintz

    We are working on it, and hope to have it committed this week.

    Thanks
    Karthik

    On Sun, Aug 12, 2012 at 2:26 PM, clintz wrote:

    Has this been committed? I too fell into this frustration with
    CDH4. I too am using https://github.com/toddlipcon/hadoop-lzo-packager.
    On Monday, August 6, 2012 8:15:34 AM UTC-6, DrScott wrote:

    Hi Todd,
    On Thursday, August 2, 2012 1:08:44 AM UTC+2, Todd Lipcon wrote:

    Do you have time to get the test cases passing in the
    hadoop-lzo project against CDH4? I'll gladly commit it for
    you.
    Ok, here's a second patch for the test classes. Again I need to
    add some jar files to the lib directory:

    commons-configuration-1.6.jar
    guava-11.0.2.jar
    commons-lang-2.5.jar
    hadoop-auth-2.0.0-cdh4.0.1.jar
    slf4j-api-1.6.1.jar

    All of them are included in my CDH4 installation. The result of
    the test run:

    [junit] Running com.hadoop.compression.lzo.TestLzoCodec
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0,217 sec
    [junit] Running com.hadoop.compression.lzo.TestLzoRandData
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 4,043 sec
    [junit] Running com.hadoop.compression.lzo.TestLzopInputStream
    [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0,075 sec
    [junit] Running com.hadoop.compression.lzo.TestLzopOutputStream
    [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0,522 sec
    [junit] Running com.hadoop.mapreduce.TestLzoTextInputFormat
    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 4,309 sec

    BUILD SUCCESSFUL

    Hth. Regards,
    Henning
    --
  • Karthik Kambatla at Apr 15, 2013 at 9:29 pm
    Hadoop LZO packages for CDH4.x are available at
    http://archive.cloudera.com/gplextras/. The documentation for installation
    will be addressed in upcoming releases.

    Thanks
    Karthik
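    [Editor's note: for yum-based installs, the gplextras archive above can be wired in with a repo file along these lines. This is a sketch: the `[cloudera-gplextras4]` section name and the baseurl layout are assumptions; check http://archive.cloudera.com/gplextras/ for the exact paths for your release and platform.]

    ```shell
    # Write an illustrative yum repo definition for the GPL Extras archive.
    # The baseurl below is an assumed layout; verify it against the archive.
    repo_file=$(mktemp)
    cat > "$repo_file" <<'EOF'
    [cloudera-gplextras4]
    name=Cloudera GPL Extras, Version 4
    baseurl=http://archive.cloudera.com/gplextras/redhat/6/x86_64/gplextras/4/
    gpgcheck=0
    EOF
    echo "wrote $repo_file"
    ```

    Dropping such a file into /etc/yum.repos.d/ (as root) would let yum resolve the LZO packages against the archive, assuming the package names there match your platform.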
    On Mon, Apr 1, 2013 at 10:37 PM, Deepak Gattala wrote:

    Hi all ,

    can you please tell me how i can enable compression on my cluster i am
    using CDH 4.2 and CM 4.5.

    Thanks
    Deepak Gattala


    On Mon, Apr 1, 2013 at 9:59 AM, Mike R. wrote:

    I made a fork of Todd's hadoop-lzo-packager here
    https://github.com/mroark1m/hadoop-lzo-packager
    that produces an RPM that is working (so-far) locally for CDH 4.2.0 and
    mrv1. If you try it, check the README.CDH4 file...

    --
  • James Kebinger at Apr 17, 2013 at 2:53 pm
    Is there also an official repo containing the hadoop-lzo code, including the
    fixed LzoTextInputFormat?

    On Mon, Apr 15, 2013 at 5:29 PM, Karthik Kambatla wrote:

    Hadoop LZO packages for CDH4.x are available at
    http://archive.cloudera.com/gplextras/. The documentation for
    installation will be addressed in upcoming releases.

    Thanks
    Karthik
    --
  • Harsh J at Apr 18, 2013 at 5:10 am
    James,

    Are you not finding it under the new gplextras repository, which also
    has hadoop-lzo?
    On Wed, Apr 17, 2013 at 8:18 PM, James Kebinger wrote:
    Is there also an official repo containing the hadoop-lzo code including
    fixed LzoTextInputFormat?

    --
    Harsh J

    --
  • DrScott at Aug 14, 2012 at 7:05 am

    On Sunday, August 12, 2012 11:26:34 PM UTC+2, clintz wrote:
    I too am using https://github.com/toddlipcon/hadoop-lzo-packager .
    DrScott wrote:
    i installed hadoop-lzo (fork from toddlipcon)
    I have to correct myself: I am working with code based on
    https://github.com/cloudera/hadoop-lzo.git
    But I think there is not much difference.

    Regards
    Henning

    --
  • Antonio Piccolboni at Dec 21, 2012 at 9:37 pm
    Hi guys,
    I hope this is not hijacking the thread but I am stuck one step behind

    -bash-3.2$ sudo yum install lzo-devel
    Loaded plugins: fastestmirror
    Loading mirror speeds from cached hostfile
      * base: mirrors.lga7.us.voxel.net
      * extras: mirrors.lga7.us.voxel.net
      * updates: mirror.atlanticmetro.net
    Setting up Install Process
    No package lzo-devel available.
    Nothing to do
    -bash-3.2$

    -bash-3.2$ yum repolist | grep cloudera
    cloudera-cdh4    Cloudera's Distribution for Hadoop, Version 4    83

    What am I missing?
    Thanks

    Antonio

    On Tuesday, August 14, 2012 12:00:39 AM UTC-7, DrScott wrote:
    On Sunday, August 12, 2012 11:26:34 PM UTC+2, clintz wrote:

    I too am using https://github.com/toddlipcon/hadoop-lzo-packager .
    DrScott wrote:
    i installed hadoop-lzo (fork from toddlipcon)
    I have to correct myself: I am working with code based on
    https://github.com/cloudera/hadoop-lzo.git
    But i think there is no much difference.

    Regards
    Henning
    --
  • Mark Grover at Dec 21, 2012 at 11:40 pm
    Antonio,
    If I am not mistaken, the lzo-devel package comes from your CentOS/RHEL
    base (os) repo.

    For example, here is a list of mirrors for CentOS 6:
    http://mirrorlist.centos.org/?release=6&arch=x86_64&repo=os

    If you go the first mirror, for instance, you'd see lzo-devel package under
    http://mirror.5ninesolutions.com/centos/6/os/x86_64/Packages/

    Mark
    --
  • Antonio Piccolboni at Dec 22, 2012 at 1:11 am
    Unfortunately Whirr doesn't yet support CentOS 6; I was trying with CentOS
    5. Will try whirr from nightly builds, I guess that's the last resort.


    Antonio

    --
  • Antonio Piccolboni at Dec 22, 2012 at 1:14 am
    I meant, get a nightly build.

    A
    --
  • Miguel Escaja at Jan 15, 2013 at 1:02 pm
    Hi

    Sorry if I am using the wrong forum.

    I'd need help on a similar issue I have encountered with NoSQL+CDH4.1

    It's about the KVHOME/example/haddop/CountMinorKeys.java example, which
    comes included in the NoSQL binaries and does not seem to work with CDH4.1
    Community Edition. This is the error I get when I run it:

    [admin@vm224 hadoop]$ hadoop jar hadoopSamples.jar hadoop.CountMinorKeys
    kvstore vm223.escaja.com:5000 /user/admin/CountMinorKeys/output/test01
    13/01/15 11:40:35 WARN mapred.JobClient: Use GenericOptionsParser for
    parsing the arguments. Applications should implement Tool for the same.
    13/01/15 11:40:35 INFO mapred.JobClient: Cleaning up the staging area
    hdfs://vm224.escaja.com:8020/user/admin/.staging/job_201301150717_0007
    Exception in thread "main" java.lang.IncompatibleClassChangeError: Found
    interface org.apache.hadoop.mapreduce.JobContext, but class was expected
             at
    oracle.kv.hadoop.KVInputFormatBase.getSplits(KVInputFormatBase.java:184)
             at
    org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1014)
             at
    org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1031)
             at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:172)
             at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:943)
             at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:896)
             at java.security.AccessController.doPrivileged(Native Method)
             at javax.security.auth.Subject.doAs(Subject.java:396)
             at
    org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
             at
    org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:896)
             at org.apache.hadoop.mapreduce.Job.submit(Job.java:531)
             at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:561)
             at hadoop.CountMinorKeys.run(CountMinorKeys.java:110)
             at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
             at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
             at hadoop.CountMinorKeys.main(CountMinorKeys.java:117)
             at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
             at
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
             at
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
             at java.lang.reflect.Method.invoke(Method.java:597)
             at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
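    [Editor's note: the first line of that trace is the same binary-compatibility problem Joey described earlier in the thread: code compiled while org.apache.hadoop.mapreduce.JobContext was a class is being run against CDH4, where it became an interface. The mismatch can be reproduced in miniature with throwaway classes; this sketch assumes only that javac/java are on the PATH, with "Ctx" standing in for JobContext.]

    ```shell
    work=$(mktemp -d) && cd "$work"

    # Version 1: Ctx is a class. Caller is compiled against it, so Caller.class
    # holds a class-style (invokevirtual) reference to Ctx.id().
    printf 'public class Ctx { public String id() { return "v1"; } }\n' > Ctx.java
    printf 'public class Caller { public static String use(Ctx c) { return c.id(); } }\n' > Caller.java
    javac Ctx.java Caller.java
    rm Ctx.java Caller.java   # keep only the stale Caller.class

    # Version 2: Ctx becomes an interface (as JobContext did in CDH4).
    # Everything is recompiled except Caller.
    printf 'public interface Ctx { String id(); }\n' > Ctx.java
    printf 'public class CtxImpl implements Ctx { public String id() { return "v2"; } }\n' > CtxImpl.java
    printf 'public class Main { public static void main(String[] a) { System.out.println(Caller.use(new CtxImpl())); } }\n' > Main.java
    javac -cp . Ctx.java CtxImpl.java Main.java

    # Running now fails just like the report above:
    # java.lang.IncompatibleClassChangeError: Found interface Ctx, but class was expected
    out=$(java -cp . Main 2>&1 || true)
    echo "$out" | head -n 1
    ```

    The fix the thread converges on is the same in miniature: recompile the stale class (here Caller; in the thread, hadoop-lzo) against the CDH4 jars so its call sites are regenerated against the interface.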

    On Wednesday, August 1, 2012 at 15:50:57 UTC+1, DrScott wrote:
    Hi,
    On Friday, July 27, 2012 5:56:47 PM UTC+2, Joey Echeverria wrote:

    CDH4 changed JobContext to be an interface. You'll have to modify the
    LZO input format to make it compatible with CDH4.
    Finally it seems that I managed to do it. At least my first tests are
    successful. Here are the steps:

    1. get the source from https://github.com/cloudera/hadoop-lzo.git
    2. in lib/ replace hadoop-core-0.20.2-cdh3u1.jar by
    hadoop-2.0.0-mr1-cdh4.0.1-core.jar and hadoop-common-2.0.0-cdh4.0.1.jar
    3. apply the attached patch
    4. ant compile-native && ant jar
    5. distribute libraries

    I only tested with the new mapreduce api.

    I still wonder if I am the only one using cdh4+lzo+newapi? Am I the only
    one having problems with indexed lzo in that setup?
    Maybe someone can apply the changes to
    https://github.com/cloudera/hadoop-lzo.git or a forked repository?
    --
  • James Kebinger at Mar 25, 2013 at 10:53 pm
    Seeing the same problem here - did the fixes never make it back into the
    Cloudera repo? It seems like Karthick was on top of that last August.


    On Tuesday, January 15, 2013 8:02:50 AM UTC-5, miguel...@ceitss.co.uk wrote:

    Hi

    Sorry if I am using the wrong forum.

    I need help with a similar issue I have encountered with NoSQL + CDH4.1.

    It concerns KVHOME/example/hadoop/CountMinorKeys.java, which comes
    included in the NoSQL binaries but does not seem to work with CDH4.1
    Community Edition. This is the error I get when I run it:

    [admin@vm224 hadoop]$ hadoop jar hadoopSamples.jar hadoop.CountMinorKeys kvstore vm223.escaja.com:5000 /user/admin/CountMinorKeys/output/test01
    13/01/15 11:40:35 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    13/01/15 11:40:35 INFO mapred.JobClient: Cleaning up the staging area hdfs://vm224.escaja.com:8020/user/admin/.staging/job_201301150717_0007
    Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
        at oracle.kv.hadoop.KVInputFormatBase.getSplits(KVInputFormatBase.java:184)
        at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1014)
        at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1031)
        at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:172)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:943)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:896)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:896)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:531)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:561)
        at hadoop.CountMinorKeys.run(CountMinorKeys.java:110)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at hadoop.CountMinorKeys.main(CountMinorKeys.java:117)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

    On Wednesday, August 1, 2012 at 3:50:57 PM UTC+1, DrScott wrote:
    Hi,
    On Friday, July 27, 2012 5:56:47 PM UTC+2, Joey Echeverria wrote:

    CDH4 changed JobContext to be an interface. You'll have to modify the
    LZO input format to make it compatible with CDH4.
    Finally it seems that I managed to do it. At least my first tests are
    successful. Here are the steps:

    1. Get the source from https://github.com/cloudera/hadoop-lzo.git
    2. In lib/, replace hadoop-core-0.20.2-cdh3u1.jar with
    hadoop-2.0.0-mr1-cdh4.0.1-core.jar and hadoop-common-2.0.0-cdh4.0.1.jar
    3. Apply the attached patch
    4. Run ant compile-native && ant jar
    5. Distribute the libraries

    I have only tested with the new mapreduce API.

    I still wonder if I am the only one using CDH4 + LZO + the new API. Am I
    the only one having problems with indexed LZO in that setup?
    Maybe someone can apply the changes to
    https://github.com/cloudera/hadoop-lzo.git or a forked repository?
    --
  • Haitian1122 at Aug 7, 2013 at 9:22 am
    I use CDH4 + LZO, but I get this error:

    Error: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec was not found.

    How do I set up LZO on CDH4, and how do I let Hadoop know where the
    hadoop-lzo.jar is?
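
    A typical way this is wired up, sketched under assumptions: the jar and
    native-library paths below are the usual CDH4 package locations (adjust
    for your install), and the two properties shown are the standard
    hadoop-lzo codec settings for core-site.xml.

    ```shell
    # Make the hadoop-lzo jar and native libraries visible to Hadoop
    # (paths are assumed CDH4 package locations, not universal).
    cp hadoop-lzo.jar /usr/lib/hadoop/lib/
    cp -r native/Linux-amd64-64/* /usr/lib/hadoop/lib/native/

    # Then register the codecs in core-site.xml on every node:
    #   <property>
    #     <name>io.compression.codecs</name>
    #     <value>org.apache.hadoop.io.compress.DefaultCodec,
    #            com.hadoop.compression.lzo.LzoCodec,
    #            com.hadoop.compression.lzo.LzopCodec</value>
    #   </property>
    #   <property>
    #     <name>io.compression.codec.lzo.class</name>
    #     <value>com.hadoop.compression.lzo.LzoCodec</value>
    #   </property>
    ```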

    --

    ---
    You received this message because you are subscribed to the Google Groups "CDH Users" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to cdh-user+unsubscribe@cloudera.org.
    For more options, visit https://groups.google.com/a/cloudera.org/groups/opt_out.

site design / logo © 2022 Grokbase