Hi,

I am trying, unsuccessfully, to apply a patch (HADOOP-6835) to hadoop-0.20.2
(64-bit Ubuntu 10.04).

I have downloaded the tar.gz and can build the project.

I tried to apply the patch from
https://issues.apache.org/jira/browse/HADOOP-6835
(specifically HADOOP-6835.v9.yahoo-0.20.2xx-branch.patch), and there was
an issue with the diff --git section for
src/core/org/apache/hadoop/io/compress/GzipCodec.java.

However, I thought I had resolved this by working through the patch diffs.

Using ant on the command line, I was able to build the project again
and generate a new jar, hadoop-0.20.3-dev-core.jar, which I copied back
into $HADOOP_HOME before starting Hadoop.

On running a test MapReduce job using streaming:

bin/hadoop jar contrib/streaming/hadoop-0.20.2-streaming.jar -input /gzip -output /out -mapper cat -reducer wc

I get the following error in the task log:

2010-09-10 16:27:54,706 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2010-09-10 16:27:55,001 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2010-09-10 16:27:55,002 INFO org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2010-09-10 16:27:55,004 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2010-09-10 16:27:55,012 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
2010-09-10 16:27:55,110 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
2010-09-10 16:27:55,110 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
2010-09-10 16:27:55,171 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/bin/cat]
2010-09-10 16:27:55,243 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
2010-09-10 16:27:55,244 INFO org.apache.hadoop.streaming.PipeMapRed: Records R/W=1/1
2010-09-10 16:27:55,244 INFO org.apache.hadoop.streaming.PipeMapRed: MRErrorThread done
2010-09-10 16:27:55,245 INFO org.apache.hadoop.streaming.PipeMapRed: MROutputThread done
2010-09-10 16:27:55,245 INFO org.apache.hadoop.streaming.PipeMapRed: mapRedFinished
2010-09-10 16:27:55,310 FATAL org.apache.hadoop.mapred.TaskTracker: Error running child : java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.compress.zlib.ZlibDecompressor.getRemaining(J)I
    at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.getRemaining(Native Method)
    at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.getRemaining(ZlibDecompressor.java:260)
    at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:93)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
    at java.io.InputStream.read(InputStream.java:85)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:136)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:40)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

Any thoughts or pointers on how to apply the patch would be gratefully received.

Thanks,

Lewis.

  • Greg Roelofs at Sep 10, 2010 at 7:40 pm

    Lewis Crawford wrote:

    > I am trying unsuccessfully to apply a patch (HADOOP-6835) to hadoop-0.20.2
    > (64bit Ubuntu 10.04)
    > using ant on the command line I was able to build the project again
    > and generate a new jar hadoop-0.20.3-dev-core.jar which I copied back
    > into the $HADOOP_HOME and started hadoop.

    Only the jar?

    > I get the following error in the task log
    > 2010-09-10 16:27:54,706 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
    > Initializing JVM Metrics with processName=MAP, sessionId=
    > 2010-09-10 16:27:55,001 INFO org.apache.hadoop.util.NativeCodeLoader:
    > Loaded the native-hadoop library
    > 2010-09-10 16:27:55,002 INFO
    > org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded &
    > initialized native-zlib library [...]
    > 2010-09-10 16:27:55,310 FATAL org.apache.hadoop.mapred.TaskTracker:
    > Error running child : java.lang.UnsatisfiedLinkError:
    > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.getRemaining(J)I
    > at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.getRemaining(Native
    > Method)
    > at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.getRemaining(ZlibDecompressor.java:260)

    The patch modifies both the Java and the C code. Your configuration obviously
    specifies native (C) code, but it looks like you didn't recompile and install
    the native libraries, so you're still using the stock version. getRemaining()
    is a new method in the Decompressor interface.
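
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not code from Hadoop or from the patch: `NativeLinkDemo` and its `getRemaining` method are hypothetical stand-ins for a Java class whose `native` method has no matching implementation in any loaded native library. The `(J)I` in the error message is the JVM descriptor for a method that takes a `long` and returns an `int`.

```java
// Hypothetical stand-in class; it does not use or depend on Hadoop.
class NativeLinkDemo {

    // Declared native, like a JNI-backed method added on the Java side of a patch,
    // but no native library providing an implementation is ever loaded here.
    private static native int getRemaining(long stream);

    // Returns true only if the JVM could bind the native method to an implementation.
    static boolean nativeMethodIsLinked() {
        try {
            getRemaining(0L);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // The JVM resolves native methods lazily, on first call, and throws
            // this error when no loaded library exports the expected symbol.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("getRemaining linked: " + nativeMethodIsLinked());
    }
}
```

The class compiles without complaint because the Java compiler never checks for a native implementation; the mismatch only surfaces at run time, which is consistent with the build succeeding and the task failing later.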

    I'd suggest you first simply change your config to disable native libraries
    (hadoop.native.lib = false) and make sure your backported patch works OK.
    6835 supports both native and Java gzip. Then build the native code ("ant
    -Dcompile.native=true mvn-install" or perhaps "ant -Dcompile.native=true jar"
    --I haven't tried the latter) and install it in the "appropriate location."
    (Sorry, I don't recall offhand where that is, but I'm pretty sure it's fully
    documented in the LZO docs, and I recall coming across a generic native-code
    twiki on the Apache Hadoop site somewhere, too.) I think that's the only
    thing you're missing.
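
Since the install location is left open above, one hedged way to find out which native library a JVM would actually pick up is to scan its `java.library.path`. The sketch below assumes that Hadoop loads its native code with `System.loadLibrary("hadoop")`, which searches that path; `NativeLibLocator` and `findLibrary` are hypothetical names introduced only for this illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical helper; not part of Hadoop.
class NativeLibLocator {

    // Returns every regular file on libraryPath whose name is the platform-specific
    // file name for libName (for example libhadoop.so on Linux).
    static List<File> findLibrary(String libName, String libraryPath) {
        List<File> hits = new ArrayList<>();
        if (libraryPath == null || libraryPath.isEmpty()) {
            return hits;
        }
        String fileName = System.mapLibraryName(libName);
        for (String dir : libraryPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path");
        System.out.println("java.library.path = " + path);
        for (File lib : findLibrary("hadoop", path)) {
            // A modification time older than the rebuild suggests the stock library is still installed.
            System.out.println(lib.getAbsolutePath() + " (last modified " + new Date(lib.lastModified()) + ")");
        }
    }
}
```

For the result to be meaningful, the program has to run with the same `-Djava.library.path` setting as the Hadoop task JVM; a default JVM launch will most likely search different directories.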

    Greg
  • Lewis Crawford at Sep 10, 2010 at 8:55 pm
    Yes, that seems to have done the trick!

    Thanks

    Lewis.
  • Neil Ghosh at Sep 10, 2010 at 9:04 pm
    Hello,

    I am new to Hadoop. Can anybody suggest an example or procedure for
    outputting the top N items with the highest total count, where the input
    file has an (item, count) pair on each line?

    Items can repeat.

    Thanks
    Neil
    http://neilghosh.com

Discussion Overview
group: common-user @
categories: hadoop
posted: Sep 10, '10 at 3:55p
active: Sep 10, '10 at 9:04p
posts: 4
users: 3
website: hadoop.apache.org...
irc: #hadoop