I got the following error when I tried to do gzip compression on map output, using hadoop-0.20.1.
settings in mapred-site.xml-->
mapred.compress.map.output=true
mapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec
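For reference, the two settings above would look roughly like this in mapred-site.xml. This is a sketch of the standard Hadoop property layout, assuming the usual <configuration> wrapper, and not a verbatim copy of the file:

```xml
<configuration>
  <!-- Turn on compression of intermediate map output -->
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <!-- Codec used for the compressed map output -->
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.GzipCodec</value>
  </property>
</configuration>
```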
error message-->
java.lang.NullPointerException
at org.apache.hadoop.mapred.IFile$Writer.<init>(IFile.java:102)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1198)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1091)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:359)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
From reading the source, I see that the Writer in IFile handles map output compression. So it seems that either I don't have the gzip native library built or my settings are incorrect. There is no "build" folder in HADOOP_HOME and no "native" folder under "lib" in HADOOP_HOME. I checked that gzip and zlib are installed, so the next step is to build the Hadoop native library on top of them. How do I do that? Is it simply a matter of pointing some variable at the gzip or zlib libraries, or should I use "build.xml" in Hadoop to build some target? If so, which target should I build?
Thanks,
Michael