CompressionCodecFactory returns unconfigured GZipCodec if io.compression.codecs is not set

Key: HADOOP-7196
URL: https://issues.apache.org/jira/browse/HADOOP-7196
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 0.20.2
Reporter: Peter Voss

If the {{io.compression.codecs}} property is not set, GzipCodec is added using this code:

    List<Class<? extends CompressionCodec>> codecClasses = getCodecClasses(conf);
    if (codecClasses == null) {
      addCodec(new GzipCodec());
      addCodec(new DefaultCodec());
    } else {
      Iterator<Class<? extends CompressionCodec>> itr = codecClasses.iterator();
      while (itr.hasNext()) {
        CompressionCodec codec = ReflectionUtils.newInstance(itr.next(), conf);

which leaves GzipCodec unconfigured, because the default branch calls the constructor directly and never passes conf to the codec. If the codec is instead listed in the {{io.compression.codecs}} property, it is configured properly through ReflectionUtils.newInstance(..., conf).
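
To illustrate the difference between the two branches, here is a minimal, self-contained sketch. It does not use the Hadoop classes: SketchCodec, createWithoutConf, createWithConf and failsWithNpe are hypothetical stand-ins, and the assumption is that the real codec reads a setting from its configuration when it is used, much like the stand-in reads a buffer size.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins, not the Hadoop classes: they only model the
// "codec reads its settings from a configuration" relationship.
class UnconfiguredCodecSketch {

    /** Stand-in for a codec that needs a configuration before use. */
    static class SketchCodec {
        private Map<String, String> conf; // stays null until setConf is called

        void setConf(Map<String, String> conf) {
            this.conf = conf;
        }

        /** Reads a setting from conf; throws NullPointerException if conf was never set. */
        int bufferSize() {
            return Integer.parseInt(conf.getOrDefault("io.file.buffer.size", "4096"));
        }
    }

    /** Mirrors the default branch: plain constructor, no configuration injected. */
    static SketchCodec createWithoutConf() {
        return new SketchCodec();
    }

    /** Mirrors the configured branch: instantiate, then inject the configuration. */
    static SketchCodec createWithConf(Map<String, String> conf) {
        SketchCodec codec = new SketchCodec();
        codec.setConf(conf);
        return codec;
    }

    /** Returns true if using the codec fails with a NullPointerException. */
    static boolean failsWithNpe(SketchCodec codec) {
        try {
            codec.bufferSize();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("default branch fails: " + failsWithNpe(createWithoutConf()));
        System.out.println("configured branch fails: " + failsWithNpe(createWithConf(conf)));
    }
}
```

The stand-in fails only when it is first used, not when it is constructed, which would be consistent with the error surfacing later in a reader rather than in the factory itself.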

I have seen a lot of NPEs on systems that don't have this property set when using a LineRecordReader (which internally gets its codec from CompressionCodecFactory).

I would suggest using {{org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec}} as the default value for {{io.compression.codecs}}, instead of having a separate code path for the case where this property is not set.
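
A rough sketch of that suggestion, assuming the property is read with a default value so that an unset property and a configured one go through the same path. DefaultCodecListSketch and codecClassNames are hypothetical names for illustration, and java.util.Properties stands in for the Hadoop configuration object; this is not the actual patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of Hadoop: it only shows how a default value
// removes the separate "property not set" branch.
class DefaultCodecListSketch {
    static final String CODECS_KEY = "io.compression.codecs";

    // Default value taken from the suggestion in this report.
    static final String DEFAULT_CODECS =
        "org.apache.hadoop.io.compress.GzipCodec,"
        + "org.apache.hadoop.io.compress.DefaultCodec";

    /** Returns the codec class names, falling back to the default list when unset. */
    static List<String> codecClassNames(Properties conf) {
        String value = conf.getProperty(CODECS_KEY, DEFAULT_CODECS);
        return Arrays.asList(value.split(","));
    }

    public static void main(String[] args) {
        // Property unset: the default list takes the same path as a configured one.
        System.out.println(codecClassNames(new Properties()));
    }
}
```

With a single list of class names, every codec could then be instantiated through ReflectionUtils.newInstance(..., conf), so the defaults would receive the configuration in the same way as explicitly listed codecs.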

Posted to common-dev: Mar 17, '11 at 10:22a
