I have a Pig script. If I don't set any codec for map output on the Hadoop cluster, it runs without problems. After I added the following compression settings, the job failed with the error message shown below. I guess some other settings need to be set correctly along with the compression settings. I'm using Hadoop 0.20.1. Any thoughts? Thanks for your help!

mapred-site.xml
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
<property>
<name>mapred.map.output.compression.codec</name>
<value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>

Error message of the failed map task:

java.io.IOException: Spill failed
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:822)
at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:108)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:251)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:93)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.NullPointerException
at org.apache.hadoop.mapred.IFile$Writer.<init>(IFile.java:102)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1198)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:648)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1135)



Thanks,

Michael
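
Editor's sketch, not part of the original post: one quick sanity check for settings like the ones above is to read the two properties back and confirm that the configured codec class name resolves on the classpath the job will use. The class `CodecConfigCheck` and its helper methods are hypothetical names made up for this illustration, and the `<configuration>` wrapper is assumed because the post shows only the `<property>` entries. The sketch uses only the JDK, so when it is run without the Hadoop jars on the classpath the codec will simply be reported as not loadable.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of Hadoop or Pig.
public class CodecConfigCheck {

    /** Parses property entries (name/value pairs) from a Hadoop-style XML string into a map. */
    static Map<String, String> readProperties(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element p = (Element) nodes.item(i);
            String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
            String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
            props.put(name, value);
        }
        return props;
    }

    /** Returns true if the named class can be found on the current classpath (without initializing it). */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, CodecConfigCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // The two properties from the post, wrapped in an assumed <configuration> root element.
        String xml =
              "<configuration>"
            + "<property><name>mapred.compress.map.output</name><value>true</value></property>"
            + "<property><name>mapred.map.output.compression.codec</name>"
            + "<value>org.apache.hadoop.io.compress.GzipCodec</value></property>"
            + "</configuration>";

        Map<String, String> props = readProperties(xml);
        String codec = props.get("mapred.map.output.compression.codec");
        System.out.println("compress map output: " + props.get("mapred.compress.map.output"));
        System.out.println("codec class:         " + codec);
        System.out.println("codec on classpath:  " + isLoadable(codec));
    }
}
```

A class that resolves only shows the setting is spelled correctly. It says nothing about whether the codec can actually compress on a given platform, which is the question the replies below turn to.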


  • Amogh Vasekar at Feb 23, 2010 at 5:28 am
    Hi,
    Can you please let us know what platform your Hadoop machines are running on?
    For gzip and lzo to work, you need the supported Hadoop native libraries (I remember reading about this somewhere in the Hadoop wiki :) ).

    Amogh


  • Jiang licht at Feb 23, 2010 at 6:11 am
    Thanks, Amogh. The platform where I got this error is Mac OS X with Hadoop 0.20.1. All native libraries are installed except lzo (which reports that the codec is not found). But I didn't see this error when I ran the same thing without compression specified. In addition, I ran something with the same compression setting on Fedora 8 and Hadoop 0.19.1 without any problem. So I think it might depend on some other settings (related to what the spill does).

    Thanks,

    Michael

    --- On Mon, 2/22/10, Amogh Vasekar wrote:

    From: Amogh Vasekar <amogh@yahoo-inc.com>
    Subject: Re: java.io.IOException: Spill failed when using w/ GzipCodec for Map output
    To: "common-user@hadoop.apache.org" <common-user@hadoop.apache.org>
    Date: Monday, February 22, 2010, 11:27 PM

    Hi,
    Can you please let us know what platform you are running on your hadoop machines?
    For gzip and lzo to work, you need supported hadoop native libraries ( I remember reading on this somewhere in hadoop wiki :) )

    Amogh


  • Amogh Vasekar at Feb 23, 2010 at 7:46 am
    Hi,
    This may well not be the cause of the issue, but:
    "Hadoop native library is supported only on *nix platforms only. Unfortunately it is known not to work on Cygwin and Mac OS X and has mainly been used on the GNU/Linux platform."

    http://hadoop.apache.org/common/docs/current/native_libraries.html#Supported+Platforms

    The mapper log should shed more light on this.

    Amogh


  • Jiang licht at Feb 23, 2010 at 5:29 pm
    Thanks, Amogh. Good to know :)


    Michael

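
Editor's sketch, not part of the thread: the replies above point at platform support for the Hadoop native libraries. A rough way to see what a given JVM reports is to print the OS name and the native library search path, then attempt to load the library by name. The library name `hadoop` is an assumption here (to my understanding it is the name Hadoop's native loader looks for, but the thread does not confirm it), and `NativeLibCheck` with its methods is a hypothetical helper written only for this illustration.

```java
import java.util.Locale;

// Hypothetical diagnostic for illustration only; not part of Hadoop.
public class NativeLibCheck {

    /** Roughly classifies an os.name value into linux, mac, windows, or other. */
    static String classify(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.contains("linux")) return "linux";
        if (os.contains("mac") || os.contains("darwin")) return "mac";
        if (os.contains("windows")) return "windows";
        return "other";
    }

    /** Tries to load a native library by short name; returns false if the JVM cannot find or link it. */
    static boolean canLoad(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        System.out.println("os.name:           " + osName + " (" + classify(osName) + ")");
        System.out.println("java.library.path: " + System.getProperty("java.library.path"));
        // "hadoop" is an assumed library name; adjust it to whatever the installation ships.
        System.out.println("can load 'hadoop': " + canLoad("hadoop"));
    }
}
```

Running this with the same `java.library.path` the task JVMs use would show whether the library is visible at all. It does not establish that a missing library is what caused the NullPointerException in the trace above, since the thread never settles that.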

Discussion Overview
group: common-user @
category: hadoop
posted: Feb 23, '10 at 2:47a
active: Feb 23, '10 at 5:29p
posts: 5
users: 2
website: hadoop.apache.org...
irc: #hadoop
