Dec 3, 2008 at 5:25 pm
The configuration property mapred.max.map.failures.percent sets the percentage of map tasks that may fail before the job itself is marked as failed.
It defaults to zero, so set it to the percentage you can tolerate. For example, with mapred.max.map.failures.percent=10 and 100 map tasks, up to 10 map tasks can fail without failing the job.
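
To make the arithmetic concrete, here is a minimal Java sketch of the tolerance check. It is not Hadoop's own code: the class FailureBudget and its methods are hypothetical names, and it assumes the percentage is an integer and the threshold is inclusive, as the 10-out-of-100 example above suggests. The second case assumes one map task per 64 MB block of a 5 TB input, which gives 81,920 map tasks.

```java
// Hypothetical helper, not part of Hadoop: models how a failure
// percentage such as mapred.max.map.failures.percent translates
// into a number of map tasks that may fail.
final class FailureBudget {

    // Assumption: the job survives while failed/total <= percent/100.
    // Integer math in long avoids rounding and overflow issues.
    static boolean jobSurvives(long failedMaps, long totalMaps, int maxFailuresPercent) {
        return failedMaps * 100 <= (long) maxFailuresPercent * totalMaps;
    }

    // Largest number of failed map tasks the job can absorb.
    static long maxTolerableFailures(long totalMaps, int maxFailuresPercent) {
        return (long) maxFailuresPercent * totalMaps / 100;
    }

    public static void main(String[] args) {
        // The example from the reply: 100 map tasks, 10 percent.
        System.out.println(maxTolerableFailures(100, 10));

        // Assumed layout for the question: 5 TB in 64 MB blocks,
        // one map task per block.
        long totalMaps = 5L * 1024 * 1024 / 64;
        System.out.println(totalMaps);
        System.out.println(maxTolerableFailures(totalMaps, 1));
    }
}
```

Under these assumptions a single bad block is far below even a 1 percent setting, so the smallest non-zero value would already let the job continue to the reduce step.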
----- Original Message ----
From: "Zhou, Yunqing" <email@example.com>
Sent: Wednesday, December 3, 2008 5:49:57 AM
Subject: Can I ignore some errors in map step?
I'm running a job on 5 TB of data, but it currently reports a block
with a checksum error in the file. That causes a map task to fail,
and then the whole job fails.
But the loss of one 64 MB block will hardly affect the final result.
So can I ignore some map task failures and continue with the reduce step?
I'm using hadoop-0.18.2 with a replication factor of 1.