Stopping two reducer tasks on two machines from working on the same keys?
I have a Hadoop 0.16.4 cluster that effectively has no HDFS. It's
running a job analyzing data stored on a NAS-type system mounted on
each tasktracker.

Unfortunately, the reducers task_200808062237_0031_r_000000_0 and
task_200808062237_0031_r_000000_1 are running simultaneously on the
same keys, redoing each other's work and stomping on each other's
output files.

I've attached the JSP output that indicates this; let me know if you
need any other details.

Is this a configuration error, or is it a bug in Hadoop?

Cheers,
Anthony


  • Lohit at Aug 12, 2008 at 4:57 am
    > redoing each other's work and stomping on each others output files.
    I am assuming your tasks (reducers) generate these files themselves, and that they are not the normal job output files like part-00000.

    It looks like you have speculative execution turned on.
    Hadoop launches parallel attempts of a map/reduce task when it finds one of them falling behind; each attempt is suffixed with an attempt number, which is the _0 and _1 you are seeing.
    If your tasks write to common files, you hit this problem.
    There are two ways out of this:
    1. turn off speculative execution by setting mapred.speculative.execution to false
    2. if you are generating your own output files, use the task attempt ID to make each file name unique.
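The first option can be set in the job's configuration file; a minimal sketch, assuming the 0.16.x property name mapred.speculative.execution:

```xml
<!-- hadoop-site.xml (or the job's JobConf): disable speculative
     execution so only one attempt of each task is ever launched.
     Property name as used in Hadoop 0.16.x. -->
<property>
  <name>mapred.speculative.execution</name>
  <value>false</value>
</property>
```

Note this trades away the straggler protection speculative execution provides: a slow node will now stall the whole job.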
    > I've attached the JSP output that indicates this; let me know if you
    > need any other details.
    No attachment.

    Thanks,
    Lohit



    ----- Original Message ----
    From: Anthony Urso <anthony.urso@gmail.com>
    To: core-user@hadoop.apache.org
    Sent: Monday, August 11, 2008 7:03:45 PM
    Subject: Stopping two reducer tasks on two machines from working on the same keys?

  • Anthony Urso at Aug 12, 2008 at 5:37 am
    That's got to be it.
    On Mon, Aug 11, 2008 at 9:55 PM, lohit wrote:
    > No attachment.
    I guess the listserv must have eaten it, as the one in my sent folder
    has it. It looks like this:

    Task Attempt: task_200808062237_0031_r_000000_0
      Machine: snark-0002.liveoffice.com
      Status: RUNNING (88.01%)
      Start Time: 11-Aug-2008 14:11:13
      Shuffle Finished: 11-Aug-2008 16:21:00 (2hrs, 9mins, 47sec)
      Sort Finished: 11-Aug-2008 16:21:00 (0sec)

    Task Attempt: task_200808062237_0031_r_000000_1
      Machine: snark-0005.liveoffice.com
      Status: RUNNING (88.01%)
      Start Time: 11-Aug-2008 16:21:03
      Shuffle Finished: 11-Aug-2008 16:21:04 (0sec)
      Sort Finished: 11-Aug-2008 16:21:04 (0sec)
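Lohit's second remedy (per-attempt output names) can be sketched as follows. The class and method names here are hypothetical, and the attempt IDs are hard-coded for illustration; in a live job the attempt ID would come from the job configuration (e.g. the mapred.task.id property) rather than a literal string.

```java
// Sketch of per-attempt output naming, so speculative attempts
// (_0, _1, ...) of the same reduce task never overwrite each other
// on a shared filesystem such as an NFS-mounted NAS.
// UniqueOutput and outputFileFor are hypothetical names.
public class UniqueOutput {
    // Build a file name that embeds the full task attempt id.
    static String outputFileFor(String dir, String attemptId) {
        return dir + "/part-" + attemptId;
    }

    public static void main(String[] args) {
        String a0 = "task_200808062237_0031_r_000000_0";
        String a1 = "task_200808062237_0031_r_000000_1";
        // The two attempts now write to distinct files.
        System.out.println(outputFileFor("/mnt/nas/out", a0));
        System.out.println(outputFileFor("/mnt/nas/out", a1));
    }
}
```

Once the job finishes, only the output of the winning attempt needs to be kept; the loser's file can be cleaned up afterwards.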





Discussion Overview
group: common-user
categories: hadoop
posted: Aug 12, '08 at 2:04a
active: Aug 12, '08 at 5:37a
posts: 3
users: 2 (Anthony Urso: 2 posts, Lohit: 1 post)
website: hadoop.apache.org...
irc: #hadoop
