Tasktracker volume failure...
Hi,



I ran into a problem: when a volume configured in mapred.local.dir
fails, the tasktracker keeps trying to create directories on it
(in checkLocalDirs()) and keeps failing. Even the main method throws an
exception periodically because of the getFreeSpace() call on the failed
volume. Eventually all the running jobs fail and new jobs cannot be executed.



Is tasktracker volume failure handled in a way similar to the datanode
disk failure handling in HDFS-457?



Thanks,

Gokul








  • Steve Loughran at Oct 26, 2010 at 10:23 am

    On 26/10/10 04:10, Gokulakannan M wrote:

    Hi,

    I ran into a problem: when a volume configured in *mapred.local.dir*
    fails, the tasktracker keeps trying to create directories on it
    (in checkLocalDirs()) and keeps failing. Even the main method throws an
    exception periodically because of the getFreeSpace() call on the failed
    volume. Eventually all the running jobs fail and new jobs cannot be
    executed.

    I think you can provide a list of local dirs, in which case the TT would
    only fail if no local volume has enough free space.

  • Gokulakannan M at Oct 26, 2010 at 12:36 pm
    Yes, that is my scenario.

    I have one tasktracker with 10 dirs configured in mapred.local.dir. Each
    of them is a separately mounted volume, on its own physical disk. If one
    of the volumes fails (in my case, one physical hard disk was removed
    manually), the tasktracker does not execute any further tasks.

    I remember that the datanode handles a similar scenario: when one of the
    volumes fails, it marks that volume as bad and carries on (ref:
    HDFS-457).

    Is a similar fault-tolerance feature available for the tasktracker? Only
    one of the "n" dirs has a problem, yet the TT keeps retrying the failed
    volume and does not execute any tasks.


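The behaviour asked about in this thread (skip a failed volume, keep working with the remaining ones, and give up only when none is usable) can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the TaskTracker's or the datanode's actual code: the class `LocalDirSelector`, its methods, and the `minFreeBytes` threshold are hypothetical names, and the free-space check uses the standard `java.io.File.getUsableSpace()` rather than any Hadoop call.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of Hadoop): tracks which local dirs are still usable. */
public class LocalDirSelector {
    private final List<File> configured = new ArrayList<>();
    private final Set<File> bad = new LinkedHashSet<>();
    private final long minFreeBytes;

    /** commaSeparatedDirs has the same shape as a mapred.local.dir value, e.g. "/d1/local,/d2/local". */
    public LocalDirSelector(String commaSeparatedDirs, long minFreeBytes) {
        for (String d : commaSeparatedDirs.split(",")) {
            if (!d.trim().isEmpty()) {
                configured.add(new File(d.trim()));
            }
        }
        this.minFreeBytes = minFreeBytes;
    }

    /** Healthy means: exists or can be created, is writable, and has enough usable space. */
    private boolean isHealthy(File dir) {
        try {
            boolean present = dir.isDirectory() || dir.mkdirs();
            return present && dir.canWrite() && dir.getUsableSpace() >= minFreeBytes;
        } catch (SecurityException e) {
            return false;
        }
    }

    /** Marks failing dirs as bad and returns the rest; throws only if nothing is usable. */
    public List<File> usableDirs() {
        List<File> good = new ArrayList<>();
        for (File dir : configured) {
            if (bad.contains(dir)) {
                continue; // a volume already marked bad is not retried
            }
            if (isHealthy(dir)) {
                good.add(dir);
            } else {
                bad.add(dir);
            }
        }
        if (good.isEmpty()) {
            throw new IllegalStateException("no usable local directory left");
        }
        return good;
    }

    /** Dirs that failed a health check at some point. */
    public Set<File> badDirs() {
        return bad;
    }
}
```

A caller would construct the selector once from the configured list and call `usableDirs()` before placing new task data. In this sketch a volume stays marked bad until the process restarts; whether a repaired volume should be re-admitted automatically is a design choice the thread does not settle.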