I understand the basic distinction between a counter and a status, even though both kinds of messages are sent to the reporter. I'm particularly fond of status messages because they let me watch the live behavior of many mappers (or reducers) at once in the job tracker. What I'm unclear about is which of these concepts (if either) is responsible for notifying the task tracker that the task is still alive and should not be killed after (ten, I believe) minutes of inactivity. If I only send, say, status messages, will the task still be killed after ten minutes? If it is in fact a counter that keeps the task alive (this has been my understanding and assumption so far), does it matter *which* counter I increment, or does any reporter:counter:a,b,c message keep the task tracker from killing the task?
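For concreteness, here is a minimal sketch of the two message kinds as a streaming task might emit them. It assumes the Hadoop Streaming convention of writing `reporter:counter:<group>,<counter>,<amount>` and `reporter:status:<message>` lines to stderr. The helper names and the group/counter names are made up for illustration, and Python is used only for brevity; the same lines can be written from any language.

```python
import sys


def counter_line(group, counter, amount):
    """Format a counter update: reporter:counter:<group>,<counter>,<amount>."""
    return "reporter:counter:{},{},{}".format(group, counter, amount)


def status_line(message):
    """Format a status update: reporter:status:<message>."""
    return "reporter:status:{}".format(message)


def report(line, stream=sys.stderr):
    """Write one reporter line and flush so it does not sit in a buffer."""
    stream.write(line + "\n")
    stream.flush()


if __name__ == "__main__":
    # Hypothetical group and counter names, for illustration only.
    report(counter_line("MyGroup", "RecordsSeen", 1))
    report(status_line("processing record 1"))
```

Flushing after every line is a deliberate choice in this sketch: a message that stays in a buffer cannot reach the framework, whichever kind it is.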

As per my other post this morning, I am having serious problems keeping my tasks from being killed after ten minutes, even though I am spawning a separate thread which does nothing except sleep for a minute and then report counter and status messages, forever. Since the same tasks eventually succeed on the second or third try, I know the code *basically* works; otherwise, I don't think any of the tasks would ever succeed. Thus my vexation on this issue.
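The kind of reporter thread described above could be sketched roughly as follows. This is an illustration under assumptions, not the actual code from my job: `start_heartbeat` and its parameters are hypothetical names, the message text is arbitrary, and the first message goes out only after one full interval has elapsed.

```python
import sys
import threading


def start_heartbeat(interval_seconds=60.0, stream=sys.stderr):
    """Start a daemon thread that writes a status line once per interval.

    Returns (stop_event, thread); set the event to stop the thread.
    """
    stop = threading.Event()

    def run():
        beats = 0
        # Event.wait returns False on timeout and True once stop is set,
        # so the loop sleeps for one interval and exits promptly on stop.
        while not stop.wait(interval_seconds):
            beats += 1
            stream.write("reporter:status:alive, heartbeat {}\n".format(beats))
            stream.flush()

    thread = threading.Thread(target=run, name="heartbeat", daemon=True)
    thread.start()
    return stop, thread
```

A task would call `start_heartbeat()` before its long-running work and set the returned event when it finishes. Note that a thread like this can only help once the task process is actually up and running; it says nothing about tasks that fail to start.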


Keith Wiley kwiley@keithwiley.com keithwiley.com music.keithwiley.com

"Luminous beings are we, not this crude matter."
-- Yoda

  • Keith Wiley at Mar 23, 2011 at 6:58 pm
    I suppose this bit of documentation answers part of my question:

    'The number of milliseconds before a task will be terminated if it neither reads an input, writes an output, nor updates its status string.'

    So, going by that description, it's the status message (along with reading input or writing output), not any counter, that keeps the task alive. My original problem stands, though: the tasks are not actually starting properly for some reason. I'm definitely sending status messages on my separate reporter thread once a minute. When a task actually runs, I can verify this in the task logs, because I duplicate the message to cerr for logging and debugging, and of course I see the status updates on the jobtracker.


Discussion Overview
group: common-user @
posted: Mar 23, '11 at 6:58p
active: Mar 23, '11 at 7:06p
