FAQ
I recently upgraded from Hadoop 0.14.1 to 0.16.1. Previously in 0.14.1,
if a map or reduce task threw a runtime exception such as an NPE, the
task, and ultimately the job, would fail in short order. I was running
a job on my local 0.16.1 cluster today, and when the reduce tasks
started throwing NPEs, the tasks just hung. Eventually they timed out
and were killed, but is this expected behavior in 0.16.1? I'd prefer the
job to fail quickly if NPEs are being thrown.

Matt

--
Matt Kent
Co-Founder
Persai
1221 40th St #113
Emeryville, CA 94608
matt@persai.com


  • Chris Dyer at Mar 18, 2008 at 2:11 am
    I've noticed this behavior as well in 0.16.0, with RuntimeExceptions in general.

    Chris
  • Konstantin Shvachko at Mar 18, 2008 at 3:18 am
    Usually a build takes 2 hours or less.
    This one is stuck and I don't see changes in the QUEUE OF PENDING PATCHES when I submit a patch.
    I guess something is wrong with Hudson.
    Could anybody please check?
    --Konstantin
  • Nigel Daley at Mar 18, 2008 at 4:29 am
    org.apache.hadoop.streaming.TestGzipInput was stuck. I killed it.

    Nige
  • Owen O'Malley at Mar 18, 2008 at 4:25 am

    This sounds like a bug. Tasks should certainly fail immediately if an
    exception is thrown. Do you know where the exception is being thrown?
    Can you get a stack trace of the task from jstack after the exception
    and before the task times out?
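
    For readers unfamiliar with jstack: it is a JDK command-line tool, run as
    `jstack <pid>` against the hung task's JVM. Where attaching a tool is not
    convenient, roughly the same information can be collected from inside a
    JVM. The following is only a sketch of that idea using the standard
    Thread.getAllStackTraces() call; the class and method names are
    hypothetical and nothing here is part of Hadoop.

    ```java
    import java.util.Map;

    public class ThreadDump {
        /** Formats the stack of every live thread in this JVM, similar in spirit to jstack output. */
        static String dumpAllThreads() {
            StringBuilder sb = new StringBuilder();
            for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
                Thread t = e.getKey();
                // One header line per thread: name and current state.
                sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
                for (StackTraceElement frame : e.getValue()) {
                    sb.append("    at ").append(frame).append('\n');
                }
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            System.out.print(dumpAllThreads());
        }
    }
    ```

    In a hung task, the threads of interest would be whichever ones are
    still blocked or waiting after the exception has been reported.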

    Thanks,
    Owen
  • Matt Kent at Mar 18, 2008 at 5:01 am
    It seems to happen only with reduce tasks, not map tasks. I reproduced
    it by having a dummy reduce task throw an NPE immediately. The error is
    shown on the reduce details page but the job does not register the task
    as failed. I've attached the task tracker stack trace, the child stack
    trace and a screenshot of the task list page.
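
    A reproduction along these lines can be sketched without a cluster. The
    code below is an illustration only: ReduceFn, TaskState and runTask are
    hypothetical stand-ins, not Hadoop's Reducer interface or task runner.
    It shows a dummy reduce function that throws an NPE immediately, and the
    fail-fast handling one would expect, where the task is marked failed as
    soon as the exception surfaces rather than after a timeout.

    ```java
    import java.util.Arrays;
    import java.util.Iterator;
    import java.util.List;

    public class FailFastSketch {
        /** Hypothetical stand-in for a reduce function; not Hadoop's Reducer interface. */
        interface ReduceFn {
            void reduce(String key, Iterator<String> values);
        }

        enum TaskState { SUCCEEDED, FAILED }

        /** Dummy reducer that throws an NPE as soon as it is called. */
        static final ReduceFn THROWING_REDUCER = (key, values) -> {
            throw new NullPointerException("dummy NPE from reduce");
        };

        /** Runs the reduce function and reports FAILED immediately on any Throwable. */
        static TaskState runTask(ReduceFn fn, String key, List<String> values) {
            try {
                fn.reduce(key, values.iterator());
                return TaskState.SUCCEEDED;
            } catch (Throwable t) {
                // Fail fast: surface the error now instead of waiting for a timeout.
                System.err.println("task failed: " + t);
                return TaskState.FAILED;
            }
        }

        public static void main(String[] args) {
            System.out.println(runTask(THROWING_REDUCER, "k", Arrays.asList("v1", "v2")));
        }
    }
    ```

    The behavior reported in this thread differs from the sketch in that the
    error is displayed but the task is not registered as failed until it
    times out.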

    Matt


Discussion Overview
group: common-user @
categories: hadoop
posted: Mar 17, '08 at 10:14p
active: Mar 18, '08 at 5:01a
posts: 6
users: 5
website: hadoop.apache.org...
irc: #hadoop
