After some jobs have finished, Reducer will run new job's reduce tasks sequentially and not in parallel (mapred.JobTracker: Serious problem. While updating status, cannot find taskid...)
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Key: HADOOP-5367
URL: https://issues.apache.org/jira/browse/HADOOP-5367
Project: Hadoop Core
Issue Type: Bug
Affects Versions: 0.19.1
Environment: State: RUNNING
Started: Fri Feb 27 17:00:07 CET 2009
Version: 0.19.1, r745977
Compiled: Fri Feb 20 00:16:34 UTC 2009 by ndaley

Reporter: Thibaut
Priority: Critical


Hi,

After a while, my cluster will only run the reduce tasks sequentially (each reducer running on the same node); the other nodes stay idle. The map phase, however, still runs on all the nodes. This happens in my cluster after about 160 successfully completed jobs. (Some jobs have the number of reducers set to 0!)
As a workaround, I have to restart the MapReduce service.

I didn't notice this behaviour in version 0.19.0. I can't use version 0.19.0 because of the multipleoutput bug when setting reducers to 0.

Another side note which might be related: I also tried running the jobs with speculative execution turned on. My cluster would always hold back one reducer and only run it (in multiple instances) after the first of the other 6 reducers had finished, instead of launching all of them at the same time.


Below is a short extract from the related log file. It is full of entries like these.

09/02/28 12:48:07 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0051_r_000006_1
09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000002_1
09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0083_r_000006_1
09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000005_1
09/02/28 12:48:10 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0105_r_000006_1
09/02/28 12:48:10 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0102_r_000006_1
09/02/28 12:48:12 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0051_r_000006_1
09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000002_1
09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0083_r_000006_1
09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000005_1
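
To see which jobs the "cannot find taskid" messages belong to, the log can be tallied per job. The following is only a minimal sketch, not part of the original report: the function name `tally_missing_attempts` is hypothetical, and it assumes the usual attempt ID layout `attempt_<JobTracker start id>_<job number>_<m|r>_<task number>_<attempt number>`. The sample lines are copied from the extract above.

```python
import re
from collections import Counter

# Assumed ID layout:
# attempt_<JobTracker start id>_<job number>_<m|r>_<task number>_<attempt number>
ATTEMPT_RE = re.compile(
    r"cannot find taskid attempt_(\d+)_(\d+)_([mr])_(\d+)_(\d+)"
)


def tally_missing_attempts(lines):
    """Count 'cannot find taskid' messages per (job number, task type)."""
    counts = Counter()
    for line in lines:
        match = ATTEMPT_RE.search(line)
        if match:
            _start_id, job, kind, _task, _attempt = match.groups()
            counts[(job, kind)] += 1
    return counts


# Lines taken from the log extract in the report.
SAMPLE_LOG = [
    "09/02/28 12:48:07 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0051_r_000006_1",
    "09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000002_1",
    "09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0083_r_000006_1",
    "09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000005_1",
    "09/02/28 12:48:10 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0105_r_000006_1",
    "09/02/28 12:48:10 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0102_r_000006_1",
]

counts = tally_missing_attempts(SAMPLE_LOG)
for (job, kind), n in sorted(counts.items()):
    task_type = "reduce" if kind == "r" else "map"
    print(f"job {job} ({task_type}): {n} message(s)")
```

In a real check, the list would be replaced by the lines of the JobTracker log file. In the extract above, every affected ID is a reduce attempt with attempt number 1, i.e. not the first attempt of its task.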


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Thibaut (JIRA) at Feb 28, 2009 at 1:12 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Thibaut updated HADOOP-5367:
    ----------------------------

    Component/s: mapred
  • Thibaut (JIRA) at Feb 28, 2009 at 1:14 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Thibaut updated HADOOP-5367:
    ----------------------------

    Description:
    Hi,

    After a while, my cluster will only run the reduce tasks sequentially (each reducer running on the same node); the other nodes stay idle. The map phase, however, still runs on all the nodes, even after such a "long" reduce phase has completed, but the next reduce phase is then again executed sequentially. This happens in my cluster after about 160 successfully completed jobs. (Some jobs have the number of reducers set to 0!)
    As a workaround, I have to restart the MapReduce service.

    I didn't notice this behaviour in version 0.19.0. I can't use version 0.19.0 because of the multipleoutput bug when setting reducers to 0.

    Another side note which might be related: I also tried running the jobs with speculative execution turned on. My cluster would always hold back one reducer and only run it (in multiple instances) after the first of the other 6 reducers had finished, instead of launching all of them at the same time.


    Below is a short extract from the related log file. It is full of entries like these.

    09/02/28 12:48:07 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0051_r_000006_1
    09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000002_1
    09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0083_r_000006_1
    09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000005_1
    09/02/28 12:48:10 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0105_r_000006_1
    09/02/28 12:48:10 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0102_r_000006_1
    09/02/28 12:48:12 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0051_r_000006_1
    09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000002_1
    09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0083_r_000006_1
    09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000005_1


  • Amareshwari Sriramadasu (JIRA) at Mar 2, 2009 at 3:57 am
    [ https://issues.apache.org/jira/browse/HADOOP-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677883#action_12677883 ]

    Amareshwari Sriramadasu commented on HADOOP-5367:
    -------------------------------------------------

    This is most likely caused by HADOOP-5269 and HADOOP-5235. They are committed to branch 0.19.2. Can you please try out 0.19.2 by checking out branch 0.19?
  • schubert zhang (JIRA) at Mar 17, 2009 at 3:29 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12682698#action_12682698 ]

    schubert zhang commented on HADOOP-5367:
    ----------------------------------------

    I have also run into this issue.

    After MapReduce has been running for a long time (about 200 jobs completed), a MapReduce job hangs forever.
    (1) The JobTracker keeps logging:
    2009-03-17 16:29:39,997 INFO org.apache.hadoop.mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200903171247_0387_m_000015_1

    (2) The job cannot complete and stays at 79% forever.

    (3) All TaskTrackers may hang at the same time, since the logs of each TaskTracker stop at that time.


    Before the hang, I can also find scattered entries of this kind in the JobTracker log, such as:
    2009-03-17 16:29:21,767 INFO org.apache.hadoop.mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200903171247_0387_m_000015_1


    Another observation:
    One time, I found that the task slots could not be filled to capacity; maybe some slots were also hung.
  • schubert zhang (JIRA) at Mar 27, 2009 at 5:12 am
    [ https://issues.apache.org/jira/browse/HADOOP-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689809#action_12689809 ]

    schubert zhang commented on HADOOP-5367:
    ----------------------------------------

    I am using branch-0.19, and it seems fine.
  • Amareshwari Sriramadasu (JIRA) at Mar 30, 2009 at 6:34 am
    [ https://issues.apache.org/jira/browse/HADOOP-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sriramadasu resolved HADOOP-5367.
    ---------------------------------------------

    Resolution: Fixed

    This was fixed in 0.19.2.
