FAQ
Inconsistent state in JVM manager
---------------------------------

Key: HADOOP-4967
URL: https://issues.apache.org/jira/browse/HADOOP-4967
Project: Hadoop Core
Issue Type: Bug
Components: mapred
Reporter: Amareshwari Sriramadasu


The following is the scenario:
-> TT got a LaunchTaskAction. And It found the slot free. It started the TaskRunner
-> TT received KillJobActtion. It did a runner.kill().
But, runner.kill() didn't remove JVM Runner from jvmIdToRunner , as it is not polulated yet.
It released the slots.
-> TaskRunner thread spawned jvm and populated jvmIdToRunner.
-> The next task getting launched on the slot finds it as busy.



--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Search Discussions

  • Amareshwari Sriramadasu (JIRA) at Dec 31, 2008 at 10:53 am
    [ https://issues.apache.org/jira/browse/HADOOP-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660063#action_12660063 ]

    Amareshwari Sriramadasu commented on HADOOP-4967:
    -------------------------------------------------

    The logs for the above scenario:

    2008-12-31 06:12:11,255 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_200812310557_0001_m_-700127208
    2008-12-31 06:12:11,266 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200812310557_0001_m_001155_0
    2008-12-31 06:12:11,286 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_200812310557_0001_m_-700127208 spawned.
    2008-12-31 06:12:11,305 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 1 and trying to launch attempt_200812310557_0001_m_001155_0
    2008-12-31 06:12:20,678 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_200812310557_0001
    2008-12-31 06:12:20,678 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0001_m_000317_0 done; removing files.
    2008-12-31 06:12:20,681 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200812310557_0001_m_000317_0 not found in cache
    2008-12-31 06:12:20,682 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 1
    2008-12-31 06:12:20,682 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0001_m_001040_0 done; removing files.
    2008-12-31 06:12:20,685 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200812310557_0001_m_001040_0 not found in cache
    2008-12-31 06:12:20,685 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2
    2008-12-31 06:12:20,686 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0001_m_001155_0 done; removing files.
    2008-12-31 06:12:20,688 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200812310557_0001_m_001155_0 not found in cache
    2008-12-31 06:12:20,688 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0001_m_000136_0 done; removing files.
    2008-12-31 06:12:20,691 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200812310557_0001_m_000136_0 not found in cache
    2008-12-31 06:12:20,692 WARN org.apache.hadoop.mapred.TaskTracker: Unknown child task finshed: attempt_200812310557_0001_m_001040_0. Ignored.
    2008-12-31 06:12:20,693 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_200812310557_0001_m_-1030395660
    2008-12-31 06:12:20,693 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_200812310557_0001_m_-1030395660 spawned.
    2008-12-31 06:12:20,694 WARN org.apache.hadoop.mapred.TaskTracker: Unknown child task finshed: attempt_200812310557_0001_m_001155_0. Ignored.
    2008-12-31 06:12:28,532 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_200812310557_0002_m_000525_0 task's state:UNASSIGNED
    2008-12-31 06:12:28,533 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_200812310557_0002_m_000562_0 task's state:UNASSIGNED
    2008-12-31 06:12:28,533 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200812310557_0002_m_000525_0
    2008-12-31 06:12:28,533 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_200812310557_0002_m_0
    00525_0
    2008-12-31 06:12:29,576 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_200812310557_0001_m_-700127208 exited. Number of tasks it ran: 0
    2008-12-31 06:12:29,790 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_200812310557_0001_m_540025796 exited. Number of tasks it ran: 0
    2008-12-31 06:12:48,271 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200812310557_0002_m_000562_0
    2008-12-31 06:12:48,272 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 1 and trying to launch attempt_200812310557_0002_m_000562_0
    2008-12-31 06:12:48,290 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_200812310557_0002_m_-612394729
    2008-12-31 06:12:48,292 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_200812310557_0002_m_-612394729 spawned.
    2008-12-31 06:12:52,223 WARN org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0002_m_000562_0 Child Error
    java.lang.RuntimeException: Inconsistent state!!! JVM Manager reached an unstable state while reaping a JVM for task: attempt_200812310557_0002_m_000562_0 Number of active JVMs:2
    JVMId jvm_200812310557_0001_m_-1030395660 #Tasks ran: 0 Currently busy? true Currently running: attempt_200812310557_0001_m_001155_0
    JVMId jvm_200812310557_0002_m_-612394729 #Tasks ran: 0 Currently busy? true Currently running: attempt_200812310557_0002_m_000525_0
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:253)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:114)
    at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:77)
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:411)
    2008-12-31 06:12:53,518 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_200812310557_0001_m_-1030395660 exited. Number of tasks it ran: 0

    Inconsistent state in JVM manager
    ---------------------------------

    Key: HADOOP-4967
    URL: https://issues.apache.org/jira/browse/HADOOP-4967
    Project: Hadoop Core
    Issue Type: Bug
    Components: mapred
    Reporter: Amareshwari Sriramadasu

    The following is the scenario:
    -> TT got a LaunchTaskAction. And It found the slot free. It started the TaskRunner
    -> TT received KillJobActtion. It did a runner.kill().
    But, runner.kill() didn't remove JVM Runner from jvmIdToRunner , as it is not polulated yet.
    It released the slots.
    -> TaskRunner thread spawned jvm and populated jvmIdToRunner.
    -> The next task getting launched on the slot finds it as busy.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Devaraj Das (JIRA) at Jan 12, 2009 at 6:35 am
    [ https://issues.apache.org/jira/browse/HADOOP-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Devaraj Das updated HADOOP-4967:
    --------------------------------

    Assignee: Devaraj Das

    Will upload a patch soon
    Inconsistent state in JVM manager
    ---------------------------------

    Key: HADOOP-4967
    URL: https://issues.apache.org/jira/browse/HADOOP-4967
    Project: Hadoop Core
    Issue Type: Bug
    Components: mapred
    Reporter: Amareshwari Sriramadasu
    Assignee: Devaraj Das

    The following is the scenario:
    -> TT got a LaunchTaskAction. And It found the slot free. It started the TaskRunner
    -> TT received KillJobActtion. It did a runner.kill().
    But, runner.kill() didn't remove JVM Runner from jvmIdToRunner , as it is not polulated yet.
    It released the slots.
    -> TaskRunner thread spawned jvm and populated jvmIdToRunner.
    -> The next task getting launched on the slot finds it as busy.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Devaraj Das (JIRA) at Jan 19, 2009 at 8:08 am
    [ https://issues.apache.org/jira/browse/HADOOP-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Devaraj Das updated HADOOP-4967:
    --------------------------------

    Attachment: 4967.patch

    Straightforward patch..
    Inconsistent state in JVM manager
    ---------------------------------

    Key: HADOOP-4967
    URL: https://issues.apache.org/jira/browse/HADOOP-4967
    Project: Hadoop Core
    Issue Type: Bug
    Components: mapred
    Reporter: Amareshwari Sriramadasu
    Assignee: Devaraj Das
    Attachments: 4967.patch


    The following is the scenario:
    -> TT got a LaunchTaskAction. And It found the slot free. It started the TaskRunner
    -> TT received KillJobActtion. It did a runner.kill().
    But, runner.kill() didn't remove JVM Runner from jvmIdToRunner , as it is not polulated yet.
    It released the slots.
    -> TaskRunner thread spawned jvm and populated jvmIdToRunner.
    -> The next task getting launched on the slot finds it as busy.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Amareshwari Sriramadasu (JIRA) at Jan 19, 2009 at 9:00 am
    [ https://issues.apache.org/jira/browse/HADOOP-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665066#action_12665066 ]

    Amareshwari Sriramadasu commented on HADOOP-4967:
    -------------------------------------------------

    Patch looks fine.

    Inconsistent state in JVM manager
    ---------------------------------

    Key: HADOOP-4967
    URL: https://issues.apache.org/jira/browse/HADOOP-4967
    Project: Hadoop Core
    Issue Type: Bug
    Components: mapred
    Reporter: Amareshwari Sriramadasu
    Assignee: Devaraj Das
    Attachments: 4967.patch


    The following is the scenario:
    -> TT got a LaunchTaskAction. And It found the slot free. It started the TaskRunner
    -> TT received KillJobActtion. It did a runner.kill().
    But, runner.kill() didn't remove JVM Runner from jvmIdToRunner , as it is not polulated yet.
    It released the slots.
    -> TaskRunner thread spawned jvm and populated jvmIdToRunner.
    -> The next task getting launched on the slot finds it as busy.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Devaraj Das (JIRA) at Jan 19, 2009 at 6:12 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Devaraj Das resolved HADOOP-4967.
    ---------------------------------

    Resolution: Fixed
    Fix Version/s: 0.21.0
    0.20.0
    0.19.1
    Hadoop Flags: [Reviewed]

    I just committed this.
    Inconsistent state in JVM manager
    ---------------------------------

    Key: HADOOP-4967
    URL: https://issues.apache.org/jira/browse/HADOOP-4967
    Project: Hadoop Core
    Issue Type: Bug
    Components: mapred
    Reporter: Amareshwari Sriramadasu
    Assignee: Devaraj Das
    Fix For: 0.19.1, 0.20.0, 0.21.0

    Attachments: 4967.patch


    The following is the scenario:
    -> TT got a LaunchTaskAction. And It found the slot free. It started the TaskRunner
    -> TT received KillJobActtion. It did a runner.kill().
    But, runner.kill() didn't remove JVM Runner from jvmIdToRunner , as it is not polulated yet.
    It released the slots.
    -> TaskRunner thread spawned jvm and populated jvmIdToRunner.
    -> The next task getting launched on the slot finds it as busy.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupcommon-dev @
categorieshadoop
postedDec 31, '08 at 10:49a
activeJan 19, '09 at 6:12p
posts6
users1
websitehadoop.apache.org...
irc#hadoop

1 user in discussion

Devaraj Das (JIRA): 6 posts

People

Translate

site design / logo © 2022 Grokbase