Speculative execution used when property set to false
-----------------------------------------------------

Key: HADOOP-1542
URL: https://issues.apache.org/jira/browse/HADOOP-1542
Project: Hadoop
Issue Type: Bug
Components: mapred
Reporter: Nigel Daley


Speculative execution is now on by default. When running TestDFSIO, I set speculative execution off in my mapred-default.xml since this test has maps that create files in DFS (side-effects).

However, it seems that speculative tasks get started even though I have set speculation off. I'll attach the NN and JT logs.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Nigel Daley (JIRA) at Jun 28, 2007 at 5:02 am
    [ https://issues.apache.org/jira/browse/HADOOP-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Nigel Daley updated HADOOP-1542:
    --------------------------------

    Attachment: jobtracker.log

    Here's the jobtracker log. Follow the life cycle of task_0005_m_000004. This is a TestDFSIO map task that should be writing data.

    Note that task_0005_m_000004_1 is created right after task_0005_m_000004_0 even though speculative execution should be off. task_0005_m_000004_0 seems to complete fine (task_0005_m_000004_1 fails with AlreadyBeingCreatedException -- see namenode.log), but the file that task_0005_m_000004_0 creates (/benchmarks/TestDFSIO/io_data/test_io_12) seems to get lost.
  • Nigel Daley (JIRA) at Jun 28, 2007 at 5:04 am
    [ https://issues.apache.org/jira/browse/HADOOP-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Nigel Daley updated HADOOP-1542:
    --------------------------------

    Attachment: namenode.log

    Attaching namenode.log.
  • Nigel Daley (JIRA) at Jun 28, 2007 at 5:22 am
    [ https://issues.apache.org/jira/browse/HADOOP-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508733 ]

    Nigel Daley commented on HADOOP-1542:
    -------------------------------------

    Is it possible that setting speculation off in mapred-default.xml on my job submission host has no effect and that I really need to set it off in hadoop-site.xml?

    If that is the case, then speculation would be on -- which explains perfectly the immediate execution of task_0005_m_000004_1. In which case this should be filed as a dfs bug since the file created by the first map (task_0005_m_000004_0) is getting lost.
  • Devaraj Das (JIRA) at Jun 28, 2007 at 6:13 am
    [ https://issues.apache.org/jira/browse/HADOOP-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508739 ]

    Devaraj Das commented on HADOOP-1542:
    -------------------------------------

    It looks like the JobTracker is ignoring the mapred-default.xml config items, and hence the speculative execution setting in mapred-default.xml is not reflected there. However, on the tasktrackers, mapred-default.xml does override the config in hadoop-site/hadoop-default.xml, so they see speculative execution switched off.
    This could be the cause of the DFS file-lost problem. Here's the theory (not yet validated against the source code): when the maps try to create files on DFS, they try to create the "final" files (as opposed to the speculative case, where the output path for the files would point to task-specific directories). Hence the speculative instance of a map gets the AlreadyBeingCreatedException. Finally the JT, which thinks that speculative execution is turned on, tries to rename the empty file path to its final destination, and that overwrites the real file that the task originally created.
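The configuration mismatch suspected in the comment above can be sketched in a few lines of Java. This is only a toy model, not Hadoop's Configuration class: the LayeredConf type, the "later resource wins" rule, the property name mapred.speculative.execution, and the choice of which process loads which resource are all assumptions made for illustration. It shows how two daemons can disagree about one flag when only one of them loads the file that carries the override.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LayeredConfSketch {
    // Assumed property name, used here only as an illustrative key.
    static final String KEY = "mapred.speculative.execution";

    // Hypothetical stand-in for a configuration built from ordered resources.
    // Assumption of this sketch: a resource added later overrides earlier ones.
    static class LayeredConf {
        private final List<Map<String, String>> resources = new ArrayList<>();

        void addResource(Map<String, String> resource) {
            resources.add(resource);
        }

        boolean getBoolean(String key, boolean fallback) {
            String value = null;
            for (Map<String, String> resource : resources) {
                if (resource.containsKey(key)) {
                    value = resource.get(key); // later resources win
                }
            }
            return value == null ? fallback : Boolean.parseBoolean(value);
        }
    }

    // Builds a single-entry resource, standing in for one XML config file.
    static Map<String, String> resource(String key, String value) {
        Map<String, String> r = new HashMap<>();
        r.put(key, value);
        return r;
    }

    // A process that loads only the shipped defaults (speculation on).
    static boolean jobTrackerView() {
        LayeredConf conf = new LayeredConf();
        conf.addResource(resource(KEY, "true"));
        return conf.getBoolean(KEY, true);
    }

    // A process that also loads the user's override (speculation off).
    static boolean taskTrackerView() {
        LayeredConf conf = new LayeredConf();
        conf.addResource(resource(KEY, "true"));
        conf.addResource(resource(KEY, "false"));
        return conf.getBoolean(KEY, true);
    }

    public static void main(String[] args) {
        System.out.println("JobTracker sees speculation on:  " + jobTrackerView());
        System.out.println("TaskTracker sees speculation on: " + taskTrackerView());
    }
}
```

In this model the two views differ only because the override file is missing from one process's resource list, which is the situation the comment hypothesizes.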
  • Owen O'Malley (JIRA) at Jun 28, 2007 at 9:11 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Owen O'Malley updated HADOOP-1542:
    ----------------------------------

    Assignee: Devaraj Das
    Priority: Blocker (was: Major)

    This seems like a blocker until we understand what is happening.
  • Owen O'Malley (JIRA) at Jun 28, 2007 at 10:48 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Owen O'Malley updated HADOOP-1542:
    ----------------------------------

    Fix Version/s: 0.14.0
    Assignee: Owen O'Malley (was: Devaraj Das)
    Description:
    The change in HADOOP-1440 broke map/reduce by breaking the assumption that Task.getPartition() corresponded to the JobInProgress.map[] order.

    Currently JobInProgress.findNewTask uses Task.getPartition as the index of the map to run. This can be a completely different tip, which will cause incorrect tasks to be run, including duplicates of tasks that are already running.

    was:
    Speculative execution is now on by default. When running TestDFSIO, I set speculative execution off in my mapred-default.xml since this test has maps that create files in DFS (side-effects).

    However, it seems that speculative tasks get started even though I have set speculation off. I'll attach the NN and JT logs.

    Affects Version/s: 0.14.0
    Summary: Incorrect task/tip being scheduled (looks like speculative execution) (was: Speculative execution used when property set to false)
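
The revised description says a partition number was being used as an index into an array whose order no longer matched it. Below is a minimal Java sketch of that class of bug. The Tip class, the two lookup helpers, and the reordering by input length are hypothetical; the thread does not say what HADOOP-1440 changed, and this is not the actual JobInProgress code.

```java
import java.util.Arrays;
import java.util.Comparator;

public class PartitionIndexSketch {
    // Hypothetical stand-in for a task-in-progress (tip).
    static class Tip {
        final int partition;     // partition number the tip was created for
        final long inputLength;  // illustrative sort key only

        Tip(int partition, long inputLength) {
            this.partition = partition;
            this.inputLength = inputLength;
        }
    }

    // Buggy lookup: assumes array position equals partition number.
    static Tip findByIndex(Tip[] maps, int partition) {
        return maps[partition];
    }

    // Safer lookup: finds the tip whose own partition matches.
    static Tip findByPartition(Tip[] maps, int partition) {
        for (Tip tip : maps) {
            if (tip.partition == partition) {
                return tip;
            }
        }
        return null;
    }

    // Builds three tips, then reorders the array so that the
    // "index == partition" assumption no longer holds.
    static Tip[] buildReordered() {
        Tip[] maps = { new Tip(0, 10), new Tip(1, 30), new Tip(2, 20) };
        // Largest input first: an arbitrary reordering chosen for illustration.
        Arrays.sort(maps, Comparator.comparingLong((Tip t) -> t.inputLength).reversed());
        return maps;
    }

    public static void main(String[] args) {
        Tip[] maps = buildReordered();
        // Asking for partition 0 by index returns the tip for partition 1.
        System.out.println("by index:     partition " + findByIndex(maps, 0).partition);
        System.out.println("by partition: partition " + findByPartition(maps, 0).partition);
    }
}
```

Once the array is reordered, the index-based lookup hands back a different tip than the one requested, so a scheduler relying on it could launch the wrong task or a second copy of one that is already running, which would look like speculative execution in the logs.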
  • Owen O'Malley (JIRA) at Jun 29, 2007 at 5:22 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509159 ]

    Owen O'Malley commented on HADOOP-1542:
    ---------------------------------------

    For now, I've reverted HADOOP-1440, which fixes this problem.
  • Owen O'Malley (JIRA) at Jun 29, 2007 at 7:51 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Owen O'Malley resolved HADOOP-1542.
    -----------------------------------

    Resolution: Fixed

    HADOOP-1440 needs to be re-created as a new issue, but this bug is fixed.

Discussion Overview
group: common-dev @
categories: hadoop
posted: Jun 28, '07 at 4:53a
active: Jun 29, '07 at 7:51p
posts: 9
users: 1
website: hadoop.apache.org...
irc: #hadoop
