TestReduceFetch failed.
-----------------------

Key: HADOOP-6029
URL: https://issues.apache.org/jira/browse/HADOOP-6029
Project: Hadoop Core
Issue Type: Bug
Components: mapred
Reporter: Tsz Wo (Nicholas), SZE


{noformat}
Testcase: testReduceFromMem took 23.625 sec
FAILED
Non-zero read from local: 83
junit.framework.AssertionFailedError: Non-zero read from local: 83
at org.apache.hadoop.mapred.TestReduceFetch.testReduceFromMem(TestReduceFetch.java:289)
at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
at junit.extensions.TestSetup.run(TestSetup.java:27)
{noformat}
Ran TestReduceFetch a few times on a clean trunk. It failed consistently.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Devaraj Das (JIRA) at Jun 16, 2009 at 1:23 pm
    [ https://issues.apache.org/jira/browse/HADOOP-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Devaraj Das updated HADOOP-6029:
    --------------------------------

    Attachment: TEST-org.apache.hadoop.mapred.TestReduceFetch.txt
    FAILING-PARTIALMEM-TEST-org.apache.hadoop.mapred.TestReduceFetch.txt

    Jothi and I came across another TestReduceFetch failure.
    {noformat}
    Testcase: testReduceFromDisk took 78.436 sec
    Testcase: testReduceFromPartialMem took 60.701 sec
    FAILED
    Expected at least 1MB fewer bytes read from local (21159650) than written to HDFS (21036680)
    junit.framework.AssertionFailedError: Expected at least 1MB fewer bytes read from local (21159650) than written to HDFS (21036680)
    at org.apache.hadoop.mapred.TestReduceFetch.testReduceFromPartialMem(TestReduceFetch.java:276)
    at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
    at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
    at junit.extensions.TestSetup.run(TestSetup.java:27)

    Testcase: testReduceFromMem took 52.097 sec
    {noformat}
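Read literally, the assertion message above compares two byte counts and requires local reads to be at least 1 MB below HDFS writes. The sketch below reproduces that arithmetic with the numbers from the log. The class and method names are hypothetical, and treating "1MB" as 2^20 bytes is an assumption; the actual check in TestReduceFetch may be written differently.

```java
// Hypothetical helper; not the code from TestReduceFetch.
final class PartialMemCheck {
    // Assumption: "1MB" in the assertion message means 2^20 bytes.
    static final long ONE_MB = 1L << 20;

    /** True when local reads are at least 1 MB below the bytes written to HDFS. */
    static boolean passes(long localRead, long hdfsWritten) {
        return hdfsWritten - localRead >= ONE_MB;
    }

    public static void main(String[] args) {
        long localRead = 21159650L;    // "read from local" value in the log
        long hdfsWritten = 21036680L;  // "written to HDFS" value in the log
        System.out.println("hdfsWritten - localRead = " + (hdfsWritten - localRead));
        System.out.println("passes: " + passes(localRead, hdfsWritten));
    }
}
```

With the logged values the difference is negative: more bytes were read from local disk than were written to HDFS, so the condition cannot hold.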

    The above failure actually looks like a memory issue. ReduceTask.ReduceCopier.ShuffleRamManager reserves memory for the in-memory shuffle based on Runtime.getRuntime().maxMemory(), and the return value of that call seems to be machine-dependent. In the run that failed with the trace above, Runtime.maxMemory() returned a smaller value than in the run where the test passed. When that happens, shuffled files start hitting the disk, and the testcase fails because it doesn't expect that many files to hit the disk. I am attaching two logs: one from a successful run (all tests passed) and one from the failed testReduceFromPartialMem run. In both logs, job_0002 is the job for the testReduceFromPartialMem test.

    Nicholas, could you please upload the logs of the test failure you saw? Thanks!
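One way to check the machine dependence described above is to run a small probe on each machine and compare the heap size requested with -Xmx against what Runtime.getRuntime().maxMemory() reports. The following is a minimal sketch, not part of Hadoop; the class name MaxMemoryProbe and the helper parseXmx are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Locale;

// Hypothetical diagnostic class; not part of Hadoop.
final class MaxMemoryProbe {

    /** Parses an argument such as "-Xmx128m" into bytes; returns -1 if it is not a usable -Xmx flag. */
    static long parseXmx(String arg) {
        if (arg == null || !arg.startsWith("-Xmx") || arg.length() <= 4) {
            return -1L;
        }
        String value = arg.substring(4).toLowerCase(Locale.ROOT);
        char unit = value.charAt(value.length() - 1);
        long multiplier = 1L;
        if (unit == 'k') multiplier = 1L << 10;
        else if (unit == 'm') multiplier = 1L << 20;
        else if (unit == 'g') multiplier = 1L << 30;
        // Strip the unit suffix only when one was recognized.
        String digits = multiplier == 1L ? value : value.substring(0, value.length() - 1);
        try {
            return Long.parseLong(digits) * multiplier;
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    public static void main(String[] args) {
        long requested = -1L;
        // Arguments passed to this JVM, e.g. "-Xmx128m".
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        for (String arg : jvmArgs) {
            long parsed = parseXmx(arg);
            if (parsed > 0) requested = parsed;  // keep the last -Xmx seen
        }
        long reported = Runtime.getRuntime().maxMemory();
        System.out.println("-Xmx requested (bytes): "
            + (requested > 0 ? Long.toString(requested) : "not set"));
        System.out.println("maxMemory() (bytes):    " + reported);
        if (requested > 0) {
            System.out.println("difference (bytes):     " + (requested - reported));
        }
    }
}
```

Running it as `java -Xmx128m MaxMemoryProbe` on a passing and a failing machine would show whether the two report different limits for the same flag.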
  • Chris Douglas (JIRA) at Jun 16, 2009 at 10:34 pm
    [ https://issues.apache.org/jira/browse/HADOOP-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720392#action_12720392 ]

    Chris Douglas commented on HADOOP-6029:
    ---------------------------------------

    bq. In ReduceTask.ReduceCopier.ShuffleRamManager, a memory reservation is done for in-memory shuffle, and that uses Runtime.getRuntime().maxMemory(). The return value of this seems to be machine-dependent.

    This should be rigged by TestReduceFetch in mapred.child.java.opts, and match {{-Xmx128m}}.

    {{testReduceFromPartialMem}} is an awkward test to write. Its intent is to configure the reduce so that, presented with a set of crafted map outputs, it will make a particular guess about how to optimize its I/O. If we can't rig the total memory because -Xmx has a machine-dependent interpretation, then writing such a test will be a real pain with our current set of configuration options. We could add a parameter for the memory reservation that defaults to querying the runtime; that would let us be certain of our memory limit without burdening the user with setting it. I'm really surprised that setting -Xmx doesn't work, though...

    The failure for {{testReduceFromMem}} suggests this should use the counters added in HADOOP-2774 rather than the FileSystem counters to validate the result.
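The idea of a memory-reservation parameter that defaults to querying the runtime could look roughly like the sketch below. It is an illustration under stated assumptions, not the change in any attached patch: java.util.Properties stands in for the job configuration, and the property name and class name are hypothetical.

```java
import java.util.Properties;

// Hypothetical sketch; java.util.Properties stands in for the job configuration.
final class ShuffleMemoryLimit {
    // Hypothetical property name, chosen for this sketch only.
    static final String LIMIT_KEY = "example.shuffle.memory.limit.bytes";

    /** Returns the configured limit if present, otherwise whatever the JVM reports. */
    static long memoryLimit(Properties conf) {
        String configured = conf.getProperty(LIMIT_KEY);
        if (configured != null) {
            // A test can pin the limit here and stop depending on the machine.
            return Long.parseLong(configured.trim());
        }
        // Default: users who never set the property get the runtime's value.
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println("default limit: " + memoryLimit(conf));
        conf.setProperty(LIMIT_KEY, Long.toString(128L << 20));
        System.out.println("pinned limit:  " + memoryLimit(conf));
    }
}
```

A test would set the property to a fixed value, so the reduce's decisions about what to keep in memory no longer depend on how a given JVM interprets -Xmx.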
  • Tsz Wo (Nicholas), SZE (JIRA) at Jun 16, 2009 at 11:18 pm
    [ https://issues.apache.org/jira/browse/HADOOP-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-6029:
    -------------------------------------------

    Attachment: TEST-org.apache.hadoop.mapred.TestReduceFetch.txt
    bq. Nicholas, could you please upload the logs of the test failure you saw. Thanks!

    Here you go: TEST-org.apache.hadoop.mapred.TestReduceFetch.txt
  • Devaraj Das (JIRA) at Jun 17, 2009 at 1:54 am
    [ https://issues.apache.org/jira/browse/HADOOP-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720466#action_12720466 ]

    Devaraj Das commented on HADOOP-6029:
    -------------------------------------

    Nicholas, can you please upload the logs of the test where you see testReduceFromMem failing (as in the jira description)?
    Chris, we did read somewhere that maxMemory() is not necessarily a function of -Xmx, and there are some quirks there. Let me try to get that link.
    The counters added by HADOOP-2774 look like a better candidate for the TestReduceFetch use case.
  • Jothi Padmanabhan (JIRA) at Jun 17, 2009 at 3:42 am
    [ https://issues.apache.org/jira/browse/HADOOP-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720484#action_12720484 ]

    Jothi Padmanabhan commented on HADOOP-6029:
    -------------------------------------------

    Here are a couple of links that explain why the -Xmx value and maxMemory() differ.

    http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4391499
    http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4686462

    A comment from the second link:

    "... freeMemory() and totalMemory() report the amount of memory _inside_ the jvm while maxMemory() reports on the amount of memory _outside_ the jvm, i.e. the amount the whole jvm uses as seen from the OS."
  • Jothi Padmanabhan (JIRA) at Jun 17, 2009 at 3:48 am
    [ https://issues.apache.org/jira/browse/HADOOP-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720486#action_12720486 ]

    Jothi Padmanabhan commented on HADOOP-6029:
    -------------------------------------------

    bq. Nicholas, can you please upload the logs of the test where you see testReduceFromMem failing (as on the jira description).

    Just to clarify: could you upload the logs where testReduceFromMem failed? We are able to get testReduceFromPartialMem to fail consistently on some machines, but testReduceFromMem is passing.
  • Tsz Wo (Nicholas), SZE (JIRA) at Jun 17, 2009 at 10:16 pm
    [ https://issues.apache.org/jira/browse/HADOOP-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-6029:
    -------------------------------------------

    Attachment: testReduceFromMem.txt
    bq. Just to clarify - could you upload the logs where testReduceFromMem failed. We are able to get testReduceFromPartialMem to fail consistently on some machines, but testReduceFromMem is passing.

    Oops, I missed this. Here is the log: testReduceFromMem.txt
  • Chris Douglas (JIRA) at Jun 18, 2009 at 2:38 am
    [ https://issues.apache.org/jira/browse/HADOOP-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Chris Douglas updated HADOOP-6029:
    ----------------------------------

    Attachment: 6029-0.patch
  • Devaraj Das (JIRA) at Jun 18, 2009 at 5:26 pm
    [ https://issues.apache.org/jira/browse/HADOOP-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721352#action_12721352 ]

    Devaraj Das commented on HADOOP-6029:
    -------------------------------------

    Although the fix looks right from the testcase point of view, IMO we should still investigate why and where these 83 bytes are read from local disk (as logged in Nicholas's testReduceFromMem failure).

Discussion Overview
group: common-dev
categories: hadoop
posted: Jun 12, '09 at 7:33p
active: Jun 18, '09 at 5:26p
posts: 10
users: 1
website: hadoop.apache.org...
irc: #hadoop
