enabling BLOCK compression for map outputs breaks the reduce progress counters
------------------------------------------------------------------------------

Key: HADOOP-3131
URL: https://issues.apache.org/jira/browse/HADOOP-3131
Project: Hadoop Core
Issue Type: Bug
Affects Versions: 0.16.1
Reporter: Colin Evans


Enabling map output compression and setting the compression type to BLOCK causes the progress counters during the reduce to go crazy and report progress counts over 100%.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Colin Evans (JIRA) at Mar 29, 2008 at 1:26 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Colin Evans updated HADOOP-3131:
    --------------------------------

    Attachment: Picture 1.png

    screenshot of broken progress counters
  • Matei Zaharia (JIRA) at Jun 10, 2008 at 11:34 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Affects Version/s: 0.17.0, 0.17.1, 0.18.0 (was: 0.16.1)
    Status: Patch Available (was: Open)

    The problem was that SequenceFile.Sorter.MergeQueue calculates progress as (total size of keys and values read) / (total size of files to be merged on disk). When a file is compressed, its size on disk is much smaller than the combined size of its keys and values, so the ratio can exceed 100%. In fact, there is also a problem when compression is turned off: the code reports progress below 100% because it does not count bytes in the file that are not part of keys and values, such as the header and length fields. This patch changes MergeQueue to use the position in the input stream to calculate the number of bytes read from disk, and divides that by the total amount of data to be merged.
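The arithmetic described in the comment above can be sketched as follows. This is only an illustration, not the MergeQueue code: the class name, method names, and byte counts are all made up for the example, and a 4:1 compression ratio is assumed.

```java
// Hypothetical illustration of the two progress formulas; not Hadoop code.
final class MergeProgressSketch {

    /** Old formula: decompressed key/value bytes read, divided by on-disk size. */
    static float progressByRecordBytes(long keyValueBytesRead, long totalBytesOnDisk) {
        return (float) keyValueBytesRead / totalBytesOnDisk;
    }

    /** Patched formula: bytes consumed from disk (stream position), divided by on-disk size. */
    static float progressByStreamPosition(long bytesReadFromDisk, long totalBytesOnDisk) {
        return (float) bytesReadFromDisk / totalBytesOnDisk;
    }

    public static void main(String[] args) {
        // Made-up numbers: a 1000-byte compressed file, read halfway through.
        long totalBytesOnDisk = 1_000;
        long bytesReadFromDisk = 500;
        // Assumed 4:1 compression: 500 on-disk bytes expand to 2000 key/value bytes.
        long keyValueBytesRead = 2_000;

        // Old formula overshoots: prints 2.0 (that is, 200%).
        System.out.println(progressByRecordBytes(keyValueBytesRead, totalBytesOnDisk));
        // Position-based formula stays within [0, 1]: prints 0.5.
        System.out.println(progressByStreamPosition(bytesReadFromDisk, totalBytesOnDisk));

        // Uncompressed case: if 100 of the 1000 bytes are headers and length
        // fields, the old formula tops out below 100%: prints 0.9.
        System.out.println(progressByRecordBytes(900, totalBytesOnDisk));
    }
}
```

Because the numerator and denominator of the position-based formula are both measured in on-disk bytes, the ratio cannot exceed 1 regardless of the compression ratio.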
  • Matei Zaharia (JIRA) at Jun 10, 2008 at 11:36 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Attachment: merge-progress.patch

    Patch that changes progress computation to use position in input stream.
  • Arun C Murthy (JIRA) at Jun 11, 2008 at 12:04 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Arun C Murthy updated HADOOP-3131:
    ----------------------------------

    Status: Open (was: Patch Available)

    Matei, unfortunately this patch won't work for hadoop-0.18.* since HADOOP-2095 changed Map-Reduce to use a new file format called IFile for intermediate sort/merge...
  • Joydeep Sen Sarma (JIRA) at Jun 11, 2008 at 12:46 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604098#action_12604098 ]

    Joydeep Sen Sarma commented on HADOOP-3131:
    -------------------------------------------

    Interesting. Can a patch be submitted for 0.17.* as well? (It seems generally useful, and 0.17 has a few minor releases to come.)
  • Matei Zaharia (JIRA) at Jun 11, 2008 at 6:11 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604294#action_12604294 ]

    Matei Zaharia commented on HADOOP-3131:
    ---------------------------------------

    Looking at the patch submitted for HADOOP-2095, it seems to have the same problem: it computes totalBytesProcessed += (key.getLength() - key.getPosition()) + (value.getLength() - value.getPosition()). I can submit a separate patch against 0.18 to fix that, but it would also be good to get this into 0.17, because 0.18 is not going to be released for a while.

  • Matei Zaharia (JIRA) at Jun 24, 2008 at 10:46 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607791#action_12607791 ]

    Matei Zaharia commented on HADOOP-3131:
    ---------------------------------------

    Is anyone still interested in me writing a patch for 0.18, or have you fixed this issue in IFile already?
  • Matei Zaharia (JIRA) at Jul 9, 2008 at 5:21 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Attachment: merge-progress-trunk.patch

    Here is a patch that fixes this issue in trunk for SequenceFile. A similar patch is still needed for IFile.
  • Matei Zaharia (JIRA) at Jul 9, 2008 at 6:01 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Attachment: merge-progress-trunk.patch

    Here's a patch that fixes IFile too.
  • Matei Zaharia (JIRA) at Jul 9, 2008 at 6:01 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Attachment: (was: merge-progress-trunk.patch)
  • Matei Zaharia (JIRA) at Jul 9, 2008 at 6:03 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Affects Version/s: 0.19.0, 0.17.2
    Status: Patch Available (was: Open)
  • Matei Zaharia (JIRA) at Jul 9, 2008 at 6:27 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Comment: was deleted
  • Arun C Murthy (JIRA) at Jul 9, 2008 at 9:11 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Arun C Murthy reassigned HADOOP-3131:
    -------------------------------------

    Assignee: Matei Zaharia
  • Hadoop QA (JIRA) at Jul 12, 2008 at 9:07 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613061#action_12613061 ]

    Hadoop QA commented on HADOOP-3131:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12385650/merge-progress-trunk.patch
    against trunk revision 676069.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 6 new or modified tests.

    -1 patch. The patch command could not apply the patch.

    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2835/console

    This message is automatically generated.
  • Matei Zaharia (JIRA) at Jul 14, 2008 at 7:01 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Attachment: HADOOP-3131-v2.patch

    New patch against trunk to fix the previous merge conflict.
  • Matei Zaharia (JIRA) at Jul 14, 2008 at 7:05 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Description:
    Enabling map output compression and setting the compression type to BLOCK causes the progress counters during the reduce to go crazy and report progress counts over 100%.

    This is problematic for speculative execution because it thinks the tasks are doing fine.

    was:Enabling map output compression and setting the compression type to BLOCK causes the progress counters during the reduce to go crazy and report progress counts over 100%.

  • Matei Zaharia (JIRA) at Jul 14, 2008 at 8:57 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Fix Version/s: 0.19.0
  • Matei Zaharia (JIRA) at Jul 18, 2008 at 5:29 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Matei Zaharia (JIRA) at Jul 18, 2008 at 5:29 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Status: Open (was: Patch Available)
  • Hadoop QA (JIRA) at Jul 19, 2008 at 12:33 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614923#action_12614923 ]

    Hadoop QA commented on HADOOP-3131:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12386001/HADOOP-3131-v2.patch
    against trunk revision 677872.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 6 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs. The patch appears to introduce 1 new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2906/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2906/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2906/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2906/console

    This message is automatically generated.
  • Matei Zaharia (JIRA) at Jul 19, 2008 at 12:40 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Attachment: HADOOP-3131-v3.patch

    New patch fixing findbugs problem.
  • Matei Zaharia (JIRA) at Jul 19, 2008 at 12:43 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Status: Open (was: Patch Available)
  • Matei Zaharia (JIRA) at Jul 19, 2008 at 12:43 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Hadoop QA (JIRA) at Jul 19, 2008 at 9:34 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614970#action_12614970 ]

    Hadoop QA commented on HADOOP-3131:
    -----------------------------------

    +1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12386447/HADOOP-3131-v3.patch
    against trunk revision 678080.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 6 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2912/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2912/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2912/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2912/console

    This message is automatically generated.
  • Arun C Murthy (JIRA) at Jul 21, 2008 at 10:13 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Arun C Murthy updated HADOOP-3131:
    ----------------------------------

    Status: Open (was: Patch Available)

    Matei, am I missing something here, or should IFile.Reader.getPosition return in.getPosition and not 'bytesRead', which is the count of decompressed bytes?
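To illustrate the mismatch raised in this comment, here is a minimal sketch. It is not the IFile code: it assumes progress is computed as a position divided by the on-disk (compressed) length of the segment, and the class name, method name, and byte counts are all hypothetical.

```java
// Hypothetical sketch; not Hadoop's IFile.Reader.
final class ProgressSketch {
    /** Fraction of a segment consumed, given a position and the on-disk length. */
    static float progress(long position, long compressedLength) {
        if (compressedLength == 0) {
            return 1.0f; // empty segment: nothing left to read
        }
        return (float) position / compressedLength;
    }

    public static void main(String[] args) {
        long compressedLength = 1000;  // bytes on disk (assumed)
        long compressedPos = 500;      // position in the compressed stream (assumed)
        long decompressedBytes = 2000; // bytes handed to the caller so far (assumed 4:1 ratio)

        // Same unit in numerator and denominator: stays within [0, 1].
        System.out.println(progress(compressedPos, compressedLength));
        // Decompressed count over compressed length: can exceed 1.0.
        System.out.println(progress(decompressedBytes, compressedLength));
    }
}
```

With these made-up numbers, the second ratio is above 1.0, which is the kind of "over 100%" reading described in the issue.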
  • Matei Zaharia (JIRA) at Jul 21, 2008 at 10:51 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Matei Zaharia (JIRA) at Jul 21, 2008 at 10:52 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Attachment: HADOOP-3131-v4.patch

    Thanks for pointing that out. Here's a new patch that fixes it and includes a test with compression for IFile.
  • Hadoop QA (JIRA) at Jul 22, 2008 at 1:20 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615488#action_12615488 ]

    Hadoop QA commented on HADOOP-3131:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12386583/HADOOP-3131-v4.patch
    against trunk revision 678593.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 8 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs. The patch appears to cause Findbugs to fail.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed core unit tests.

    -1 contrib tests. The patch failed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2919/testReport/
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2919/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2919/console

    This message is automatically generated.
  • Matei Zaharia (JIRA) at Jul 22, 2008 at 5:21 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Attachment: HADOOP-3131-v5.patch

    Fixes compile error.
  • Matei Zaharia (JIRA) at Jul 22, 2008 at 5:22 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Status: Open (was: Patch Available)
  • Matei Zaharia (JIRA) at Jul 22, 2008 at 5:22 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Arun C Murthy (JIRA) at Jul 22, 2008 at 6:28 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615716#action_12615716 ]

    Arun C Murthy commented on HADOOP-3131:
    ---------------------------------------

    Matei, a minor nit: looking through your patch, I realised that progress reporting based on rawIn.getPosition might actually be off due to the buffering done by IFile.Reader (see IFile.Reader.readData). Should we fix that too? (Of course, it's only a temporary glitch, i.e. until the buffered data is consumed, and won't be too bad...)
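The buffering effect mentioned here can be sketched as follows. This is a simplified, hypothetical reader, not IFile.Reader: it assumes an uncompressed stream and a fixed-size read-ahead buffer, and every name in it is invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical buffered reader; not Hadoop's IFile.Reader.
final class BufferedPositionSketch {
    private final InputStream rawIn;
    private final byte[] buffer;
    private int bufferLimit = 0;  // number of valid bytes in the buffer
    private int bufferPos = 0;    // index of the next byte to hand out
    private long rawPosition = 0; // bytes pulled from rawIn so far

    BufferedPositionSketch(InputStream rawIn, int bufferSize) {
        this.rawIn = rawIn;
        this.buffer = new byte[bufferSize];
    }

    /** Returns the next byte as 0..255, or -1 at end of stream. */
    int readByte() {
        if (bufferPos == bufferLimit) {
            try {
                int n = rawIn.read(buffer, 0, buffer.length); // refill in one chunk
                if (n <= 0) {
                    return -1;
                }
                rawPosition += n;
                bufferLimit = n;
                bufferPos = 0;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return buffer[bufferPos++] & 0xff;
    }

    /** Position of the underlying stream, which runs ahead of the caller. */
    long rawPosition() {
        return rawPosition;
    }

    /** Bytes actually handed to the caller. */
    long consumed() {
        return rawPosition - (bufferLimit - bufferPos);
    }

    public static void main(String[] args) {
        BufferedPositionSketch r =
            new BufferedPositionSketch(new ByteArrayInputStream(new byte[100]), 64);
        r.readByte();
        System.out.println(r.rawPosition() + " raw vs " + r.consumed() + " consumed");
    }
}
```

The gap between the two counters is at most one buffer's worth of bytes and closes as the buffer drains. With compression in the path, mapping a buffer offset back to a compressed-file offset is not this simple.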
  • Matei Zaharia (JIRA) at Jul 22, 2008 at 6:34 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615720#action_12615720 ]

    Matei Zaharia commented on HADOOP-3131:
    ---------------------------------------

    I'm not sure it's worth fixing that, because we don't need perfect progress reporting, just a rough guide to tell whether a task is doing something, and what rate it's working at. With compression enabled, it would also be difficult to figure out which spot in the buffer corresponds to which byte in the compressed file, if we were to use the position in the buffer to figure out progress.
  • Hadoop QA (JIRA) at Jul 22, 2008 at 7:25 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615748#action_12615748 ]

    Hadoop QA commented on HADOOP-3131:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12386638/HADOOP-3131-v5.patch
    against trunk revision 678692.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 8 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2921/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2921/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2921/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2921/console

    This message is automatically generated.
  • dhruba borthakur (JIRA) at Jul 23, 2008 at 1:58 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    dhruba borthakur updated HADOOP-3131:
    -------------------------------------

    Status: Patch Available (was: Open)

    Resubmitting.
  • dhruba borthakur (JIRA) at Jul 23, 2008 at 1:58 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    dhruba borthakur updated HADOOP-3131:
    -------------------------------------

    Status: Open (was: Patch Available)

    I do not think that the test failure has anything to do with this patch, but will resubmit the patch to make sure that this is true.
  • Arun C Murthy (JIRA) at Jul 23, 2008 at 3:33 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616122#action_12616122 ]

    Arun C Murthy commented on HADOOP-3131:
    ---------------------------------------

    I don't think the TestCLI failure has anything to do with this: HADOOP-3809 should address it.
  • Hadoop QA (JIRA) at Jul 24, 2008 at 8:18 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616364#action_12616364 ]

    Hadoop QA commented on HADOOP-3131:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12386638/HADOOP-3131-v5.patch
    against trunk revision 679202.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 8 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2933/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2933/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2933/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2933/console

    This message is automatically generated.
  • Arun C Murthy (JIRA) at Jul 24, 2008 at 8:46 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Arun C Murthy updated HADOOP-3131:
    ----------------------------------

    Status: Open (was: Patch Available)

    Matei, sorry I missed this piece the first time around:

    {noformat}
    +    for (Segment<K, V> s: segmentsToMerge) {
    +      totalBytesProcessed += s.getPosition(); // Count initial bytes read
    +    }
    +    if (totalBytes != 0) {
    +      mergeProgress.set(totalBytesProcessed * progPerByte);
    +    } else {
    +      mergeProgress.set(1.0f);
    +    }
    {noformat}

    At best it reports progress slightly early (i.e. before the final merge begins), and at worst it provides a completely wrong progress value during the merging of intermediate map outputs, since the output for all reduces is in a single file. Hence {{s.getPosition}} is hopelessly off as a measure of merge progress... I vote we just do away with that block.
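A rough sketch of why a position inside a shared file misleads merge progress. It assumes, as one reading of this comment, that a segment occupies a byte range inside a larger file and that the reported position is measured from the start of that file. The names and numbers are hypothetical and are not taken from the patch.

```java
// Hypothetical sketch; not Hadoop's Merger or Segment.
final class SegmentProgressSketch {
    /** Bytes read within the segment, given an absolute file position. */
    static long bytesIntoSegment(long absolutePosition, long segmentOffset) {
        return absolutePosition - segmentOffset;
    }

    /** Progress as a fraction of the segment's own length. */
    static float progress(long bytesProcessed, long segmentLength) {
        return segmentLength == 0 ? 1.0f : (float) bytesProcessed / segmentLength;
    }

    public static void main(String[] args) {
        long segmentOffset = 9000;    // where this reduce's data starts in the file (assumed)
        long segmentLength = 1000;    // size of this reduce's data (assumed)
        long absolutePosition = 9100; // current position in the shared file (assumed)

        // Absolute position against segment length: far above 1.0.
        System.out.println(progress(absolutePosition, segmentLength));
        // Position relative to the segment start: a plausible fraction.
        System.out.println(
            progress(bytesIntoSegment(absolutePosition, segmentOffset), segmentLength));
    }
}
```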
  • Matei Zaharia (JIRA) at Jul 24, 2008 at 6:31 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Attachment: HADOOP-3131-v6.patch

    New patch fixing issues that Arun pointed out.
  • Matei Zaharia (JIRA) at Jul 24, 2008 at 6:32 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-3131:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Hadoop QA (JIRA) at Jul 25, 2008 at 6:08 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616771#action_12616771 ]

    Hadoop QA commented on HADOOP-3131:
    -----------------------------------

    +1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12386816/HADOOP-3131-v6.patch
    against trunk revision 679601.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 8 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2946/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2946/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2946/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2946/console

    This message is automatically generated.
  • dhruba borthakur (JIRA) at Jul 30, 2008 at 11:37 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618557#action_12618557 ]

    dhruba borthakur commented on HADOOP-3131:
    ------------------------------------------

    Hi Arun, would you like to code-review this patch one final time? Thanks.
  • Arun C Murthy (JIRA) at Jul 31, 2008 at 12:07 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Arun C Murthy updated HADOOP-3131:
    ----------------------------------

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

    I just committed this. Thanks, Matei! (It's been a long-drawn affair, appreciate your patience!)
  • Matei Zaharia (JIRA) at Jul 31, 2008 at 8:35 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618845#action_12618845 ]

    Matei Zaharia commented on HADOOP-3131:
    ---------------------------------------

    Awesome, thanks!
  • Hudson (JIRA) at Aug 22, 2008 at 12:41 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624789#action_12624789 ]

    Hudson commented on HADOOP-3131:
    --------------------------------

    Integrated in Hadoop-trunk #581 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/581/])
  • Robert Chansler (JIRA) at Oct 22, 2008 at 12:49 am
    [ https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Robert Chansler updated HADOOP-3131:
    ------------------------------------

    Component/s: mapred

Discussion Overview
group: common-dev @
categories: hadoop
posted: Mar 29, '08 at 1:24a
active: Oct 22, '08 at 12:49a
posts: 48
users: 1
website: hadoop.apache.org...
irc: #hadoop
