Reuse output collectors across maps running on the same jvm
-----------------------------------------------------------

Key: HADOOP-5830
URL: https://issues.apache.org/jira/browse/HADOOP-5830
Project: Hadoop Core
Issue Type: Improvement
Components: mapred
Reporter: Arun C Murthy


We have evidence that cutting the shuffle-crossbar between maps and reduces (m * r) leads to performant applications since:
# It cuts down the number of connections necessary to shuffle and hence reduces load on the serving-side (TaskTracker) and improves latency (terasort, HADOOP-1338, HADOOP-5223)
# Reduces seeks required for the TaskTracker to serve the map-outputs
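To make the m * r crossbar concrete, here is a back-of-the-envelope sketch. The reduce count of 4,000 is hypothetical (not from the issue); the map counts are the figures quoted below for petasort. Every reduce fetches one segment from every map, so the fetch count is simply the product:

```java
// Illustrative arithmetic for the m * r shuffle crossbar: each reduce
// fetches one output segment from every map, so total fetches = m * r.
public class ShuffleCrossbar {
    static long shuffleFetches(long maps, long reduces) {
        return maps * reduces;
    }

    public static void main(String[] args) {
        long reduces = 4_000; // hypothetical reduce count for illustration
        System.out.println("800,000 maps: "
                + shuffleFetches(800_000, reduces) + " fetches");
        System.out.println(" 80,000 maps: "
                + shuffleFetches(80_000, reduces) + " fetches");
        // A 10x cut in map count is a 10x cut in shuffle connections, and
        // in the seeks each TaskTracker performs serving its map-outputs.
    }
}
```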

So far we've had to manually tune applications to cut down the shuffle-crossbar by having fatter maps with custom input formats etc. For example, we saw a significant improvement running the petasort when we went from ~800,000 maps to ~80,000 maps (1.5G to 15G per map), i.e. from 48+ hours to 16 hours.

The downsides are:
# The burden falls on the application-writer to tune this with custom input-formats etc.
# The naive method of using a higher min.split.size leads to considerable non-local i/o on the maps.
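The locality problem behind the second downside can be sketched with simple arithmetic. This assumes a 128 MB HDFS block for illustration (defaults have varied): HDFS places each block's replicas independently, so a split spanning many blocks can be local to at most a few of them, and the map reads the rest over the network.

```java
// Why a big min.split.size hurts locality: a 1.5 GB split spans ~12
// independently-placed HDFS blocks, but the map runs local to (at best)
// one of them. Sizes are illustrative, not measurements.
public class SplitLocality {
    static long blocksPerSplit(long splitMB, long blockMB) {
        return (splitMB + blockMB - 1) / blockMB; // ceiling division
    }

    public static void main(String[] args) {
        long blockMB = 128;  // assumed HDFS block size
        long splitMB = 1536; // ~1.5 GB split from a raised min.split.size
        long blocks = blocksPerSplit(splitMB, blockMB);
        long remote = blocks - 1; // all but the co-located block read remotely
        System.out.println(blocks + " blocks per split, ~" + remote
                + " read non-locally (~" + (100 * remote / blocks) + "%)");
    }
}
```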

Given these, the proposal is to keep the 'output collector' open across jvm reuse for maps, thereby enabling 'combiners' across map-tasks. This would have the happy effect of fixing both of the above. The downsides are that it adds latency to jobs (since map-outputs cannot be shuffled until a few maps on the same jvm are done, followed by a final sort/merge/combine) and the failure cases get a bit more complicated.
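
A minimal sketch of the idea in plain Java (not the actual Hadoop MapOutputBuffer/OutputCollector API): several map tasks in a reused JVM collect into one shared buffer, and the combiner runs once over the pooled output before anything is exposed to the shuffle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch of a collector kept open across map tasks in one
// reused JVM, with a sum combiner applied once at close. Hypothetical
// structure, not Hadoop's actual collector implementation.
public class SharedCollector {
    private final List<Map.Entry<String, Integer>> buffer = new ArrayList<>();

    // Each map task in the reused JVM collects into the same buffer
    // instead of sorting/spilling/serving its output individually.
    void collect(String key, int value) {
        buffer.add(Map.entry(key, value));
    }

    // Called once after the last map in the JVM: sort, combine, and only
    // then expose the (much smaller) output to the shuffle.
    Map<String, Integer> closeAndCombine() {
        Map<String, Integer> combined = new TreeMap<>(); // sorted by key
        for (Map.Entry<String, Integer> e : buffer) {
            combined.merge(e.getKey(), e.getValue(), Integer::sum); // combiner
        }
        return combined;
    }

    public static void main(String[] args) {
        SharedCollector collector = new SharedCollector();
        // Two "map tasks" running back-to-back in the same JVM:
        collector.collect("the", 1);
        collector.collect("quick", 1);  // map task 1
        collector.collect("the", 1);
        collector.collect("fox", 1);    // map task 2
        // One final sort/merge/combine instead of one per map:
        System.out.println(collector.closeAndCombine());
        // → {fox=1, quick=1, the=2}
    }
}
```

The latency trade-off described above falls out directly: nothing in the buffer can be shuffled until `closeAndCombine()` runs after the last map.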

Thoughts? Let's discuss...

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Hong Tang (JIRA) at May 14, 2009 at 7:57 am
    [ https://issues.apache.org/jira/browse/HADOOP-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709302#action_12709302 ]

    Hong Tang commented on HADOOP-5830:
    -----------------------------------

    To minimize the impact on job latency, we may disable this while less than x% of the maps have finished.
  • Runping Qi (JIRA) at May 14, 2009 at 1:54 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709396#action_12709396 ]

    Runping Qi commented on HADOOP-5830:
    ------------------------------------


    I am glad this topic re-appeared :)
    Why not pursue the idea of having large logical map tasks as discussed in https://issues.apache.org/jira/browse/HADOOP-2560?

  • Hong Tang (JIRA) at May 14, 2009 at 4:17 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709456#action_12709456 ]

    Hong Tang commented on HADOOP-5830:
    -----------------------------------

    This is probably complementary to HADOOP-2560 (btw, is the new CombineFileInputFormat capable of picking blocks from different files and lumping them into one input split?). I consider this approach better because it could achieve better load balancing and lower the overhead of map failure or speculative execution.

Discussion Overview
group: common-dev @
categories: hadoop
posted: May 14, '09 at 7:19a
active: May 14, '09 at 4:17p
posts: 4
users: 1
website: hadoop.apache.org...
irc: #hadoop

