Improve block placement performance
-----------------------------------

Key: HADOOP-5603
URL: https://issues.apache.org/jira/browse/HADOOP-5603
Project: Hadoop Core
Issue Type: Improvement
Components: dfs
Reporter: Hairong Kuang
Assignee: Hairong Kuang
Fix For: 0.21.0


ReplicationTargetChooser chooses targets by repeatedly selecting random nodes and then filtering for good targets until the required number of targets is chosen. Because the random selection may be repeated many times, the given portion of the cluster map can be traversed multiple times. The code can be improved so that it traverses that portion of the cluster map only once.
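
The single-traversal idea described above can be sketched roughly as follows. This is a minimal illustration only, not the code in blockPlace.patch: the class SinglePassChooser, the method chooseTargets, and the node names are hypothetical, and a plain list stands in for the portion of the cluster map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names come from Hadoop.
class SinglePassChooser {

    /**
     * Returns up to `needed` candidates accepted by `isGoodTarget`,
     * examining each candidate at most once.
     */
    static <T> List<T> chooseTargets(List<T> candidates, int needed,
                                     Predicate<T> isGoodTarget, Random rng) {
        // Shuffle a copy once so the single pass still yields a random choice.
        List<T> shuffled = new ArrayList<>(candidates);
        Collections.shuffle(shuffled, rng);

        List<T> chosen = new ArrayList<>();
        for (T node : shuffled) {
            if (chosen.size() == needed) {
                break; // enough targets found, stop early
            }
            if (isGoodTarget.test(node)) {
                chosen.add(node);
            }
        }
        // May hold fewer than `needed` entries if too few nodes qualify.
        return chosen;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("dn1", "dn2", "dn3", "dn4", "dn5");
        // Assumption for the example: dn2 and dn4 have no free space.
        Predicate<String> hasSpace = n -> !n.equals("dn2") && !n.equals("dn4");
        System.out.println(chooseTargets(nodes, 2, hasSpace, new Random()));
    }
}
```

Shuffling once and filtering in one loop bounds the work by the number of candidates, whereas drawing a fresh random node after every rejection has no such bound per traversal.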

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Hairong Kuang (JIRA) at Mar 31, 2009 at 9:27 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-5603:
    ----------------------------------

    Attachment: blockPlace.patch

    Here is the patch that makes the suggested change.

    With the patch, both ReplicationTargetChooser#chooseRandom(int, String, List<Node>, long, int, List<DatanodeDescriptor>) and ReplicationTargetChooser#chooseRandom(String, List<Node>, long, int, List<DatanodeDescriptor>) traverse every node in the given portion of the cluster map at most once, even in the worst case.
  • Hairong Kuang (JIRA) at Mar 31, 2009 at 10:12 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12694303#action_12694303 ]

    Hairong Kuang commented on HADOOP-5603:
    ---------------------------------------

    I ran an experiment on a dfs cluster with 3150 nodes. The cluster was full, with no space to place any block. Trunk takes around 6.5s to declare failure when trying to place a block on 2 nodes. With the patch, it takes around 2.1s to declare failure.
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 3, 2009 at 6:52 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695524#action_12695524 ]

    Tsz Wo (Nicholas), SZE commented on HADOOP-5603:
    ------------------------------------------------

    +1 patch looks good.
  • Hairong Kuang (JIRA) at Apr 3, 2009 at 10:28 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-5603:
    ----------------------------------

    Hadoop Flags: [Reviewed]
    Status: Patch Available (was: Open)
  • Hadoop QA (JIRA) at Apr 3, 2009 at 11:00 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695634#action_12695634 ]

    Hadoop QA commented on HADOOP-5603:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12404270/blockPlace.patch
    against trunk revision 761632.

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
    Please justify why no tests are needed for this patch.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/113/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/113/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/113/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/113/console

    This message is automatically generated.
  • Hairong Kuang (JIRA) at Apr 6, 2009 at 8:51 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-5603:
    ----------------------------------

    Status: Open (was: Patch Available)
  • Hairong Kuang (JIRA) at Apr 6, 2009 at 8:53 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-5603:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Hairong Kuang (JIRA) at Apr 6, 2009 at 8:53 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-5603:
    ----------------------------------

    Attachment: blockPlace1.patch

    This patch fixes a bug that caused some unit tests to fail.
  • Hairong Kuang (JIRA) at Apr 6, 2009 at 9:36 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-5603:
    ----------------------------------

    Attachment: blockPlace1.patch
  • Hairong Kuang (JIRA) at Apr 6, 2009 at 9:38 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-5603:
    ----------------------------------

    Attachment: (was: blockPlace1.patch)
  • Hadoop QA (JIRA) at Apr 7, 2009 at 2:40 am
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696355#action_12696355 ]

    Hadoop QA commented on HADOOP-5603:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12404772/blockPlace1.patch
    against trunk revision 762509.

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
    Please justify why no tests are needed for this patch.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    -1 Eclipse classpath. The patch causes the Eclipse classpath to differ from the contents of the lib directories.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/156/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/156/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/156/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/156/console

    This message is automatically generated.
  • Hairong Kuang (JIRA) at Apr 7, 2009 at 5:42 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-5603:
    ----------------------------------

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

    The -1 for Eclipse classpath is caused by HADOOP-5518, not by this patch. The change is covered by existing test cases, so no new tests are needed.

    I've committed this!
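
To see why a full cluster is the expensive case for retry-based selection, the toy cost model below counts node visits rather than time. It is a sketch under stated assumptions, not a reproduction of the measurement reported in the thread: PlacementCost and both methods are hypothetical, every node is assumed to be a bad target, and each random draw is assumed to rescan the whole scope in order to skip already-excluded nodes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical cost model; none of these names come from Hadoop.
class PlacementCost {

    /** Retry-based model: every draw rescans the scope to skip excluded nodes. */
    static long retryVisits(int clusterSize, Random rng) {
        Set<Integer> excluded = new HashSet<>();
        long visits = 0;
        while (excluded.size() < clusterSize) {
            List<Integer> available = new ArrayList<>();
            for (int node = 0; node < clusterSize; node++) {
                visits++; // one visit per node per draw
                if (!excluded.contains(node)) {
                    available.add(node);
                }
            }
            int pick = available.get(rng.nextInt(available.size()));
            // Assumption: the cluster is full, so the pick is never a good
            // target and is excluded before the next draw.
            excluded.add(pick);
        }
        return visits;
    }

    /** Single-pass model: shuffle once, then test each node exactly once. */
    static long singlePassVisits(int clusterSize, Random rng) {
        List<Integer> order = new ArrayList<>();
        for (int node = 0; node < clusterSize; node++) {
            order.add(node);
        }
        Collections.shuffle(order, rng);
        long visits = 0;
        for (int node : order) {
            visits++; // node is full, so it is rejected and the scan moves on
        }
        return visits;
    }

    public static void main(String[] args) {
        int n = 100;
        System.out.println("retry-based visits: " + retryVisits(n, new Random()));
        System.out.println("single-pass visits: " + singlePassVisits(n, new Random()));
    }
}
```

In this model, declaring failure costs n * n visits with retries and n visits with a single pass. The real speedup depends on how the cluster map is actually traversed, so the ratio here should not be read as a prediction of the timings reported above.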

Discussion Overview
group: common-dev @
categories: hadoop
posted: Mar 31, '09 at 9:19p
active: Apr 7, '09 at 5:42p
posts: 13
users: 1
website: hadoop.apache.org...
irc: #hadoop