Cluster falls into infinite loop trying to replicate a block to a target that already has this replica.
-------------------------------------------------------------------------------------------------------

Key: HADOOP-3050
URL: https://issues.apache.org/jira/browse/HADOOP-3050
Project: Hadoop Core
Issue Type: Bug
Affects Versions: 0.17.0
Reporter: Konstantin Shvachko
Priority: Blocker


This happened during a test run by Hudson, so fortunately all the logs are available.
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1987/console
Search for TestDecommission and look for block blk_167544198419718831, which is being replicated to node 127.0.0.1:65168 over and over again.
The issue needs to be investigated. I am making it a blocker until it is.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Nigel Daley (JIRA) at Mar 20, 2008 at 5:18 am
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580663#action_12580663 ]

    Nigel Daley commented on HADOOP-3050:
    -------------------------------------

    Konstantin, builds don't stay around forever on Hudson. I suggest copying the relevant pieces into a text file and attaching it to this issue.
  • Konstantin Shvachko (JIRA) at Mar 20, 2008 at 6:11 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Konstantin Shvachko updated HADOOP-3050:
    ----------------------------------------

    Attachment: FailedTestDecommission.log

    Here is the log of the failed test.
  • Hairong Kuang (JIRA) at Mar 27, 2008 at 10:40 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang reassigned HADOOP-3050:
    -------------------------------------

    Assignee: Hairong Kuang
  • Robert Chansler (JIRA) at Mar 28, 2008 at 6:59 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Robert Chansler updated HADOOP-3050:
    ------------------------------------

    Component/s: dfs
  • Hairong Kuang (JIRA) at Mar 28, 2008 at 9:42 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583204#action_12583204 ]

    Hairong Kuang commented on HADOOP-3050:
    ---------------------------------------

    After examining the log, it looks like we hit the following scenario:
    1. blk_167544198419718831 was replicated to datanode 1, datanode 2, and datanode 3.
    2. Datanode 1 lost contact with the namenode, and datanode 2 was scheduled to be decommissioned.
    3. Datanode 1 re-registered with the namenode, but its block report came in before its network location was resolved, so the block report was dropped.
    4. Because the namenode did not know that datanode 1 had blk_167544198419718831, it scheduled replication of the block to datanode 1 and datanode 4.
    5. The replication to datanode 1 failed because datanode 1 already had the block.
    6. No additional block report was received before the end of the log, so the block replication kept failing.
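
    The scenario above can be reproduced in a toy model. The following is only an illustrative sketch, not code from Hadoop or from the patch: the class ReplicationLoopSketch, its methods, and the node and block names are all hypothetical. It assumes the namenode keeps scheduling a replication for as long as its own view says the target lacks the block.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the scenario; every name here is hypothetical, not a Hadoop class. */
class ReplicationLoopSketch {
    // What the namenode believes: block id -> datanodes holding a replica.
    final Map<String, Set<String>> namenodeView = new HashMap<>();
    // What is actually on disk: datanode -> block ids.
    final Map<String, Set<String>> actualReplicas = new HashMap<>();

    void storeOnDisk(String node, String block) {
        actualReplicas.computeIfAbsent(node, k -> new HashSet<>()).add(block);
    }

    /** A block report updates the namenode's view only if the node's location is resolved. */
    boolean blockReport(String node, boolean locationResolved) {
        if (!locationResolved) {
            return false; // report dropped, as in step 3
        }
        for (String block : actualReplicas.getOrDefault(node, Collections.emptySet())) {
            namenodeView.computeIfAbsent(block, k -> new HashSet<>()).add(node);
        }
        return true;
    }

    /** One replication attempt; fails if the target already stores the block (step 5). */
    boolean replicate(String block, String target) {
        if (actualReplicas.getOrDefault(target, Collections.emptySet()).contains(block)) {
            return false; // target rejects the transfer; the namenode's view is unchanged
        }
        storeOnDisk(target, block);
        namenodeView.computeIfAbsent(block, k -> new HashSet<>()).add(target);
        return true;
    }

    /** Counts failed attempts while the namenode still thinks the target lacks the block. */
    int failedAttempts(String block, String target, int rounds) {
        int failures = 0;
        for (int i = 0; i < rounds; i++) {
            boolean namenodeThinksMissing =
                !namenodeView.getOrDefault(block, Collections.emptySet()).contains(target);
            if (namenodeThinksMissing && !replicate(block, target)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        ReplicationLoopSketch cluster = new ReplicationLoopSketch();
        cluster.storeOnDisk("datanode1", "blk_example");
        cluster.blockReport("datanode1", false); // dropped: location not resolved yet
        System.out.println("failures before report: "
            + cluster.failedAttempts("blk_example", "datanode1", 5));
        cluster.blockReport("datanode1", true);  // accepted: the view is corrected
        System.out.println("failures after report: "
            + cluster.failedAttempts("blk_example", "datanode1", 5));
    }
}
```

    In this model every round fails until a block report is accepted, which matches the repeated replication attempts seen in the log.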
  • Hairong Kuang (JIRA) at Apr 1, 2008 at 9:41 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3050:
    ----------------------------------

    Attachment: blockReport.patch

    It looks like the problem is that the flag indicating whether a block report has been processed is not reset to false when a datanode re-registers. Therefore, the namenode does not ask for a block report when the datanode's network location is resolved.
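
    A minimal sketch of the flag problem described here, under the assumption that the namenode requests a block report only when the location is resolved and no report has been processed yet. The class DatanodeDescriptorSketch and its members are hypothetical names chosen for illustration; they are not taken from blockReport.patch.

```java
/** Hypothetical namenode-side record for one datanode; names are illustrative only. */
class DatanodeDescriptorSketch {
    private boolean blockReportProcessed = false;
    private boolean locationResolved = false;

    /** Buggy re-registration: the "processed" flag keeps its old value. */
    void reRegisterBuggy() {
        locationResolved = false;
    }

    /** Fixed re-registration: also forget that a report was ever processed. */
    void reRegisterFixed() {
        locationResolved = false;
        blockReportProcessed = false;
    }

    void resolveLocation() {
        locationResolved = true;
    }

    /** A report is accepted only once the network location is known. */
    boolean receiveBlockReport() {
        if (!locationResolved) {
            return false; // dropped
        }
        blockReportProcessed = true;
        return true;
    }

    /** The namenode asks for a report only if none has been processed yet. */
    boolean shouldRequestBlockReport() {
        return locationResolved && !blockReportProcessed;
    }

    public static void main(String[] args) {
        DatanodeDescriptorSketch node = new DatanodeDescriptorSketch();
        node.resolveLocation();
        node.receiveBlockReport();   // first report processed
        node.reRegisterBuggy();
        node.receiveBlockReport();   // dropped: location not resolved yet
        node.resolveLocation();
        System.out.println("buggy path requests report: " + node.shouldRequestBlockReport());
    }
}
```

    With the buggy re-registration the stale flag suppresses the request, so the dropped report is never replaced; resetting the flag lets the request go out once the location is resolved.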
  • Hairong Kuang (JIRA) at Apr 1, 2008 at 10:51 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584344#action_12584344 ]

    Hairong Kuang commented on HADOOP-3050:
    ---------------------------------------

    I have run TestDecommission with the patch 50 times in a row on my Linux box without seeing any failures.
  • Hairong Kuang (JIRA) at Apr 2, 2008 at 6:55 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3050:
    ----------------------------------

    Attachment: blockReport1.patch

    I ended up fixing more problems associated with sending block reports:
    1. The datanode does not send the initial block report until requested by the namenode.
    2. The namenode asks a datanode to send a block report, as a reply to a heartbeat, once the datanode's network location is resolved.
    3. A static field R of type Random is added to DataNode, and all uses of new Random() are replaced with R.
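
    The three changes above can be sketched roughly as follows. This is a hedged illustration rather than the contents of blockReport1.patch: HeartbeatSketch, Command, NodeState, handleHeartbeat, and randomDelayMillis are hypothetical names, and using the shared generator to pick a delay is an assumption made only to show one place where R could be used. Only the field name R comes from the comment.

```java
import java.util.Random;

/** Illustrative sketch; apart from the field name R, every name is hypothetical. */
class HeartbeatSketch {
    /** Possible replies to a heartbeat in this toy model. */
    enum Command { NONE, SEND_BLOCK_REPORT }

    // Item 3: one shared generator instead of new Random() at each call site.
    static final Random R = new Random();

    /** Namenode-side state for one datanode. */
    static class NodeState {
        boolean locationResolved;
        boolean blockReportProcessed;
    }

    /** Item 2: the heartbeat reply carries the request for a block report. */
    static Command handleHeartbeat(NodeState node) {
        if (node.locationResolved && !node.blockReportProcessed) {
            return Command.SEND_BLOCK_REPORT;
        }
        return Command.NONE;
    }

    /** Assumed use of R: a random delay in [0, intervalMillis). */
    static long randomDelayMillis(long intervalMillis) {
        return (long) (R.nextDouble() * intervalMillis);
    }

    public static void main(String[] args) {
        NodeState node = new NodeState();
        // Item 1: nothing is sent until the namenode asks.
        System.out.println("before resolution: " + handleHeartbeat(node));
        node.locationResolved = true;
        System.out.println("after resolution: " + handleHeartbeat(node));
        node.blockReportProcessed = true;
        System.out.println("after report: " + handleHeartbeat(node));
    }
}
```

    Tying the request to the heartbeat reply means the datanode cannot send its report before the namenode is able to accept it.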
  • Hairong Kuang (JIRA) at Apr 2, 2008 at 11:46 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3050:
    ----------------------------------

    Affects Version/s: 0.16.2 (was: 0.17.0)
    Fix Version/s: 0.17.0
  • Hairong Kuang (JIRA) at Apr 3, 2008 at 4:49 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3050:
    ----------------------------------

    Attachment: blockReport2.patch

    This patch makes sure that the initial block report is sent once and only once.
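
    One common way to get "once and only once" behavior is an atomic guard. The sketch below shows that general technique only; it is an assumption about how such a guarantee could be written, not the mechanism used in blockReport2.patch, and InitialReportGuardSketch and its methods are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard that lets the initial block report go out at most once. */
class InitialReportGuardSketch {
    private final AtomicBoolean initialReportSent = new AtomicBoolean(false);
    private int reportsSent = 0;

    /** Called each time the namenode requests the initial report. */
    boolean onInitialReportRequested() {
        // compareAndSet succeeds for exactly one caller, even under contention.
        if (initialReportSent.compareAndSet(false, true)) {
            reportsSent++; // stands in for actually sending the report
            return true;
        }
        return false;
    }

    int reportsSent() {
        return reportsSent;
    }

    public static void main(String[] args) {
        InitialReportGuardSketch guard = new InitialReportGuardSketch();
        guard.onInitialReportRequested();
        guard.onInitialReportRequested(); // ignored: already sent
        System.out.println("reports sent: " + guard.reportsSent());
    }
}
```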
  • Hairong Kuang (JIRA) at Apr 3, 2008 at 5:03 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3050:
    ----------------------------------

    Attachment: (was: blockReport2.patch)
  • Hairong Kuang (JIRA) at Apr 3, 2008 at 5:03 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3050:
    ----------------------------------

    Attachment: blockReport2.patch
  • Hairong Kuang (JIRA) at Apr 3, 2008 at 9:17 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3050:
    ----------------------------------

    Attachment: blockReport2.patch
  • Hairong Kuang (JIRA) at Apr 3, 2008 at 9:18 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3050:
    ----------------------------------

    Affects Version/s: 0.17.0 (was: 0.16.2)
    Status: Patch Available (was: Open)
  • Hairong Kuang (JIRA) at Apr 3, 2008 at 9:18 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3050:
    ----------------------------------

    Attachment: (was: blockReport2.patch)
  • Sanjay Radia (JIRA) at Apr 3, 2008 at 11:51 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585349#action_12585349 ]

    Sanjay Radia commented on HADOOP-3050:
    --------------------------------------

    The code is fine.
    +1
  • Hadoop QA (JIRA) at Apr 4, 2008 at 3:58 am
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585387#action_12585387 ]

    Hadoop QA commented on HADOOP-3050:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12379315/blockReport2.patch
    against trunk revision 643282.

    @author +1. The patch does not contain any @author tags.

    tests included -1. The patch doesn't appear to include any new or modified tests.
    Please justify why no tests are needed for this patch.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2151/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2151/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2151/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2151/console

    This message is automatically generated.
  • Hairong Kuang (JIRA) at Apr 4, 2008 at 5:59 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585667#action_12585667 ]

    Hairong Kuang commented on HADOOP-3050:
    ---------------------------------------

    TestDecommission triggers this bug once in a while, so no new unit test is provided.
  • Nigel Daley (JIRA) at Apr 4, 2008 at 8:32 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Nigel Daley updated HADOOP-3050:
    --------------------------------

    Fix Version/s: 0.16.3 (was: 0.17.0)
  • Hairong Kuang (JIRA) at Apr 4, 2008 at 10:03 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3050:
    ----------------------------------

    Resolution: Fixed
    Fix Version/s: 0.17.0 (was: 0.16.3)
    Status: Resolved (was: Patch Available)

    I just committed the patch.
  • Hudson (JIRA) at Apr 5, 2008 at 12:16 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585980#action_12585980 ]

    Hudson commented on HADOOP-3050:
    --------------------------------

    Integrated in Hadoop-trunk #451 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/451/])
