Namenode should synchronously resolve a datanode's network location when the datanode registers
-----------------------------------------------------------------------------------------------

Key: HADOOP-3620
URL: https://issues.apache.org/jira/browse/HADOOP-3620
Project: Hadoop Core
Issue Type: Improvement
Components: dfs
Affects Versions: 0.18.0
Reporter: Hairong Kuang
Assignee: Hairong Kuang
Fix For: 0.19.0


Release 0.18.0 removes the RPC timeout, so it is safe for the namenode to resolve a datanode's network location synchronously when the datanode registers. This could remove a lot of unnecessary code in both the datanode and the namenode that handles asynchronous network location resolution, and avoid many potential bugs.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • dhruba borthakur (JIRA) at Jun 23, 2008 at 6:51 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607341#action_12607341 ]

    dhruba borthakur commented on HADOOP-3620:
    ------------------------------------------

    It would be nice if the code could be organized so that the FSNamesystem global lock is not held while the datanode's network location is resolved. Otherwise, cluster restarts could take much longer.
  • Hairong Kuang (JIRA) at Jun 24, 2008 at 12:18 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Attachment: netResolution.patch

    Dhruba's comment makes sense. The attached initial patch still lets the thread hold the global lock while resolving a network location. I need to figure out how to avoid holding the lock without risking data structure inconsistency.

    The attached patch resolves a data node's network location when it registers. It also lets a data node randomly back off its block report when it starts up.
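
    The random back-off of the first block report mentioned above could look roughly like the sketch below. It is an illustration only, not the code in netResolution.patch: the class name BlockReportBackoff, the method initialDelayMillis, and the 60-second window are all assumptions made for the example.

```java
import java.util.Random;

/** Hypothetical helper: picks a random start-up delay for the first block report. */
final class BlockReportBackoff {
    private BlockReportBackoff() {}

    /**
     * Returns a delay in [0, maxDelayMillis) so that datanodes restarting
     * together do not all send their first block report at the same instant.
     */
    static long initialDelayMillis(Random rng, long maxDelayMillis) {
        if (maxDelayMillis <= 0) {
            return 0L; // back-off disabled
        }
        // nextDouble() is in [0.0, 1.0), so the result stays below maxDelayMillis.
        return (long) (rng.nextDouble() * maxDelayMillis);
    }

    public static void main(String[] args) {
        // Assumed 60 s window; a real value would come from configuration.
        long delay = initialDelayMillis(new Random(), 60_000L);
        System.out.println("first block report delayed by " + delay + " ms");
    }
}
```

    Spreading the first reports over a window keeps the namenode from processing every block report at once right after a cluster restart, which is when registration and resolution are also at their busiest.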
  • Amar Kamat (JIRA) at Jun 24, 2008 at 6:18 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607486#action_12607486 ]

    Amar Kamat commented on HADOOP-3620:
    ------------------------------------

    I think it also makes sense to do the same in the JobTracker, where the resolution is also asynchronous. With HADOOP-3590 getting fixed, the problem is that tasks will be scheduled randomly until the node is resolved. Some of the test cases assume that all the TTs are resolved before the job is submitted, which might not always be true. Thoughts?
  • Owen O'Malley (JIRA) at Jun 24, 2008 at 6:24 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607487#action_12607487 ]

    Owen O'Malley commented on HADOOP-3620:
    ---------------------------------------

    I'm +1 to doing all of the resolution synchronously in both the namenode and jobtracker.
  • Hairong Kuang (JIRA) at Jun 24, 2008 at 6:27 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607423#action_12607423 ]

    hairong edited comment on HADOOP-3620 at 6/24/08 11:25 AM:
    -----------------------------------------------------------------

    Dhruba's comment makes sense. The attached initial patch still lets the thread hold the global lock while resolving a network location. I need to figure out how to avoid holding the lock without risking data structure inconsistency.

    The attached patch resolves a data node's network location when it registers. It also lets a data node randomly back off its block report when it starts up.

  • Hairong Kuang (JIRA) at Jun 26, 2008 at 12:35 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Attachment: netResolution1.patch

    This patch additionally improves the performance of network resolution during registration by pre-resolving the network locations of every included host and storing them in a cache.
  • Devaraj Das (JIRA) at Jun 27, 2008 at 12:53 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608759#action_12608759 ]

    Devaraj Das commented on HADOOP-3620:
    -------------------------------------

    I haven't gone through the patch in detail but one thing which struck me is that you removed the cache from ScriptBasedMapping and moved it to the dfs part of the framework. That might be a problem for MR part of the framework.
  • Hairong Kuang (JIRA) at Jun 30, 2008 at 11:01 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Attachment: netResolution2.patch

    OK, this patch still keeps the cache in ScriptBasedMapping, but pre-resolves datanodes' network locations only when script-based mapping is used.
  • Hairong Kuang (JIRA) at Jul 9, 2008 at 11:34 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Attachment: netResolution3.patch

    This patch applies to trunk. Additionally:
    1. It provides a cached implementation of DNSToSwitchMapping, which caches all resolved DNS-to-switch mappings. Any implementation could extend CachedDNSToSwitchMapping if this helps improve its performance.
    2. ScriptBasedMapping is a CachedDNSToSwitchMapping, while RawScriptBasedMapping resolves a location only by running a script.
    3. The name node pre-resolves the network locations of all datanodes in the include list only if the configured DNSToSwitchMapping is cached.
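
    The caching design in points 1 to 3 can be sketched roughly as follows. This is not the code from netResolution3.patch: the nested SwitchMapping interface is a simplified stand-in for DNSToSwitchMapping, and CachedMappingSketch, CachingMapping, the host names, and the rack name /rack-a are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class CachedMappingSketch {

    /** Simplified stand-in for the DNSToSwitchMapping interface (assumption). */
    interface SwitchMapping {
        /** Returns one network location per host name, in the same order. */
        List<String> resolve(List<String> names);
    }

    /** Hypothetical caching decorator: each host reaches the raw mapping at most once. */
    static final class CachingMapping implements SwitchMapping {
        private final SwitchMapping raw;
        private final Map<String, String> cache = new ConcurrentHashMap<>();

        CachingMapping(SwitchMapping raw) {
            this.raw = raw;
        }

        @Override
        public List<String> resolve(List<String> names) {
            // Collect the hosts that have not been resolved yet.
            List<String> missing = new ArrayList<>();
            for (String name : names) {
                if (!cache.containsKey(name) && !missing.contains(name)) {
                    missing.add(name);
                }
            }
            // Resolve all cache misses with a single call to the raw mapping.
            if (!missing.isEmpty()) {
                List<String> resolved = raw.resolve(missing);
                for (int i = 0; i < missing.size(); i++) {
                    cache.put(missing.get(i), resolved.get(i));
                }
            }
            // Answer every request from the cache.
            List<String> result = new ArrayList<>(names.size());
            for (String name : names) {
                result.add(cache.get(name));
            }
            return result;
        }
    }

    public static void main(String[] args) {
        int[] rawCalls = {0};
        // Stand-in for the expensive raw mapping (for example, running a script).
        SwitchMapping raw = names -> {
            rawCalls[0]++;
            List<String> out = new ArrayList<>();
            for (int i = 0; i < names.size(); i++) {
                out.add("/rack-a");
            }
            return out;
        };
        SwitchMapping cached = new CachingMapping(raw);
        cached.resolve(Arrays.asList("dn1", "dn2")); // pre-resolve the include list
        cached.resolve(Arrays.asList("dn1"));        // registration: served from the cache
        System.out.println("raw calls: " + rawCalls[0]); // prints "raw calls: 1"
    }
}
```

    With a cache in front of the raw mapping, pre-resolving the include list pays the expensive lookups up front, so a resolution performed later during registration is only a map lookup.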
  • Devaraj Das (JIRA) at Jul 10, 2008 at 11:28 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12612439#action_12612439 ]

    Devaraj Das commented on HADOOP-3620:
    -------------------------------------

    As far as I can see, the resolution happens with the global FSNamesystem lock held. Is that true? Dhruba had a concern there and I don't see a nice way to handle that either. The patch looks good otherwise (subject to hudson passing it and successfully running MapReduce jobs with this patch).
  • Hairong Kuang (JIRA) at Jul 11, 2008 at 12:55 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Hairong Kuang (JIRA) at Jul 11, 2008 at 12:56 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12612714#action_12612714 ]

    Hairong Kuang commented on HADOOP-3620:
    ---------------------------------------

    It's very hard to release the global lock during network resolution without breaking data structure consistency. So what I did was pre-resolve the network locations of the datanodes that are in the include file, so that the resolution during registration is fast.
  • Hadoop QA (JIRA) at Jul 13, 2008 at 1:57 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613143#action_12613143 ]

    Hadoop QA commented on HADOOP-3620:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12385685/netResolution3.patch
    against trunk revision 676069.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 6 new or modified tests.

    -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2850/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2850/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2850/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2850/console

    This message is automatically generated.
  • Hairong Kuang (JIRA) at Jul 18, 2008 at 12:47 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Status: Open (was: Patch Available)
  • Hairong Kuang (JIRA) at Jul 18, 2008 at 12:49 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Attachment: netResolution4.patch

    This patch fixed the javadoc warning.
  • Hairong Kuang (JIRA) at Jul 18, 2008 at 12:51 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Raghu Angadi (JIRA) at Jul 18, 2008 at 1:17 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614598#action_12614598 ]

    Raghu Angadi commented on HADOOP-3620:
    --------------------------------------

    One way to move the resolution out of the global lock:

    resolveNetworkLocation does not strictly need DatanodeDescriptor.
    So registerDatanode() could look something like:
    {code}
    // Not synchronized: the network location is resolved before the global lock is taken.
    public void registerDatanode(DatanodeRegistration nodeReg) throws IOException {
      String networkLocation = resolveNetworkLocation(nodeReg.getHost());
      internalRegisterDatanode(nodeReg, networkLocation); // holds global lock.
    }
    {code}
    Does the above work?
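
    A self-contained sketch of the locking pattern proposed above, for readers who want to try it in isolation. It is not FSNamesystem code: the class RegistrationSketch, the map it guards, and the placeholder rack /default-rack are assumptions, and the sketch does not address the data structure consistency concern raised earlier in the thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the namesystem; only the locking pattern matters. */
final class RegistrationSketch {
    // Guarded by this object's monitor, playing the role of the global lock.
    private final Map<String, String> locationByHost = new HashMap<>();

    /** Stand-in for the slow step (script or DNS lookup). */
    String resolveNetworkLocation(String host) {
        // Fail loudly if a caller ever resolves while holding the lock.
        if (Thread.holdsLock(this)) {
            throw new IllegalStateException("resolution must not hold the lock");
        }
        return "/default-rack"; // placeholder result
    }

    /** Not synchronized: the slow resolution happens before the lock is taken. */
    void registerDatanode(String host) {
        String networkLocation = resolveNetworkLocation(host);
        internalRegisterDatanode(host, networkLocation);
    }

    /** Only the in-memory bookkeeping runs under the lock. */
    private synchronized void internalRegisterDatanode(String host, String networkLocation) {
        locationByHost.put(host, networkLocation);
    }

    /** Returns the recorded location, or null if the host never registered. */
    synchronized String locationOf(String host) {
        return locationByHost.get(host);
    }

    public static void main(String[] args) {
        RegistrationSketch ns = new RegistrationSketch();
        ns.registerDatanode("dn1.example.com");
        System.out.println(ns.locationOf("dn1.example.com")); // prints "/default-rack"
    }
}
```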

    Note that pre-resolving hosts in the include file might not help start-up, since
    * resolving serially at the beginning still increases the start-up time.
    * "normalizing host names" does multiple DNS resolves.

  • Raghu Angadi (JIRA) at Jul 18, 2008 at 1:21 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614598#action_12614598 ]

    rangadi edited comment on HADOOP-3620 at 7/17/08 6:19 PM:
    ---------------------------------------------------------------

    One way to move the resolution out of the global lock:

    resolveNetworkLocation does not strictly need DatanodeDescriptor.
    So registerDatanode() could look something like:
    {code}
    // Not synchronized: the network location is resolved before the global lock is taken.
    public void registerDatanode(DatanodeRegistration nodeReg) throws IOException {
      String networkLocation = resolveNetworkLocation(nodeReg.getHost());
      internalRegisterDatanode(nodeReg, networkLocation); // holds global lock.
    }
    {code}
    Does the above work?

    Note that pre-resolving hosts in the include file might not help start-up, since
    * resolving serially at the beginning still increases the start-up time.
    * "normalizing host names" does multiple DNS resolves.


    was (Author: rangadi):
    One way to move the resolution out of the global lock:

    resolveNetworkLocation() does not strictly need a DatanodeDescriptor.
    So registerDatanode() could look something like:
    {code}
    public synchronized void registerDatanode(DatanodeRegistration nodeReg)
    throws IOException {
      String networkLocation = resolveNetworkLocation(nodeReg.getHost());
      internalRegisterDatanode(nodeReg, networkLocation); // holds the global lock
    }
    {code}
    Does the above work?

    Note that pre-resolving the hosts in the include file might not help startup, since:
    * resolving serially at the beginning still increases the startup time.
    * "normalizing host names" does multiple DNS resolves.

  • Hadoop QA (JIRA) at Jul 18, 2008 at 9:46 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614682#action_12614682 ]

    Hadoop QA commented on HADOOP-3620:
    -----------------------------------

    +1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12386366/netResolution4.patch
    against trunk revision 677839.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 6 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2899/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2899/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2899/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2899/console

    This message is automatically generated.
  • Hairong Kuang (JIRA) at Jul 21, 2008 at 6:17 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615346#action_12615346 ]

    Hairong Kuang commented on HADOOP-3620:
    ---------------------------------------
    bq. Does the above work?

    I thought about this solution too. But if you look at the registration code, there is sometimes no need to resolve a node's network location, because the node has already registered. So resolving the network location up front could be unnecessary overhead.

    bq. Note that pre-resolving hosts in include file might not help start up.

    Pre-resolving should help, since it resolves network locations in batches and thereby dramatically reduces the number of calls to the rack script.
  • Raghu Angadi (JIRA) at Jul 21, 2008 at 6:42 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615355#action_12615355 ]

    Raghu Angadi commented on HADOOP-3620:
    --------------------------------------
    bq. So network resolution in the front could be an overhead.

    This may not be a problem, since a DataNode would not re-register unless there is a real problem or bug. I'm not sure we need to optimize for that. Even if we want to, we can make internalRegisterDatanode() throw an exception to indicate that the network location needs to be resolved before calling it.

    I think doing it this way will simplify the code and the patch even further.

    bq. Pre-resolving should help since it resolves network locations in batch and therefore reducing the number of calls to the rack script dramatically.

    Only if the script can do the resolutions in parallel. Does the default script make use of this? Also, two DNS resolutions are done serially for each host inside the namenode to 'normalize' the host names, right? In addition, many installations may not specify include hosts.
  • Raghu Angadi (JIRA) at Jul 21, 2008 at 6:46 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615355#action_12615355 ]

    rangadi edited comment on HADOOP-3620 at 7/21/08 11:44 AM:
    ----------------------------------------------------------------
    bq. So network resolution in the front could be an overhead.

    This may not be a problem, since a DataNode would not re-register unless there is a real problem or bug. I'm not sure we need to optimize for that. Even if we want to, we can make internalRegisterDatanode() throw an exception to indicate that the network location needs to be resolved before calling it.

    I think doing it this way will simplify the code and the patch even further.

    bq. Pre-resolving should help since it resolves network locations in batch and therefore reducing the number of calls to the rack script dramatically.

    Only if the script can do the resolutions in parallel. Does the default script make use of this? Also, two DNS resolutions are done serially for each host inside the namenode to 'normalize' the host names, right? In addition, many installations may not specify include hosts.
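
    The exception-based variant mentioned in this comment could look roughly like the sketch below. It is an illustration under assumptions, not the committed patch: LazyRegistrationSketch, LocationNeededException, the resolveCalls counter, and the fixed "/default-rack" value are hypothetical names and values. The first registration attempt passes no location; only if the node is not yet known does the caller resolve (outside the lock) and try again.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical signal that a network location must be resolved before registering. */
class LocationNeededException extends RuntimeException {
    LocationNeededException(String host) {
        super("network location needed for " + host);
    }
}

/** Hypothetical stand-in for the namenode; not Hadoop's FSNamesystem. */
class LazyRegistrationSketch {
    // Guarded by the monitor on 'this', which plays the role of the global lock.
    private final Map<String, String> locationByHost = new HashMap<>();

    // Counts placeholder resolutions so the saving is observable.
    int resolveCalls = 0;

    /** Placeholder for a slow DNS or rack-script lookup (assumed, always the same rack). */
    String resolveNetworkLocation(String host) {
        resolveCalls++;
        return "/default-rack";
    }

    public void registerDatanode(String host) {
        try {
            // Cheap attempt: succeeds without resolution if the node is already known.
            internalRegisterDatanode(host, null);
        } catch (LocationNeededException e) {
            // Resolve with no lock held, then register again with the location.
            String networkLocation = resolveNetworkLocation(host);
            internalRegisterDatanode(host, networkLocation);
        }
    }

    private synchronized void internalRegisterDatanode(String host, String networkLocation) {
        if (locationByHost.containsKey(host)) {
            return; // re-registration: keep the known location
        }
        if (networkLocation == null) {
            throw new LocationNeededException(host);
        }
        locationByHost.put(host, networkLocation);
    }

    public synchronized String locationOf(String host) {
        return locationByHost.get(host);
    }
}
```

    Registering the same host twice resolves its location only once in this sketch, at the cost of entering the critical section twice for a new node.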
  • Hairong Kuang (JIRA) at Jul 24, 2008 at 12:03 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Hairong Kuang (JIRA) at Jul 24, 2008 at 12:03 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Status: Open (was: Patch Available)
  • Hairong Kuang (JIRA) at Jul 24, 2008 at 12:09 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Attachment: netResolution5.patch

    I talked with Raghu and understood that his concern is that the number of DNS resolution calls might hurt the performance of network location resolution. In my experiment, this does not seem to be a big concern. Instead, reducing the number of calls to the script greatly improves resolution performance.

    Even so, this new patch avoids DNS resolution wherever possible. It makes the following changes:
    1. Increases maxArg of ScriptBasedMapping from 20 to 100;
    2. CachedDNSToSwitchMap maps IP addresses to network locations;
    3. Allows include/exclude host files to contain IP addresses.
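
    To illustrate why a larger batch size and an IP-keyed cache cut the number of script invocations, here is a minimal sketch. It is not the code of ScriptBasedMapping or CachedDNSToSwitchMap: BatchedCachingResolver, its constructor parameters, and the scriptCalls counter are hypothetical, and the rack script is modeled as a plain function from a batch of IP addresses to a same-sized list of locations.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical resolver: caches IP -> network location and calls the script in batches. */
class BatchedCachingResolver {
    private final int maxArgs; // maximum number of addresses per script call
    private final Function<List<String>, List<String>> script; // stand-in for the rack script
    private final Map<String, String> cache = new HashMap<>();

    // Counts script invocations so the effect of batching and caching is observable.
    int scriptCalls = 0;

    BatchedCachingResolver(int maxArgs, Function<List<String>, List<String>> script) {
        if (maxArgs <= 0) {
            throw new IllegalArgumentException("maxArgs must be positive");
        }
        this.maxArgs = maxArgs;
        this.script = script;
    }

    /** Returns one location per input address, in input order. */
    List<String> resolve(List<String> ips) {
        // Collect each uncached address once, preserving order.
        Set<String> missingSet = new LinkedHashSet<>();
        for (String ip : ips) {
            if (!cache.containsKey(ip)) {
                missingSet.add(ip);
            }
        }
        List<String> missing = new ArrayList<>(missingSet);

        // One script call per batch of at most maxArgs addresses.
        for (int start = 0; start < missing.size(); start += maxArgs) {
            List<String> batch = missing.subList(start, Math.min(start + maxArgs, missing.size()));
            List<String> locations = script.apply(batch);
            scriptCalls++;
            if (locations.size() != batch.size()) {
                throw new IllegalStateException("script returned a different number of locations");
            }
            for (int i = 0; i < batch.size(); i++) {
                cache.put(batch.get(i), locations.get(i));
            }
        }

        List<String> result = new ArrayList<>(ips.size());
        for (String ip : ips) {
            result.add(cache.get(ip));
        }
        return result;
    }
}
```

    With 250 uncached addresses, a batch size of 100 needs 3 script calls, where a batch size of 20 would need 13; resolving the same addresses again needs none because every answer comes from the cache.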
  • Hadoop QA (JIRA) at Jul 24, 2008 at 1:49 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616479#action_12616479 ]

    Hadoop QA commented on HADOOP-3620:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12386770/netResolution5.patch
    against trunk revision 679286.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 6 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2936/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2936/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2936/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2936/console

    This message is automatically generated.
  • Hairong Kuang (JIRA) at Jul 29, 2008 at 12:37 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Attachment: netResolution6.patch

    Fixed the failed unit tests.
  • Hairong Kuang (JIRA) at Jul 29, 2008 at 12:38 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Status: Open (was: Patch Available)
  • Hairong Kuang (JIRA) at Jul 29, 2008 at 12:39 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Hadoop QA (JIRA) at Jul 29, 2008 at 2:50 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617665#action_12617665 ]

    Hadoop QA commented on HADOOP-3620:
    -----------------------------------

    +1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12387066/netResolution6.patch
    against trunk revision 680577.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 6 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2965/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2965/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2965/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2965/console

    This message is automatically generated.
  • Raghu Angadi (JIRA) at Jul 29, 2008 at 10:52 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618004#action_12618004 ]

    Raghu Angadi commented on HADOOP-3620:
    --------------------------------------

    bq. But this new patch reduces all possible calls to DNS resolution. It has all the following changes: [...]

    +1 for the improvements.

    bq. I talked with Raghu and understood his concern is that the number of calls to DNS resolution might impact the performance of network location resolution. From my experiment, this seems not a big concern.

    But I don't think DNS resolution is a non-issue; see HADOOP-3694, for example. It's good that the latest patch reduces DNS resolutions.
  • Devaraj Das (JIRA) at Jul 31, 2008 at 12:38 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618655#action_12618655 ]

    Devaraj Das commented on HADOOP-3620:
    -------------------------------------

    +1
  • Hairong Kuang (JIRA) at Aug 4, 2008 at 11:35 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang updated HADOOP-3620:
    ----------------------------------

    Resolution: Fixed
    Release Note: This patch makes the namenode synchronously resolve a datanode's network location when the datanode registers. In addition, it allows, and recommends, that an include host file contain IP addresses.
    Hadoop Flags: [Reviewed]
    Status: Resolved (was: Patch Available)
  • Hairong Kuang (JIRA) at Aug 4, 2008 at 11:37 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619741#action_12619741 ]

    Hairong Kuang commented on HADOOP-3620:
    ---------------------------------------

    I've just committed this.
  • Amareshwari Sriramadasu (JIRA) at Aug 5, 2008 at 4:35 am
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619782#action_12619782 ]

    Amareshwari Sriramadasu commented on HADOOP-3620:
    -------------------------------------------------

    I noticed that the new files were not committed. Trunk doesn't compile.
  • Hudson (JIRA) at Aug 22, 2008 at 12:36 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624689#action_12624689 ]

    Hudson commented on HADOOP-3620:
    --------------------------------

    Integrated in Hadoop-trunk #581 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/581/])
  • Robert Chansler (JIRA) at Oct 21, 2008 at 10:59 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Robert Chansler updated HADOOP-3620:
    ------------------------------------

    Release Note: (was: This patch makes the namenode synchronously resolve a datanode's network location when the datanode registers. In addition, it allows, and recommends, that the include host file contain IP addresses.)

Discussion Overview
group: common-dev @
categories: hadoop
posted: Jun 23, '08 at 6:19p
active: Oct 21, '08 at 10:59p
posts: 39
users: 1
website: hadoop.apache.org...
irc: #hadoop
