Race Condition with Renewing Leases and RPC Calls
This is going to be a long email. I am seeing a race condition between
the renewing of leases on the DFS and RPC client timeout values and I
wanted to get some feedback on this.

We keep seeing leases failing to be renewed where a task has to start
over. Usually the task will complete fine, but it still bugs me that the
leases were failing to be renewed. After doing some research, here is
what I found. When the lease is not renewed we see something like this
in the log:

06/10/31 13:23:57 INFO fs.FSNamesystem: Removing lease [Lease. Holder:
DFSClient_task_0007_r_000001_0, heldlocks: 0, pendingcreates: 0], leases
remaining: 10
06/10/31 13:24:43 INFO ipc.Server: Server handler 5 on 9000 call error:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on
/user/phoenix/crawl/engine/unit2cat/part-00001
06/10/31 13:25:13 INFO ipc.Server: Server handler 2 on 9000 call error:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on
/user/phoenix/crawl/engine/unit2cat/part-00001
06/10/31 13:25:13 INFO ipc.Server: Server handler 7 on 9000 call error:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on
/user/phoenix/crawl/engine/unit2cat/part-00001
06/10/31 13:25:13 INFO ipc.Server: Server handler 5 on 9000 call error:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on
/user/phoenix/crawl/engine/unit2cat/part-00001
06/10/31 13:25:13 INFO ipc.Server: Server handler 6 on 9000 call error:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on
/user/phoenix/crawl/engine/unit2cat/.part-00001.crc
06/10/31 13:25:13 INFO fs.FSNamesystem: Attempt to release a lock from
an unknown lease holder DFSClient_task_0007_r_000001_0 for
/user/phoenix/crawl/engine/unit2cat/.part-00001.crc

Tracking this back through the code, the only way it can get into this
position is if the lease is expired by the FSNamesystem.LeaseMonitor
thread. As far as I can tell, the only ways for the lease to become
expired are: one, the system shuts down and the shutdown hook fires the
DFSClient.ClientFinalizer thread to release locks; or two, the
DFSClient.LeaseChecker thread doesn't renew the lease in time. The
system is not shutting down, so it has to be something in the LeaseChecker.

The LeaseChecker runs continuously, once a second, and makes RPC calls
to renew the leases. The leases have to be renewed every 60 seconds
(LEASE_PERIOD) or they expire and are removed by the LeaseMonitor. The
LeaseChecker renews leases when their renew time is half over, so at 30
seconds and beyond. So how is it that the leases aren't getting renewed?
Well, the LeaseChecker actually makes an RPC call to the NameNode to
renew the lease. This RPC call uses an ipc Client object which has a
default socket timeout value of 10 seconds but a configured value of 60
seconds in hadoop-default.xml.
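
For reference, the heart of the LeaseChecker loop looks roughly like
this. This is a simplified sketch of the loop I am describing, not the
actual DFSClient source; the NameNodeStub interface and renewLease call
are just stand-ins for the real NameNode RPC proxy:

import java.io.IOException;

// Simplified sketch of the lease-renewal loop: wake up about once a second
// and renew once more than half of the 60-second LEASE_PERIOD has gone by.
class LeaseCheckerSketch implements Runnable {
  static final long LEASE_PERIOD = 60 * 1000;   // NameNode expires leases after this

  // Stand-in for the NameNode RPC proxy; the real call is a blocking RPC.
  interface NameNodeStub { void renewLease(String clientName) throws IOException; }

  private final NameNodeStub namenode;
  private final String clientName;
  private volatile boolean running = true;

  LeaseCheckerSketch(NameNodeStub namenode, String clientName) {
    this.namenode = namenode;
    this.clientName = clientName;
  }

  public void run() {
    long lastRenewed = 0;
    while (running) {
      if (System.currentTimeMillis() - lastRenewed > (LEASE_PERIOD / 2)) {
        try {
          namenode.renewLease(clientName);        // can block up to ipc.client.timeout
          lastRenewed = System.currentTimeMillis();
        } catch (IOException ioe) {
          // renewal failed; try again on a later pass through the loop
        }
      }
      try { Thread.sleep(1000); } catch (InterruptedException ie) { }
    }
  }
}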

The Client class uses a Connection object that has a thread which waits
for input from the SocketInputStream. So on the RPC method invocation
we write the request through the Connection's sendParam and wait for the
return Writable value on the input stream in the Connection thread.
This input stream will continually wait for input, and the RPC call will
block while it is waiting. Now let's suppose that for some reason the
RPC call goes bad and blocks. It is going to wait 60 seconds (the
configured timeout) before it can try again. If the lease hasn't just
been renewed, then a race condition can occur where the lease expires
and is potentially removed by the LeaseMonitor before the RPC call times
out and can be retried.
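
To put numbers on that worst case (just arithmetic, using the 60-second
LEASE_PERIOD and the 60-second ipc.client.timeout from hadoop-default.xml):

// Worst-case timeline with the stock settings described above.
public class LeaseRaceTimeline {
  public static void main(String[] args) {
    long leasePeriod  = 60 * 1000;         // NameNode expires the lease at t = 60s
    long firstAttempt = leasePeriod / 2;   // LeaseChecker first renews at t = 30s
    long ipcTimeout   = 60 * 1000;         // ipc.client.timeout from hadoop-default.xml

    // If that first renewal RPC hangs, the soonest it can be retried is:
    long earliestRetry = firstAttempt + ipcTimeout;   // t = 90s

    System.out.println("lease expires at t = " + leasePeriod + " ms");
    System.out.println("earliest retry at t = " + earliestRetry + " ms");
    // 90s comes after the 60s expiry, so the LeaseMonitor can remove the
    // lease before the renewal ever gets retried.
  }
}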

I think a fix for this would be either to change the default timeout
value in hadoop-default.xml, which could have other unintended
consequences, or to increase LEASE_PERIOD to something around 2
minutes. I am looking for some feedback on what everyone thinks.

Dennis Kubes


  • Michael Stack at Nov 1, 2006 at 8:43 pm
    I have experienced the same condition.

    Here is the exception I was seeing:

    2006-09-17 22:29:08,798 INFO org.apache.hadoop.mapred.TaskTracker:
    task_0001_r_000032_3 0.7986655% reduce > reduce
    2006-09-17 22:29:27,112 INFO org.apache.hadoop.mapred.TaskRunner:
    task_0001_r_000032_3 06/09/17 22:29:27 WARN mapred.TaskTracker: Error
    running child
    2006-09-17 22:29:12,854 INFO org.apache.hadoop.mapred.TaskRunner:
    task_0001_r_000032_3 org.apache.hadoop.ipc.RemoteException:
    org.apache.hadoop.dfs.LeaseExpiredException: No lease on
    /user/stack/nla/2005-outputs/segments/20060915202116-nla2005/crawl_fetch/part-00032/data
    2006-09-17 22:29:12,854 INFO org.apache.hadoop.mapred.TaskRunner:
    task_0001_r_000032_3 at
    org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:454)
    2006-09-17 22:29:12,854 INFO org.apache.hadoop.mapred.TaskRunner:
    task_0001_r_000032_3 at
    org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:228)
    2006-09-17 22:29:12,854 INFO org.apache.hadoop.mapred.TaskRunner:
    task_0001_r_000032_3 at
    sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
    2006-09-17 22:29:12,854 INFO org.apache.hadoop.mapred.TaskRunner:
    task_0001_r_000032_3 at
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    2006-09-17 22:29:12,855 INFO org.apache.hadoop.mapred.TaskRunner:
    task_0001_r_000032_3 at java.lang.reflect.Method.invoke(Method.java:585)
    2006-09-17 22:29:12,855 INFO org.apache.hadoop.mapred.TaskRunner:
    task_0001_r_000032_3 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:332)
    2006-09-17 22:29:12,855 INFO org.apache.hadoop.mapred.TaskRunner:
    task_0001_r_000032_3 at
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:468)
    2006-09-17 22:29:12,855 INFO org.apache.hadoop.mapred.TaskRunner:
    task_0001_r_000032_3
    2006-09-17 22:29:12,855 INFO org.apache.hadoop.mapred.TaskRunner:
    task_0001_r_000032_3 at
    org.apache.hadoop.ipc.Client$Connection.run(Client.java:159)
    2006-09-17 22:29:13,673 INFO org.apache.hadoop.ipc.Server: Server
    listener on port 50050: disconnecting client 207.241.227.163. Number of
    active connections: 0

    After reading this mail from the archive
    (http://mail-archives.apache.org/mod_mbox/lucene-hadoop-dev/200607.mbox/%3C331ED54F-9FA7-48FE-A604-017CC54DA524@yahoo-inc.com%3E),
    I made the following patch for my hadoop instances:

    Index: src/java/org/apache/hadoop/dfs/DFSClient.java
    ===================================================================
    --- src/java/org/apache/hadoop/dfs/DFSClient.java (revision 409788)
    +++ src/java/org/apache/hadoop/dfs/DFSClient.java (working copy)
    @@ -403,18 +434,23 @@
         public void run() {
           long lastRenewed = 0;
           while (running) {
    -        if (System.currentTimeMillis() - lastRenewed > (LEASE_PERIOD / 2)) {
    +        // Divide by 3 instead of by 2 so we start renewing earlier
    +        // and set down "ipc.client.timeout" from its 60 to 20 or 30.
    +        // See this note for why:
    +        // http://mail-archives.apache.org/mod_mbox/lucene-hadoop-dev/200607.mbox/%3C331ED54F-9FA7-48FE-A604-017CC54DA524@yahoo-inc.com%3E
    +        if (System.currentTimeMillis() - lastRenewed > (LEASE_PERIOD / 3)) {

    Timing-wise, the patch and config change make it so we should get in at
    least one renewal attempt before the lease times out. This got rid of the
    lease expiration exceptions for me, but the above is less than optimal.
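
    For what it is worth, the same back-of-envelope arithmetic with the patched
    threshold and a lowered timeout (30 seconds is assumed here; 20 would leave
    even more slack) shows why at least one retry now fits inside the lease
    period:

    // Timing with LEASE_PERIOD / 3 and a lowered ipc.client.timeout (sketch).
    public class PatchedLeaseTimeline {
      public static void main(String[] args) {
        long leasePeriod  = 60 * 1000;         // NameNode still expires at t = 60s
        long firstAttempt = leasePeriod / 3;   // patched loop first renews at t = 20s
        long ipcTimeout   = 30 * 1000;         // ipc.client.timeout set down from 60s

        long earliestRetry = firstAttempt + ipcTimeout;   // t = 50s
        System.out.println("earliest retry at t = " + earliestRetry + " ms");
        // 50s is still inside the 60s lease period, so even a hung first
        // renewal leaves one more attempt before the lease expires.
      }
    }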

    St.Ack
  • Doug Cutting at Nov 1, 2006 at 9:06 pm

    Dennis Kubes wrote:
    This is going to be a long email. I am seeing a race condition between
    the renewing of leases on the DFS and RPC client timeout values and I
    wanted to get some feedback on this.
    Does https://issues.apache.org/jira/browse/HADOOP-563 fix this?

    Doug
  • Dhruba Borthakur at Nov 1, 2006 at 9:15 pm
    Hi Dennis,

    I agree with your scenario, in which the client may not get a chance to
    renew a lease within the 1-minute interval. However, this will occur only
    if an RPC goes "bad". You said: "Now let's suppose that for some reason the
    RPC call goes bad and blocks. It is going to wait 60 seconds (the
    configured timeout) before it can try again." So the main reason for these
    timeouts is that an RPC is stuck at the server for more than a minute. Any
    ideas why this is so?

    One cause could be that there is only one thread in the DFSClient that
    renews leases. Suppose the client has to renew leases from many servers. If
    one of the servers is down, then the RPC to that server might get blocked,
    thus causing all other leases to other servers to time out too. Maybe the
    DFSClient can have a lease-renewal thread per server.
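
    Something along these lines, just to sketch the idea (the class and method
    names here are made up for illustration, not actual DFSClient code):

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical per-server lease renewer: each server gets its own thread,
    // so a hung renewal RPC to one server cannot starve renewals to the rest.
    class PerServerLeaseRenewer {
      interface LeaseTarget { void renewLease(String clientName) throws IOException; }

      private final Map<InetSocketAddress, Thread> renewers =
          new HashMap<InetSocketAddress, Thread>();

      synchronized void startRenewing(final InetSocketAddress server,
                                      final LeaseTarget target,
                                      final String clientName) {
        if (renewers.containsKey(server)) {
          return;                               // already renewing against this server
        }
        Thread t = new Thread(new Runnable() {
          public void run() {
            while (!Thread.currentThread().isInterrupted()) {
              try {
                target.renewLease(clientName);  // blocks only this server's thread
              } catch (IOException ioe) {
                // failed renewal against this server; retry on the next pass
              }
              try {
                Thread.sleep(30 * 1000);        // renew well inside the lease period
              } catch (InterruptedException ie) {
                return;
              }
            }
          }
        }, "LeaseRenewer-" + server);
        t.setDaemon(true);
        t.start();
        renewers.put(server, t);
      }
    }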

    Thanks,
    dhruba

  • Dennis Kubes at Nov 1, 2006 at 10:15 pm
    Each FileSystem instance, if in DFS mode, uses a DistributedFileSystem
    implementation class, which has one non-singleton instance of the
    DFSClient, which in turn has a LeaseChecker thread that monitors only its
    own lease and communicates only with the single NameNode. Because of this
    I don't see how the DFSClient would be renewing leases from multiple
    servers.

    I think the renew thread just runs into some type of temporary network
    problem, like a dropped socket or too much network traffic, and then has to
    wait too long, so the lease expires on the NameNode. I also don't think it
    is the NameNode getting overloaded, because we run our NameNode on its own
    server and the load factor on the box is 0.03 to 0.10.

    I think the HADOOP-563 patches will fix this though. I will try it and see.

    Dennis


