[ https://issues.apache.org/jira/browse/ZOOKEEPER-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15338013#comment-15338013 ]

Chris Nauroth commented on ZOOKEEPER-2366:

[~fpj], thank you for the new patch.

This is a repeat of an earlier comment: This wasn't introduced by your patch, but it's generally a bad practice to catch an {{InterruptedException}} and neither propagate it nor re-raise the interrupted status. Other layers of code might be expecting to react to thread interruption. Since we're touching this code, do you think we should add a call to {{Thread.currentThread().interrupt()}}?

     public void reconfigure(InetSocketAddress addr) {
        Channel oldChannel = parentChannel;
        try {
            LOG.info("binding to port " + addr);
            parentChannel = bootstrap.bind(addr);
            localAddress = addr;
        } catch (Exception e) {
            LOG.error("Error while reconfiguring", e);
        } finally {

I don't see a checked exception raised in that code block. Is the catch block there in anticipation of an unchecked exception that you've seen?

Hmmm... this isn't directly related to your patch, but something seems odd here. The Netty bind and close calls are asynchronous, so when this method returns, it's not guaranteed that the bind and close have completed. I wonder if we have a race condition lurking.

A very minor nitpick:

             LOG.info("binding to port " + addr);

                 LOG.error("Error joining old acceptThread when reconfiguring client port " + e.getMessage());

             LOG.error("Error reconfiguring client port to " + addr + " " + e.getMessage());

            LOG.info("binding to port " + addr);

Since we're touching these lines of code anyway for the reformatting, let's take the opportunity to switch to the SLF4J calling convention with {} token substitutions.

[~shralex], would you also like to review the latest approach?
Reconfiguration of client port causes a socket leak

URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2366
Project: ZooKeeper
Issue Type: Bug
Components: quorum
Affects Versions: 3.5.0
Reporter: Timothy Ward
Assignee: Flavio Junqueira
Priority: Blocker
Fix For: 3.5.2

Attachments: ZOOKEEPER-2366.patch, ZOOKEEPER-2366.patch, ZOOKEEPER-2366.patch, zookeeper.patch

The NIOServerCnxnFactory reconfigure method can leak server sockets, and hence make ports unusable until the JVM restarts:
The first line of the method takes a reference to the current ServerSocketChannel and then the next line replaces it. The subsequent interactions with the server socket can fail (for example if the reconfiguration tries to bind to an in-use port). If they fail *before* the call to oldSS.close() then oldSS is *never* closed. This holds that port open forever, and prevents the user from rolling back to the previous port!
The code from reconfigure is shown below:
ServerSocketChannel oldSS = ss;
try {
this.ss = ServerSocketChannel.open();
LOG.info("binding to port " + addr);
try {
} catch (InterruptedException e) {
LOG.error("Error joining old acceptThread when reconfiguring client port " + e.getMessage());
acceptThread = new AcceptThread(ss, addr, selectorThreads);
} catch(IOException e) {
LOG.error("Error reconfiguring client port to " + addr + " " + e.getMessage());

This message was sent by Atlassian JIRA

Search Discussions

Discussion Posts


Follow ups

Related Discussions

Discussion Navigation
viewthread | post
posts ‹ prev | 4 of 5 | next ›
Discussion Overview
groupdev @
categorieszookeeper, hadoop
postedJun 17, '16 at 10:07a
activeJun 18, '16 at 5:07p



site design / logo © 2022 Grokbase