Hello.

We have RabbitMQ 2.7.1 Java clients connected remotely to the server.
We started experiencing short periods of bad network connectivity, and a serious
problem occurred. Our settings:
1. factory.setRequestedHeartbeat set to 30s
2. factory.setConnectionTimeout set to 30000ms
The client properly closes the connection after missing 30 seconds of heartbeats.
But sometimes it hangs completely when it tries to open a new connection.

I tried to analyze the Java client code, and this is what I found:

AMQConnection.java:286 :
_frameHandler.setTimeout(HANDSHAKE_TIMEOUT); - socket.soTimeout
is set to 10s here
then it starts the MainLoop at line 294
and blocks until it gets a reply to the handshake at line 300:
connStart = (AMQP.Connection.Start) connStartBlocker.getReply().getMethod();
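
To illustrate the difference between a blocking wait of this kind and one with a deadline, here is a minimal sketch using only the JDK. The class BoundedReplyWait, the method awaitReply and the use of CompletableFuture are illustrative assumptions; this is not the RabbitMQ client's code, nor necessarily how the client should be changed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical model of waiting for a handshake reply; not the client's classes.
class BoundedReplyWait {
    /** Waits at most timeoutMillis for a reply; returns null if none arrives in time. */
    static String awaitReply(CompletableFuture<String> reply, long timeoutMillis) {
        try {
            return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return null; // the caller can now give up instead of blocking forever
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("reply failed", e);
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> answered =
                CompletableFuture.completedFuture("connection.start");
        CompletableFuture<String> silent = new CompletableFuture<>(); // never completed
        System.out.println(awaitReply(answered, 100));
        System.out.println(awaitReply(silent, 100));
    }
}
```

A plain reply.get() with no timeout argument would never return for the second future, which is the shape of the wait at line 300.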

The problem is that it may never get a reply, because
MainLoop relies on the heartbeat mechanism to handle this situation, and
heartbeats are not enabled yet. That happens only at line 368:
setHeartbeat(heartbeat);
MainLoop runs endlessly at line 492:
Frame frame = _frameHandler.readFrame();
which returns null every 10s (this is how SocketTimeoutException is handled in
Frame.readFrom..)
and handleSocketTimeout() does nothing because _heartbeat is not set yet.
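
The behaviour described above can be modelled in a few lines. This is a simplified sketch, not the client's actual MainLoop: the class ReadLoopModel, its parameters, and the maxMissed threshold are all assumptions made for illustration. A null frame stands for a socket read timeout.

```java
import java.util.function.Supplier;

// Simplified, hypothetical model of a read loop that ignores timeouts
// while no heartbeat has been configured.
class ReadLoopModel {
    /**
     * Returns the number of consecutive timeouts after which the loop gave up,
     * or -1 if it used all maxReads reads without ever giving up
     * (a real loop would simply keep spinning).
     */
    static int run(Supplier<String> readFrame, int heartbeatSeconds,
                   int maxMissed, int maxReads) {
        int missed = 0;
        for (int i = 0; i < maxReads; i++) {
            String frame = readFrame.get();
            if (frame != null) {
                missed = 0; // traffic seen, reset the counter
                continue;
            }
            if (heartbeatSeconds == 0) {
                continue; // heartbeat not set yet: the timeout is silently ignored
            }
            missed++;
            if (missed > maxMissed) {
                return missed; // heartbeat enabled: the loop gives up
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(run(() -> null, 0, 8, 1000));  // prints -1
        System.out.println(run(() -> null, 30, 8, 1000)); // prints 9
    }
}
```

With the heartbeat at 0 the model never exits on its own, no matter how many reads time out; with a non-zero heartbeat it stops once the missed count passes the threshold.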

Thanks.

  • Ильдар Нурисламов at Feb 13, 2012 at 9:28 am
    Can anybody help with this problem, or prove that I'm wrong?

  • Steve Powell at Feb 16, 2012 at 1:27 pm
    Dear Ildar,

    First, may I apologise for not getting back to you sooner. It seems that you
    have clearly identified a bug, and have helped to narrow it down for us.

    Thank you very much. I have raised a problem for us to fix and track this
    (24747).

    I have a few comments regarding your settings: it seems to me that a heartbeat
    of 30s is not unreasonable, but you should be aware that anything up to a minute
    may pass before noticing that a heartbeat is missed, so you must not rely on
    this interval.
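
    As a worked example of the figure above: a 30s heartbeat with detection
    taking up to a minute corresponds to a window of twice the interval. The
    sketch below assumes that factor of two (inferred from the numbers quoted
    here, not taken from the client code); HeartbeatWindow and
    worstCaseDetection are hypothetical names.

    ```java
    import java.time.Duration;

    // Hypothetical helper: worst-case time to notice a dead peer,
    // assuming detection takes up to two heartbeat intervals.
    class HeartbeatWindow {
        static Duration worstCaseDetection(Duration heartbeat) {
            return heartbeat.multipliedBy(2);
        }

        public static void main(String[] args) {
            // 30s heartbeat -> up to 60s before the loss is noticed
            System.out.println(worstCaseDetection(Duration.ofSeconds(30)).getSeconds());
        }
    }
    ```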

    The ConnectionTimeout will only affect waiting for the socket connection so is
    not involved in this. I think your interval here is again quite large, but not
    unreasonable in unreliable networks. I would expect the heartbeat to be about
    half of this (see note above).
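
    The distinction between a connect timeout and a read timeout can be seen
    with plain java.net sockets. This is only a sketch of the general socket
    behaviour, not of the RabbitMQ client; ConnectVsReadTimeout and
    readTimesOut are hypothetical names, and the silent local server stands in
    for a broker that accepts the TCP connection but never answers.

    ```java
    import java.io.IOException;
    import java.io.UncheckedIOException;
    import java.net.InetAddress;
    import java.net.InetSocketAddress;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.net.SocketTimeoutException;

    class ConnectVsReadTimeout {
        /** Returns true if the read timed out after a successful connect. */
        static boolean readTimesOut(int connectTimeoutMs, int readTimeoutMs) {
            InetAddress loopback = InetAddress.getLoopbackAddress();
            // The listening socket completes the TCP handshake but never sends data.
            try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
                 Socket socket = new Socket()) {
                // The connect timeout only bounds establishing the connection.
                socket.connect(
                        new InetSocketAddress(loopback, silentServer.getLocalPort()),
                        connectTimeoutMs);
                // The read timeout (SO_TIMEOUT) bounds each blocking read afterwards.
                socket.setSoTimeout(readTimeoutMs);
                try {
                    socket.getInputStream().read();
                    return false; // data or end-of-stream arrived
                } catch (SocketTimeoutException e) {
                    return true; // connected fine, but nothing came back
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
    ```

    Here the connect succeeds well within its timeout, and only the separate
    read timeout notices that the peer stays silent.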

    We'll get on to this bug asap.
    Steve Powell
    steve at rabbitmq.com
    [wrk: +44-2380-111-528] [mob: +44-7815-838-558]
  • Steve Powell at Feb 21, 2012 at 10:50 am
    Ildar,

    Just a note to confirm that this bug is now fixed and will be in the next release.

    Steve Powell (a funny bunny)
    ----------some more definitions from the SPD----------
    vermin (v.) Treating the dachshund for roundworm.
    chinchilla (n.) Cooling device for the lower jaw.
    socialcast (n.) Someone to whom everyone is speaking but nobody likes.

Discussion Overview
group: rabbitmq-discuss
categories: rabbitmq
posted: Feb 7, '12 at 6:02a
active: Feb 21, '12 at 10:50a
posts: 4
users: 2
website: rabbitmq.com
irc: #rabbitmq