We have RabbitMQ 2.4.1 on Debian squeeze; Erlang is R14A.

Currently we have a dozen or so queues and 2-3 exchanges, with message
rates of about 500/sec to 2000/sec across all the queues. These are all
direct exchanges, but both the exchanges and queues are durable and
all publishing is persistent.
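
For reference, a durable/persistent setup like that looks roughly as follows
with the Python pika client (an illustrative sketch, not our actual client
code; the exchange, queue, and routing-key names are invented):

import pika

# Illustrative only: durable direct exchange, durable queue, persistent publish.
conn = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
ch = conn.channel()
ch.exchange_declare(exchange="events", exchange_type="direct", durable=True)
ch.queue_declare(queue="events.q1", durable=True)
ch.queue_bind(queue="events.q1", exchange="events", routing_key="q1")
ch.basic_publish(
    exchange="events",
    routing_key="q1",
    body=b"payload",
    properties=pika.BasicProperties(delivery_mode=2),  # 2 = persistent
)
conn.close()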

What we see is that the memory footprint of rabbitmq grows after running
for just a couple to a few hours, and eventually we get VM alarms in the
rabbitmq log. The funny thing is that we never queue up more than 10-50
messages in any queue, usually single digits. So the activity seems low,
and the queue sizes are small in terms of message counts.

We saw this with 2.3.1, then upgraded to 2.4.1 and see the same thing.

With this level of usage this does not make sense. Any ideas?

Thanks,
Mark.

--
Principal Engineer
Cheyenne Software Engineering
mark.geib at echostar.com / 35-215

PGP fingerprint: 6DFC 389D 9796 0188 92E5 58F5 34C5 6B47 D091 76FD

  • Alexis Richardson at Apr 28, 2011 at 9:41 am
    Is there anything in the logs apart from VM alarms?

    Can you show us some client code please?

  • Matthias Radestock at Apr 28, 2011 at 9:48 am
    Mark,

    Alexis Richardson wrote:
    Is there anything in the logs apart from VM alarms?

    Can you show us some client code please?
    ...and when you say "we never queue up more than 10-50 messages in any
    queue", did you check with 'rabbitmqctl list_queues' that there really
    are no more than 10-50 messages in any of your queues, and that there
    are no stale queues? Also, how large are the messages?
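
    A rough way to check this periodically, sketched in Python around the
    rabbitmqctl invocation (the 50-message threshold is just an example figure):

    import subprocess

    # List each queue's name, total message count and unacked count, and flag
    # anything that looks unexpectedly deep or stale.
    out = subprocess.run(
        ["rabbitmqctl", "list_queues", "name", "messages", "messages_unacknowledged"],
        capture_output=True, text=True, check=True,
    ).stdout
    for line in out.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1].isdigit():
            name, total, unacked = parts[0], int(parts[1]), int(parts[2])
            if total > 50 or unacked > 50:
                print(f"deep queue: {name} total={total} unacked={unacked}")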

    Regards,

    Matthias.
  • Matthew Sackman at Apr 28, 2011 at 12:28 pm

    Oh, and are your clients actually acking msgs properly? (rabbitmqctl
    list_queues name messages messages_unacknowledged)
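
    For comparison, a consumer that acks explicitly looks roughly like this
    with the Python pika client (an illustrative sketch; the queue name is
    invented):

    import pika

    def handle(ch, method, properties, body):
        # ... process the message, then acknowledge it; without the ack,
        # messages_unacknowledged grows and rabbit keeps those messages around.
        ch.basic_ack(delivery_tag=method.delivery_tag)

    conn = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    ch = conn.channel()
    ch.basic_consume(queue="events.q1", on_message_callback=handle, auto_ack=False)
    ch.start_consuming()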

    Matthew
  • Matthew Sackman at Apr 28, 2011 at 12:27 pm
    Hi Mark,
    On Wed, Apr 27, 2011 at 05:31:23PM -0600, Mark Geib wrote:
    We have RabbitMQ 2.4.1 on Debian squeeze; Erlang is R14A.
    Yes, R14A, which is an unstable beta release of Erlang
    (http://www.erlang.org/download_release/7) and should _never_ have been
    selected as the version of Erlang for a stable release of Debian. Sigh.
    What we see is that the memory footprint of rabbitmq grows after running
    for just a couple to a few hours, and eventually we get VM alarms in the
    rabbitmq log. The funny thing is that we never queue up more than 10-50
    messages in any queue, usually single digits. So the activity seems low,
    and the queue sizes are small in terms of message counts.

    We saw this with 2.3.1, then upgraded to 2.4.1 and see the same thing.
    Hmm, I'd expect some growth due to memory fragmentation, but not to the
    extent you're suggesting. The connections to the server aren't SSL
    connections, are they? Also, how much memory does the server have, and
    have you left the high watermark at its default of 0.4?
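
    (For reference, the 0.4 is a fraction of installed RAM; a quick
    back-of-envelope check of what that threshold comes to on a given Linux
    box, sketched in Python:)

    import re

    # The default vm_memory_high_watermark of 0.4 means memory alarms fire once
    # rabbit's reported memory use exceeds roughly 40% of installed RAM.
    with open("/proc/meminfo") as f:
        meminfo = f.read()
    total_kb = int(re.search(r"MemTotal:\s+(\d+) kB", meminfo).group(1))
    print(f"alarm threshold ~ {total_kb * 0.4 / 1024 / 1024:.2f} GB")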

    Matthew
  • Aaron Westendorf at Apr 28, 2011 at 12:34 pm
    Are you running the management plugin, specifically the web interface? We
    haven't had trouble with the agents, but we observed the same pattern
    you're describing on a node that was running the web stack, with similar
    traffic loads. I haven't had time to write up a bug report yet, but I
    mentioned it in another thread.

    -Aaron

    --
    Aaron Westendorf
    Senior Software Engineer
    Agora Games
    359 Broadway
    Troy, NY 12180
    Phone: 518.268.1000
    aaron at agoragames.com
    www.agoragames.com
  • Gavin M. Roy at May 2, 2011 at 6:32 pm
    In checking my cluster, I am seeing this too. It looks like it pre-existed
    2.4.1; the drop at week 16 is an upgrade from 2.3.1 to 2.4.1:

    [image: Memory.png]

    The management plugin is reporting RAM usage for this node at 330MB, and
    Rabbit is the only service running on the box. Does anyone have ideas on
    how to figure out where the leak is?

    Gavin

  • Matthias Radestock at May 2, 2011 at 6:48 pm
    Gavin,

    Gavin M. Roy wrote:
    In checking my cluster, I am seeing this too. It looks like it pre-existed
    2.4.1; the drop at week 16 is an upgrade from 2.3.1 to 2.4.1:

    [image: Memory.png]

    The management plugin is reporting RAM usage for this node at 330MB, and
    Rabbit is the only service running on the box. Does anyone have ideas on
    how to figure out where the leak is?
    The above graph shows system memory usage. It is perfectly normal for a
    non-idle Linux system to gradually fill up all the memory with cached
    files. And, as you say, rabbit thinks it is only using 330MB. Is the
    rabbit Erlang process considerably bigger than that?

    Mark reported that he was seeing memory alarms in the logs, which only
    happens when rabbit itself thinks it has exceeded the watermark.

    Regards,

    Matthias.
  • Gavin M. Roy at May 2, 2011 at 7:12 pm

    On Monday, May 2, 2011 at 2:48 PM, Matthias Radestock wrote:
    Gavin,

    The above graph shows system memory usage. It is perfectly normal for a
    non-idle Linux system to gradually fill up all the memory with cached
    files.
    Which is what I expect from the disk buffers in that graph. The inactive
    memory is what threw me; going back and re-reading up on it, this is memory
    that has previously been allocated in the VM and can be reclaimed for other
    use, correct? I had read, I thought, that it was memory that had been
    allocated, not yet freed, and was not actively being used.
    And, as you say, rabbit thinks it is only using 330MB. Is the
    rabbit Erlang process considerably bigger than that?
    No, it is in that range.

    Gavin
  • Matthias Radestock at May 2, 2011 at 7:29 pm
    Gavin,

    Gavin M. Roy wrote:
    On Monday, May 2, 2011 at 2:48 PM, Matthias Radestock wrote:
    The above graph shows system memory usage. It is perfectly normal for a
    non-idle Linux system to gradually fill up all the memory with cached
    files.
    Which is what I expect from the disk buffers in that graph. The inactive
    memory is what threw me; going back and re-reading up on it, this is memory
    that has previously been allocated in the VM and can be reclaimed for other
    use, correct?
    I have no idea. Consult your resident Linux expert ;)

    As long as the system is not swapping it is probably fine. Have you ever
    seen it swap?
    And, as you say, rabbit thinks it is only using 330MB. Is the
    rabbit Erlang process considerably bigger than that?
    No, it is in that range.
    In which case it's clearly not rabbit that is using the memory. So don't
    blame the poor bunny.


    Regards,

    Matthias.
  • Daniel Maher at May 3, 2011 at 7:46 am

    We saw exactly this behaviour in our environment as well: steadily
    increasing memory usage over time, related to Rabbit virtual memory
    usage increasing over time; however, Rabbit real memory usage remained
    relatively low.

    The culprit in our case was that we had created far too many queues and
    weren't clearing them out when they weren't being used. Once we started
    managing this properly (read: architecting our queues more efficiently
    and destroying them when not in use), our memory usage - real and
    virtual - plummeted.
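
    In pika terms, the sort of cleanup we mean looks roughly like this (an
    illustrative sketch, not our actual code; the queue name is invented):

    import pika

    conn = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    ch = conn.channel()

    # Short-lived queues can be declared auto_delete so the broker drops them
    # when the last consumer goes away, instead of leaving them lying around.
    ch.queue_declare(queue="worker.tmp", auto_delete=True)

    # Queues we create explicitly are deleted explicitly once we are done.
    ch.queue_delete(queue="worker.tmp")

    conn.close()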

    Cheers,


    --
    Daniel Maher
    "can't talk, too busy calculating computrons."
  • Mark Geib at May 3, 2011 at 3:28 pm
    Quick update: at least in our case this issue was completely resolved by
    upgrading Erlang to R14B02, from R14A.

    During our investigation we removed all plugins, with no change. We have
    since restored the admin plugins, etc. and see no issues. There appears to
    be a slight performance hit with the admin plugins enabled, but nothing
    significant.

    Mark.

    --
    Principal Engineer
    Cheyenne Software Engineering
    mark.geib at echostar.com / 35-215

    PGP fingerprint: 6DFC 389D 9796 0188 92E5 58F5 34C5 6B47 D091 76FD
  • Matthias Radestock at May 3, 2011 at 3:34 pm

    On 03/05/11 08:46, Daniel Maher wrote:
    We saw exactly this behaviour in our environment as well: steadily
    increasing memory usage over time, related to Rabbit virtual memory
    usage increasing over time; however, Rabbit real memory usage remained
    relatively low.
    That is *not* the same behaviour as Gavin is describing. He is seeing the
    rabbit process memory usage (both as reported by rabbit itself and by the
    O/S) staying fairly modest, at around 330MB of virtual memory. However,
    the memory usage on the machine overall is increasing over time.

    Regards,

    Matthias.
  • Gavin M. Roy at May 4, 2011 at 7:42 pm

    On Mon, May 2, 2011 at 3:29 PM, Matthias Radestock wrote:

    I have no idea. Consult your resident Linux expert ;)

    Some more color: I decided to dive deeper into this to see what I could
    find. The short version is that I jumped into erl using -remsh, ran
    erlang:garbage_collect(), and saw an immediate drop in that inactive memory.

    I am using R14B01. I am reluctant to let that inactive memory hit the upper
    bound of my RAM, but at the same time I want to see whether the Erlang VM
    is just allocating until it can't and will then reclaim usage.

    I did, however, find quite clearly that by issuing the command in the
    attached Erlang process, inactive memory reported by the Linux kernel
    dropped by 83%.

    Commands run:

    $ erl -sname temp -remsh rabbit@rabbit03
    1> erlang:garbage_collect().
    true

    It may not be the bunny, but it seems to be the VM.

    Gavin
  • David Wragg at May 5, 2011 at 10:23 am

    "Gavin M. Roy" <gmr at myyearbook.com> writes:
    Some more color: I decided to dive deeper into this to see what I could
    find. The short version is that I jumped into erl using -remsh, ran
    erlang:garbage_collect(), and saw an immediate drop in that inactive
    memory.
    By inactive memory, you mean that measured by the "Inactive" line in
    /proc/meminfo?
    I am using R14B01. I am reluctant to let that inactive memory hit the upper
    bound of my RAM, but at the same time I want to see whether the Erlang VM
    is just allocating until it can't and will then reclaim usage.
    Inactive memory, in the Linux MM sense, is merely memory that has been
    touched less recently, and would therefore be more eligible to be
    reclaimed when the kernel wants to free up memory. Unless you are
    looking into the internal details of Linux memory management, it is not
    a particularly interesting number. It is completely unrelated to Erlang
    VM memory allocations, and certainly does not indicate a memory leak.

    David

    --
    David Wragg
    Staff Engineer, RabbitMQ
    VMware, Inc.
