I found that Solaris continued to crash regardless of what settings I
used for the high watermark. As memsup on Solaris was questionable I
moved to Linux.
I moved to a quad-core CentOS box with 9 GB of memory. Unfortunately,
I'm seeing similar crashes, though they take longer to occur.
What I'm seeing is usually some variation of the following:
1. Start broker clean
2. Start 1 consumer
3. Start 10 producers that publish as fast as they can (we're trying
to stress the system)
4. Once system memory reaches the high water mark throttling occurs
(I've tried settings from about 40% - 95%, the rest of the
observations will refer to measurements at 70%).
5. Throttling toggles on and off a few times (typically 3 to 5), and
then all clients (including the consumer) get disconnected.
6. Memory soars to over 90%.
7. Sometimes the Erlang process crashes at this point; other times all
producers and consumers reconnect and within about 30 seconds the
erlang process crashes. In most cases the producers never produce
after the reconnect, but on some occasions the consumer does receive
messages before dying.
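For anyone trying to reproduce the memory readings in the steps above on a Linux box, here is a minimal sketch of how the used-memory fraction could be sampled from /proc/meminfo and compared against a watermark. This is not how memsup or RabbitMQ computes it; the function names, the 0.70 default, and the choice of whether buffers/cache count as "used" are all assumptions for illustration only.

```python
# Hypothetical helper for watching memory during the stress test.
# NOT memsup's algorithm; just a rough external cross-check.

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_fraction(fields, count_cache_as_used=True):
    """Fraction of MemTotal in use.

    With count_cache_as_used=False, Buffers and Cached are treated as
    free memory (an assumption; pick whichever matches your monitoring).
    """
    total = fields["MemTotal"]
    free = fields["MemFree"]
    if not count_cache_as_used:
        free += fields.get("Buffers", 0) + fields.get("Cached", 0)
    return (total - free) / total


def above_watermark(fields, watermark=0.70, count_cache_as_used=True):
    """True if the used fraction is at or above the given watermark."""
    return used_fraction(fields, count_cache_as_used) >= watermark


def sample(path="/proc/meminfo"):
    """Read the file once and return the parsed fields."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling used_fraction(sample()) once a second while the producers run would give a timeline to line up against the points where throttling toggles and the clients get disconnected.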
I understand that this is pretty excessive in terms of stress.
However, without going into details, it is very important that I
demonstrate that RabbitMQ degrades gracefully under high load.
On a more positive note, I'm seeing RabbitMQ outperform my current JMS
provider by a factor of ~15x on Solaris!
Any help would be appreciated.
On Mon, Mar 23, 2009 at 2:31 PM, Chris Pettitt wrote:
Thank you for the pointer. I should have looked more closely at the
Cursory testing reveals that RabbitMQ is able to use flow control
without crashing with a high watermark of 0.7, while 0.75 or higher
results in the Erlang node crashing (before memsup detects the low
memory condition?). I'll follow up if I learn anything more about
memsup's behavior on Solaris.
On Mon, Mar 23, 2009 at 1:14 PM, Matthias Radestock wrote:
Chris Pettitt wrote:
I found an article that talks about flow control in RabbitMQ 1.5.0.
It doesn't seem to work in our 1.5.3 setup. When I start rabbit
from erl, I see that the memsup app is not started. I checked the
rabbitmq-server configuration and see "-os_mon start_memsup false".
I've tried setting start_memsup to true and I've also tried starting
memsup manually before calling rabbit:start(). Neither seems to cause
flow control information to be logged and both still result in the
erlang node crashing.
I very strongly suspect user error, but I'd appreciate some guidance
on how to enable this feature.
- you are supposed to be
using "-rabbit memory_alarms true". Enabling memsup the way you did should
be ok too though, as long as the server isn't low on memory to start with and
you wait at least a minute before stressing it. You may need to tweak the
threshold. Also, we don't know whether memsup on Solaris is producing the
right information, which is why rabbit leaves it turned off by default on
that platform. So if you can do some testing/investigation that would be