Hi,

Looking at the report, I see the following:
* there's one channel with 1524 unacknowledged messages:
<'rabbit at server02'.3.374.49> <'rabbit at server02'.3.366.49> 1 guest / false false 1 1524 0 0 0 0 false

These are messages that were sent to a consumer but not ack'd (so
they can't be delivered to other consumers);

* the broker currently has 1524 messages sitting on each of two queues:
<'rabbit at server02'.3.241.0> number_3 true false [] 0 1524 1524 1 ...
<'rabbit at server02'.3.244.0> number_6 true false [] 1524 0 1524 0

Both queues have 1524 messages sitting on them, but the kinds of
messages are different. The messages on number_6 are all ready,
which means that once a consumer connects to the queue, they will
all be delivered (but no consumer is connected to that queue). The
messages on number_3, on the other hand, are unacknowledged; so,
they were delivered to the sole consumer, but it didn't acknowledge
them.
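To make that reading of the two queue lines concrete, here is a minimal Python sketch that pulls the counters out of such a line. The column order after the pid (name, durable, auto_delete, arguments, messages_ready, messages_unacknowledged, messages, consumers) is an assumption inferred from the interpretation above, and parse_queue_line is a hypothetical helper, not part of rabbitmqctl.

```python
import re

# Assumed column order after the pid (inferred from the explanation above):
# name, durable, auto_delete, arguments, messages_ready,
# messages_unacknowledged, messages, consumers.
QUEUE_LINE = re.compile(
    r"^<[^>]*>\s+(?P<name>\S+)\s+(?P<durable>\S+)\s+(?P<auto_delete>\S+)\s+"
    r"(?P<arguments>\S+)\s+(?P<ready>\d+)\s+(?P<unacked>\d+)\s+"
    r"(?P<total>\d+)\s+(?P<consumers>\d+)"
)


def parse_queue_line(line):
    """Return the fields of one queue line as a dict (hypothetical helper)."""
    match = QUEUE_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised queue line: %r" % line)
    fields = match.groupdict()
    # Convert the four counters to integers; the rest stay as strings.
    for key in ("ready", "unacked", "total", "consumers"):
        fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    lines = [
        "<'rabbit at server02'.3.241.0> number_3 true false [] 0 1524 1524 1 ...",
        "<'rabbit at server02'.3.244.0> number_6 true false [] 1524 0 1524 0",
    ]
    for line in lines:
        q = parse_queue_line(line)
        print(q["name"], "ready:", q["ready"], "unacked:", q["unacked"],
              "consumers:", q["consumers"])
```

Anything after the consumer count (such as the elided "..." on the number_3 line) is ignored by the pattern.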

This is consistent with what you describe, since a broker restart
will mark the unacknowledged messages as ready again, and deliver
them to a consumer.
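The ready/unacknowledged bookkeeping described above can be sketched as a toy model. This is only an illustration, not how the broker is implemented: QueueModel and its methods are hypothetical names, and it assumes the messages survive a restart (persistent messages on a durable queue).

```python
class QueueModel:
    """Toy model of a queue's ready and unacknowledged counters."""

    def __init__(self, ready=0):
        self.ready = ready
        self.unacked = 0

    def deliver(self, n):
        """Hand up to n ready messages to a consumer; they become unacked."""
        n = min(n, self.ready)
        self.ready -= n
        self.unacked += n
        return n

    def ack(self, n):
        """Consumer acknowledges up to n messages; they leave the queue."""
        n = min(n, self.unacked)
        self.unacked -= n
        return n

    def restart(self):
        """Broker restart (or consumer loss): unacked messages become ready.

        Assumes the messages are persistent and the queue is durable.
        """
        self.ready += self.unacked
        self.unacked = 0


if __name__ == "__main__":
    number_3 = QueueModel(ready=1524)
    number_3.deliver(1524)          # consumer receives everything, acks nothing
    print(number_3.ready, number_3.unacked)
    number_3.restart()              # messages are ready for redelivery again
    print(number_3.ready, number_3.unacked)
```

In this model, the state reported for number_3 corresponds to the point after deliver() and before any ack() or restart().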

This means that the consumer for number_3 has received messages, but
hasn't acknowledged them. This can only happen if you set ack_mode in
the shovel configuration:
* if ack_mode is on_publish, the shovel is having trouble forwarding
consumed messages to their destination;
* if ack_mode is on_confirm, the destination is not confirming the
messages the shovel sent it.

The former could be caused by a network issue (the logs list a
connection failing with a timeout), while the latter
shouldn't be possible (the RabbitMQ broker guarantees it will eventually
confirm all received messages).
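As a rough sketch of the reasoning above, the following Python function models when the shovel would acknowledge a consumed message under each ack_mode. It is a simplification for illustration only: shovel_acks and likely_cause are hypothetical names, and the no_ack branch assumes that mode consumes without acknowledgements, so nothing would stay unacknowledged.

```python
def shovel_acks(ack_mode, published, confirmed):
    """Return True if the source message would be acked in this toy model.

    published: the shovel managed to publish the message to the destination.
    confirmed: the destination broker confirmed that publish.
    """
    if ack_mode == "no_ack":
        # Assumption: no acknowledgements are used, so nothing stays unacked.
        return True
    if ack_mode == "on_publish":
        return published
    if ack_mode == "on_confirm":
        return published and confirmed
    raise ValueError("unknown ack_mode: %r" % ack_mode)


def likely_cause(ack_mode):
    """Map an ack_mode to the explanation given above for stuck messages."""
    causes = {
        "on_publish": "shovel cannot forward messages to the destination",
        "on_confirm": "destination is not confirming the shovel's messages",
    }
    return causes.get(ack_mode, "messages should not remain unacknowledged")


if __name__ == "__main__":
    for mode in ("on_publish", "on_confirm"):
        stuck = not shovel_acks(mode, published=False, confirmed=False)
        print(mode, "stuck:", stuck, "->", likely_cause(mode))
```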

So, to fix this, you could try:
* playing around with the ack_mode setting;
* upgrading (the other poster mentioned this fixed his problem).

What versions of RabbitMQ are you running (both on the source and
destination brokers)? Is there any chance you could send us your shovel
config? (or at least describe what it does)

Cheers,
Alex

On Wed, Nov 02, 2011 at 06:31:34AM -0700, Jelle Smet wrote:
Hi Alex,

The requested output is attached. I have replaced the server and
queue names with fictional ones.

I found the following non-INFO report in the rabbit at server02.log file:

... snip ...
=WARNING REPORT==== 2-Nov-2011::13:56:10 ===
exception on TCP connection <0.24639.172> from 10.96.72.91:58172
connection_closed_abruptly
... snip ...

Issuing "rabbitmqctl stop" gave the following output and then hung indefinitely:
Stopping and halting node 'rabbit at server02' ...

When I run strace -p on the running rabbitmq processes, I get the following
output for each one of them:
#strace -p 26367
Process 26367 attached - interrupt to quit
select(0, NULL, NULL, NULL, NULL

The only thing left at this stage is to kill all the processes and restart server02.
