Hi All,

I'm trying to set up federation across two RabbitMQ clusters (2.7.1, R15B).
Everything seems to start up fine, but then I get a constant stream of
disconnects. In some cases the old connection isn't cleared and new ones
just keep being added (I killed it after it reached several hundred).
It seems to start once I bind a queue to the exchange (dc001-exchange-d)
on the DOWNSTREAM server.


******************** CONFIG on UPSTREAM ********************
==> enabled_plugins <==
[rabbitmq_management,rabbitmq_management_agent,rabbitmq_management_visualiser,rabbitmq_federation].

==> rabbitmq.config <==
[
 {rabbit, [{vm_memory_high_watermark, 0.6},
           {collect_statistics_interval, 5000},
           {hipe_compile, true}
          ]
 },
 {mnesia, [{dc_dump_limit, 40},
           {dump_log_write_threshold, 50000},
           {send_compressed, 9},
           {snmp, true}
          ]
 },
 {rabbitmq_management, [ {http_log_dir, "/data/rabbitmq/dc001/rabbit-mgmt"} ] },
 {rabbitmq_management_agent, [ {force_fine_statistics, true} ] },

 {rabbitmq_federation, [
   {exchanges, [
     [{exchange, "dc001-exchange-d"},
      {virtual_host, "/"}, {type, "fanout"}, {durable, true},
      {auto_delete, false}, {internal, false}, {upstream_set, "dc001-servers"} ]
   ]},
   {upstream_sets, [
     {"dc001-servers", [
       [{connection, "host-001"},
        {exchange, "dc001-exchange-u"}, {max_hops, 1}]
     ]}
   ]},
   {connections, [
     {"host-001", [{host, "host-001.domain.com"}, {protocol, "amqp"}, {port, 5672},
                   {mechanism, default}, {prefetch_count, 1000}, {virtual_host, "/"},
                   {username, "guest"}, {password, "guest"}, {reconnect_delay, 10},
                   {heartbeat, 5}]}
   ]},
   {local_username, "guest"}
 ]}
].

==> rabbitmq-env.conf <==
NODENAME=dc001
BASE=/data/rabbitmq/dc001
MNESIA_BASE=/data/rabbitmq/dc001/mnesia
LOG_BASE=/data/rabbitmq/dc001/log
SERVER_START_ARGS="+K true -smp enable"


******************** Log on UPSTREAM ********************
=INFO REPORT==== 20-Jan-2012::16:15:27 ===
accepted TCP connection on 0.0.0.0:5672 from 192.168.0.48:49407

=INFO REPORT==== 20-Jan-2012::16:15:27 ===
starting TCP connection <0.32601.22> from 192.168.0.48:49407

=ERROR REPORT==== 20-Jan-2012::16:15:27 ===
connection <0.32601.22>, channel 2 - error:
{amqp_error,not_found,
            "no exchange 'federation: dc001-exchange-u -> dc001 at host-007.domain.com:dc001-exchange-d A' in vhost '/'",
            'exchange.delete'}

=ERROR REPORT==== 20-Jan-2012::16:30:42 ===
connection <0.32601.22>, channel 1 - error:
{amqp_error,not_found,"no exchange 'dc001-exchange-u' in vhost '/'",
'exchange.bind'}

=INFO REPORT==== 20-Jan-2012::16:30:42 ===
closing TCP connection <0.32601.22> from 192.168.0.48:49407

=INFO REPORT==== 20-Jan-2012::16:30:42 ===
accepted TCP connection on 0.0.0.0:5672 from 192.168.0.48:55900

=INFO REPORT==== 20-Jan-2012::16:30:42 ===
starting TCP connection <0.7666.23> from 192.168.0.48:55900

=ERROR REPORT==== 20-Jan-2012::16:30:42 ===
connection <0.7666.23>, channel 2 - error:
{amqp_error,not_found,
            "no exchange 'federation: dc001-exchange-u -> dc001 at host-007.domain.com:dc001-exchange-d A' in vhost '/'",
            'exchange.delete'}

=ERROR REPORT==== 20-Jan-2012::16:30:42 ===
connection <0.7666.23>, channel 1 - error:
{amqp_error,not_found,"no exchange 'dc001-exchange-u' in vhost '/'",
'exchange.bind'}





******************** Log on DOWNSTREAM ********************
=INFO REPORT==== 20-Jan-2012::16:15:27 ===
Federation exchange 'dc001-exchange-d' in vhost '/' connected to
host-001.domain.com:5672:/:dc001-exchange-u

=ERROR REPORT==== 20-Jan-2012::16:30:42 ===
** Generic server <0.4323.0> terminating
** Last message in was {'$gen_cast',
                        {enqueue,1,
                         {add_binding,
                          {binding,
                           {resource,<<"/">>,exchange,<<"dc001-exchange-d">>},
                           <<>>,
                           {resource,<<"/">>,queue,
                            <<"dc.omniture:mongodb:all">>},
                           []}}}}
** When Server state == {state,
                         {upstream,
                          {amqp_params_network,<<"guest">>,<<"guest">>,
                           <<"/">>,"host-001.domain.com",5672,0,0,
                           5,infinity,none,
                           [#Fun<amqp_auth_mechanisms.plain.3>,
                            #Fun<amqp_auth_mechanisms.amqplain.3>],
                           [],[]},
                          <<"dc001-exchange-u">>,1000,1,10,none,none,
                          "host-001"},
                         <0.4340.0>,<0.4350.0>,
                         <<"federation: dc001-exchange-u -> dc001 at host-007.domain.com:dc001-exchange-d">>,
                         <<"federation: dc001-exchange-u -> dc001 at host-007.domain.com:dc001-exchange-d B">>,
                         {0,nil},
                         1,
                         {dict,0,16,16,8,80,48,
                          {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
                          {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
                         <0.4327.0>,<0.4335.0>,
                         {resource,<<"/">>,exchange,<<"dc001-exchange-d">>},
                         {0,nil}}
** Reason for termination ==
** {{shutdown,
     {server_initiated_close,404,
      <<"NOT_FOUND - no exchange 'dc001-exchange-u' in vhost '/'">>}},
    {gen_server,call,
     [<0.4350.0>,
      {call,
       {'exchange.bind',0,
        <<"federation: dc001-exchange-u -> dc001 at host-007.domain.com:dc001-exchange-d B">>,
        <<"dc001-exchange-u">>,<<>>,false,[]},
       none,<0.4323.0>},
      infinity]}}

=ERROR REPORT==== 20-Jan-2012::16:30:42 ===
** Generic server <0.12029.0> terminating
** Last message in was {'$gen_cast',maybe_go}
** When Server state == {not_started,
                         {{upstream,
                           {amqp_params_network,<<"guest">>,
                            <<"guest">>,<<"/">>,
                            "host-001.domain.com",5672,0,0,5,
                            infinity,none,
                            [#Fun<amqp_auth_mechanisms.plain.3>,
                             #Fun<amqp_auth_mechanisms.amqplain.3>],
                            [],[]},
                           <<"dc001-exchange-u">>,1000,1,10,none,none,
                           "host-001"},
                          {resource,<<"/">>,exchange,
                           <<"dc001-exchange-d">>}}}
** Reason for termination ==
** {{shutdown,
     {server_initiated_close,404,
      <<"NOT_FOUND - no exchange 'dc001-exchange-u' in vhost '/'">>}},
    {gen_server,call,
     [<0.12056.0>,
      {call,
       {'exchange.bind',0,
        <<"federation: dc001-exchange-u -> dc001 at host-007.domain.com:dc001-exchange-d A">>,
        <<"dc001-exchange-u">>,<<>>,false,[]},
       none,<0.12029.0>},
      infinity]}}


  • Simon MacMullen at Jan 23, 2012 at 10:56 am

    On 20/01/12 22:14, DawgTool wrote:
    > <<"NOT_FOUND - no exchange 'dc001-exchange-u' in vhost '/'">>}},

    Does the upstream exchange exist? The federation plugin will not create
    it for you.
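
Since the plugin will not create the upstream exchange, it has to be declared on the upstream broker by some other means before the downstream link tries to bind to it. A minimal sketch of doing that from Python follows. It assumes the pika 1.x client, which is not mentioned anywhere in this thread, and it assumes the upstream exchange should be a durable fanout; the helper names are hypothetical.

```python
# Sketch only. Assumptions not taken from the thread: the pika 1.x client,
# a durable fanout exchange, and the helper names used here.


def declare_upstream_exchange(channel, name="dc001-exchange-u"):
    """Declare the upstream exchange so a federation link can bind to it."""
    channel.exchange_declare(exchange=name, exchange_type="fanout", durable=True)
    return name


def main():
    # Imported here so the helper above can be exercised without pika installed.
    import pika

    params = pika.ConnectionParameters(
        host="host-001.domain.com",  # upstream host from the posted config
        port=5672,
        virtual_host="/",
        credentials=pika.PlainCredentials("guest", "guest"),
    )
    connection = pika.BlockingConnection(params)
    try:
        declare_upstream_exchange(connection.channel())
    finally:
        connection.close()
```

Calling `main()` against a reachable upstream broker declares the exchange once; declaring an exchange that already exists with the same properties is a no-op in AMQP 0-9-1, so it is safe to repeat.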

    Cheers, Simon

    --
    Simon MacMullen
    RabbitMQ, VMware
  • DawgTool at Jan 23, 2012 at 7:46 pm
    Hi Simon,

    OK, I created the exchange before starting the downstream server.
    The federation exchange is created, but it is not bound to
    dc001-exchange-u.
    Am I missing something?

  • Simon MacMullen at Jan 24, 2012 at 11:43 am

    On 23/01/12 19:46, DawgTool wrote:
    > ok, created the exchange before starting the downstream server.
    > Federation exchange is created, but it is not bound to the
    > dc001-exchange-u.
    > Am I missing something?

    What do you mean by not bound? The exchanges should be on different
    machines so they can't be bound in the usual sense of the word. What
    were you expecting to see, and what did you see instead?

    Cheers, Simon

  • DawgTool at Jan 24, 2012 at 5:19 pm
    Hi Simon,

    It appears there was a delay before dc001-exchange-u was bound to
    federation: dc001........

    I'm not sure exactly how long it took before I could send messages to that
    exchange (it complained about no route), but eventually it started working.
    I'll run some additional tests and respond back to this thread.

    Thank you.

  • DawgTool at Jan 24, 2012 at 6:37 pm
    Hi Simon,

    On a side note, the performance I am seeing between the clusters is very
    unexpected.
    Publishing to the federated exchange seems to consume about 4 times the
    CPU of a typical exchange.
    The downstream cluster also sees about the same level of CPU usage.

    Setup:
    Cluster(001) dc001-exchange-u (fanout)
    Cluster(002) dc001-exchange-d (fanout) -> dc001-queue-d

    Test 1:
    Messages: ~3K/sec
    Network: 8.5KB/sec
    Cluster(001) CPUs: 45% consumed
    Cluster(002) CPUs: 47% consumed

    Test 2:
    Messages: ~11K/sec
    Network: 38KB/sec
    Cluster(001) CPUs: 97% consumed
    Cluster(002) CPUs: 97% consumed
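
To make the two tests easier to compare, the reported figures can be turned into simple ratios. The sketch below uses only the numbers quoted above, with the approximate message rates taken as exact and the Cluster(001) CPU figure used for both tests. The function and key names are hypothetical, and the result is plain arithmetic, not a benchmark.

```python
# Figures as reported in Test 1 and Test 2; "~3K" and "~11K" are treated as
# exact, and cpu_pct is the Cluster(001) figure. Illustrative arithmetic only.
tests = {
    "Test 1": {"msgs_per_sec": 3_000, "network_kb_per_sec": 8.5, "cpu_pct": 45},
    "Test 2": {"msgs_per_sec": 11_000, "network_kb_per_sec": 38.0, "cpu_pct": 97},
}


def msgs_per_cpu_point(test):
    """Messages per second handled per percentage point of CPU."""
    return test["msgs_per_sec"] / test["cpu_pct"]


def scale(metric):
    """How much a metric grew from Test 1 to Test 2."""
    return tests["Test 2"][metric] / tests["Test 1"][metric]


if __name__ == "__main__":
    for name, test in tests.items():
        print(f"{name}: {msgs_per_cpu_point(test):.1f} msgs/sec per CPU point")
    for metric in ("msgs_per_sec", "network_kb_per_sec", "cpu_pct"):
        print(f"{metric}: x{scale(metric):.2f} from Test 1 to Test 2")
```

Because Test 2 runs at 97% CPU, its throughput may already be limited by the CPU, so the per-point figure for that test should be read with caution.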

    What I was hoping to see was that Cluster(002) would use about half the
    CPU of Cluster(001), or something like the scenario of publishing to a
    cluster on host 1 with the queue on host 2:
    host 1 takes on the message handling from the publisher and hands it
    off to host 2, which writes to the queue (this spreads the load across the
    cluster, but adds a little overhead).

    Please let me know if I have missed something, or if you can think of
    any tricks to reduce the CPU usage.
    Thanks


  • Simon MacMullen at Jan 25, 2012 at 4:11 pm

    On 24/01/12 18:37, DawgTool wrote:
    > What I was hoping to see that Cluster(002) would be about half the cpu
    > consumption as Cluster(001).
    > Or something like the scenario of publishing to a cluster on host 1 with
    > the queue on host 2.
    > Host 1 would take on the message handling from the publisher and hand it
    > off to host 2 which writes to the queue (spreads the load across the
    > cluster, but adds a little overhead).

    Remember that when you publish across federation there is more than one
    publish involved! The federation plugin:

    * Binds a queue to the upstream exchange
    * Consumes messages from this queue
    * Republishes them downstream

    and then presumably something consumes them downstream too.

    So there is more work going on. I'm surprised it's 4x though, but the
    federation plugin has not received any performance testing yet.
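
The three steps listed above can be sketched as a toy model that counts broker operations per message. This is only an in-memory illustration with hypothetical names; it does not talk to RabbitMQ and says nothing about the real CPU cost of each operation.

```python
from collections import Counter, deque


def run_direct(messages):
    """One broker: publish to an exchange, route to a queue, consume."""
    ops = Counter()
    queue = deque()
    for msg in messages:
        ops["publish"] += 1
        queue.append(msg)
    while queue:
        queue.popleft()
        ops["consume"] += 1
    return ops


def run_federated(messages):
    """Two brokers joined by a federation link, following the steps above."""
    upstream_ops, downstream_ops = Counter(), Counter()
    federation_queue = deque()  # queue the link binds to the upstream exchange
    downstream_queue = deque()  # queue bound to the downstream exchange

    for msg in messages:  # the client publishes upstream
        upstream_ops["publish"] += 1
        federation_queue.append(msg)
    while federation_queue:  # the link consumes from the upstream queue...
        msg = federation_queue.popleft()
        upstream_ops["consume"] += 1
        downstream_ops["publish"] += 1  # ...and republishes downstream
        downstream_queue.append(msg)
    while downstream_queue:  # the application consumes downstream
        downstream_queue.popleft()
        downstream_ops["consume"] += 1
    return upstream_ops, downstream_ops


if __name__ == "__main__":
    msgs = list(range(1000))
    print("direct:    ", dict(run_direct(msgs)))
    up, down = run_federated(msgs)
    print("upstream:  ", dict(up))
    print("downstream:", dict(down))
```

In this model each broker performs one publish and one consume per message, so neither side does less work than a single standalone broker would.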

    I'm also surprised both nodes in both clusters were equally busy - or
    does your test load balance across them?

    > Please let me know if I have missed something, or if you can think of
    > any tricks to reduce the cpu usage.

    Have you turned on hipe_compile? That should help at least.

    Cheers, Simon

  • DawgTool at Jan 25, 2012 at 4:56 pm
    Hi Simon,

    Thanks for the details. I am no expert (greenish would be the word) in
    the internals of RabbitMQ.
    I am running with HiPE:
    {rabbit, [{vm_memory_high_watermark, 0.6},
              {collect_statistics_interval, 5000},
              {hipe_compile, true}
             ]
    },
    {mnesia, [{dc_dump_limit, 40},
              {dump_log_write_threshold, 50000},
              {send_compressed, 9},
              {snmp, true}
             ]
    },

    SERVER_START_ARGS="+K true -smp enable"

    I was also surprised by the performance numbers compared to other tests.
    I downgraded both clusters to single nodes and still see the same
    percentages.
    I'm tempted to start testing the shovel plugin (I need all messages to
    travel between sites), but I really like the setup of federation and the
    fail-over of the exchange (I haven't tested that yet, but assume it
    works. LOL).

    Thanks


  • Simon MacMullen at Jan 25, 2012 at 6:16 pm

    On 25/01/12 16:56, DawgTool wrote:
    > I was also surprised on the performance numbers compared to other tests.
    > I downgraded both clusters to single nodes and still have the same
    > percentages.

    Hmm. That is odd. I don't think there's been any performance testing of
    federation + clustering yet; I'll have to do some at some point.

    Cheers, Simon

