According to RabbitMQ in Action, to achieve scalability we need to use a
RabbitMQ cluster. The book includes the following statements:


Why doesn't RabbitMQ replicate queue contents and state across all nodes by
default?
There are two reasons:
1. Storage space: If every cluster node had a full copy of every queue,
   adding nodes wouldn't give you more storage capacity. For example, if one
   node could store 1 GB of messages, adding two more nodes would just give
   you two more copies of the same 1 GB of messages.
2. Performance: Publishing messages would require replicating those messages
   to every cluster node. For durable messages, that would require triggering
   disk activity on all nodes for every message. Your network and disk load
   would increase every time you added a node, keeping the performance of the
   cluster the same (or possibly worse).
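
As a rough illustration of the two points above (not taken from the book): the
sketch below assumes every node has the same capacity and that queues are
spread evenly over the cluster. The function and parameter names are made up
for this example.

```python
def cluster_capacity_gb(nodes: int, per_node_gb: float, replicas: int) -> float:
    """Unique message storage when every queue is kept on `replicas` nodes.

    Simplifying assumptions: identical nodes, queues spread evenly.
    """
    if not 1 <= replicas <= nodes:
        raise ValueError("replicas must be between 1 and the number of nodes")
    return nodes * per_node_gb / replicas


def copies_written_per_publish(replicas: int) -> int:
    """Each published message has to be stored once per replica."""
    return replicas


if __name__ == "__main__":
    # Three nodes of 1 GB each, as in the book's example.
    for replicas in (1, 2, 3):
        print(
            f"replicas={replicas}: "
            f"capacity={cluster_capacity_gb(3, 1.0, replicas)} GB, "
            f"copies per publish={copies_written_per_publish(replicas)}"
        )
```

With three replicas on three nodes the usable capacity stays at 1 GB, which is
the book's point; with fewer replicas than nodes, adding nodes does add
capacity.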


However, for achieving HA, the suggested approach is the built-in
active-active option for queues: mirrored queues. And it's recommended to
use:
queue_args = {"x-ha-policy" : "all" }


Does the cluster approach for scalability above conflict with the mirrored
queues approach for HA (or am I misunderstanding something)? How can I
achieve both at the same time?




Best regards,
Hudson

  • Michael Klishin at Jun 21, 2013 at 9:35 am

    Hudson Jiang:


    Does the cluster approach for scalability above conflict with the mirrored queues approach for HA (or am I misunderstanding something)? How can I achieve both at the same time?

    "Scalability" means different things to different people, but let's assume it means
    "increasing overall system throughput by adding more cluster nodes".


    Replicating data to more nodes for availability indeed may be at odds
    with the max total throughput requirement, and more so as cluster size grows.


    Identify where you absolutely need mirroring and where
    throughput is more important. Not all queues are created equal.


    MK
  • Tim Watson at Jun 21, 2013 at 10:17 am

    On 21 Jun 2013, at 09:39, Hudson Jiang wrote:
    However, for achieving HA, the suggested approach is the built-in active-active option for queues: mirrored queues. And it's recommended to use:
    queue_args = {"x-ha-policy" : "all" }

    Where does it say the "all" policy is the recommended one? Look at it like this: mirrored queues (i.e., HA in RabbitMQ) buy you resilience to node failure, which might increase availability. But they don't buy you anything in terms of "distributing load", since HA actually adds load (as a result of replication) to the system.

    Does the cluster approach for scalability above conflict with the mirrored queues approach for HA (or am I misunderstanding something)? How can I achieve both at the same time?

    Clustering provides a consistent view of all the (clustered) brokers from each node, which means you can connect to any node and interact with objects (i.e., publish to exchanges or consume from queues) on any node in the cluster.


    Mirroring / HA provides replication of contents for 1 queue, that you've chosen to mirror, and you can maintain the queue's state across however many replicas you see fit by setting the appropriate policy. If you mirrored all queues across all nodes, you'd find that hurts performance. If you mirror no queues, you might lose data if one of your nodes crashes.


    Now, to Michael's point: what do *you* want to do here? Scalability is an almost meaningless term used in isolation. Are you trying to achieve:


    * higher levels of throughput than a single node can manage?
    * higher levels of availability/resilience just in case one or more nodes crash?
    * ability to add more capacity (e.g., # clients, volume of work/throughput, etc) to a live system?
    * etc??


    Note that the "A" in HA stands for "Availability" not "throughput" or "performance", though we think that acceptable levels of throughput and performance are achievable with mirror queues, considering the compromises necessary to guarantee consistency and avoid data loss.


    Other options include load balancing in front of the cluster, or using federation or the shovel plugin instead of clustering to distribute messages between nodes (or data centres).


    Cheers,
    Tim
  • Matthias Radestock at Jun 21, 2013 at 10:36 am

    On 21/06/13 11:17, Tim Watson wrote:
    Mirroring / HA provides replication of contents for 1 queue, that
    you've chosen to mirror, and *you can maintain the queue's state
    across however many replicas you see fit*

    That is the crucial bit.


    If you want HA *and* increased scalability then create a cluster with >2
    nodes and configure your HA policy to mirror queues across a subset of them.


    The 'exactly' and 'nodes' policies allow you to do that. See
    http://www.rabbitmq.com/ha.html#genesis.


    NB: RabbitMQ in Action predates the introduction of server-configured
    policies; the x-ha-policy queue parameter does nothing in modern
    rabbits. See the above docs for the new way of configuring HA.
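
    A minimal sketch of what such policies could look like, assuming the
    "ha-mode" / "ha-params" policy keys and the rabbitmqctl set_policy
    command described in the HA docs for RabbitMQ 3.x. The policy name
    "ha-two", the queue-name pattern "^ha\." and the helper functions are
    hypothetical and only serve as illustration:

```python
import json
import shlex


def ha_policy_definition(mode, params=None):
    """Build a mirrored-queue policy definition.

    mode "all":     mirror to every node (no params)
    mode "exactly": params is the number of replicas
    mode "nodes":   params is a list of node names
    """
    if mode == "all":
        return {"ha-mode": "all"}
    if mode == "exactly":
        if not isinstance(params, int) or params < 1:
            raise ValueError("'exactly' needs a positive replica count")
        return {"ha-mode": "exactly", "ha-params": params}
    if mode == "nodes":
        if not params:
            raise ValueError("'nodes' needs a list of node names")
        return {"ha-mode": "nodes", "ha-params": list(params)}
    raise ValueError(f"unknown ha-mode: {mode!r}")


def set_policy_command(name, pattern, definition):
    """Render a shell-quoted `rabbitmqctl set_policy` command line."""
    parts = ["rabbitmqctl", "set_policy", name, pattern, json.dumps(definition)]
    return " ".join(shlex.quote(part) for part in parts)


if __name__ == "__main__":
    # Mirror queues whose names start with "ha." to exactly two nodes,
    # leaving the rest of the cluster free for other queues.
    print(set_policy_command("ha-two", r"^ha\.", ha_policy_definition("exactly", 2)))
```

    With a policy like this on a cluster of three or more nodes, only the
    queues matching the pattern are mirrored, and each of them lives on two
    nodes rather than on all of them.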


    Regards,


    Matthias.
  • Simon MacMullen at Jun 21, 2013 at 10:41 am

    On 21/06/13 11:36, Matthias Radestock wrote:
    NB: RabbitMQ in Action predates the introduction of server-configured
    policies; the x-ha-policy queue parameter does nothing in modern
    rabbits. See the above docs for the new way of configuring HA.

    Which is probably why it recommends "all" mode. Prior to 3.0.0,
    "exactly" mode didn't exist, and "nodes" mode meant you baked knowledge
    of your cluster into your queues, which tended to be very awkward.


    Cheers, Simon


    --
    Simon MacMullen
    RabbitMQ, Pivotal
  • Simon MacMullen at Jun 24, 2013 at 2:20 pm
    As Matthias suggested, the "exactly" or "nodes" policies are likely to
    be what you're looking for.


    Cheers, Simon

    On 24/06/13 06:38, Hudson Jiang wrote:
    Hi Simon,

    I really appreciate your reply! However, I still don't understand how I
    should configure my RabbitMQ brokers to achieve both scalability and HA.
    Could you elaborate?


    Best regards,
    Hudson



