Hello,
I am trying to implement a work queue in RabbitMQ. I have several machines
that will act as worker pools for consuming my queues. I would like each
job to be processed only once. Some jobs have environmental requirements,
i.e. they must be executed on a machine with an SSD. Not all worker pools will
meet these requirements. A first approach would be to have a queue for
every permutation of requirements: "No Requirements", "Requires SSD",
"Requires Certificate", etc., and have each worker pool subscribe to all
queues which it can handle.

A majority of the jobs will have no requirements, so many worker pools will
be underutilized. To fight this, I am hoping to distribute jobs in a
round-robin fashion to all queues that can handle them. For example, a
no-requirements job can be handled by any worker pool, and therefore should
be distributed equally to all of them. That is, jobs with no requirements
will be evenly distributed to "No Requirements", "Requires SSD", "Requires
Certificate", etc.

I tried this with topic exchanges, but of course they will distribute the
job to ALL matching queues, rather than a single arbitrary queue.

Some terrible alternative approaches I have so far:
1. Have producers submit to a single queue, and have a home-grown "load
balancer" consumer on the other end which distributes jobs to appropriate
queues.
2. When a worker pool spins up, rather than have each worker consume a
dedicated queue, have at least some of them hop between queues in an attempt
to keep them busy.
3. Hacks with timestamps/modulo arithmetic in the routing key to trick the
exchange into being round robin.

Being new to RabbitMQ, my intuition tells me #1 will be a big performance
problem, #2 will be a performance and maintenance problem, and #3 just won't
end up working. I suspect I'm missing something big here :)

Has anyone successfully load-balanced their workers in a situation like
this?

Any advice much appreciated.
Adam

  • Alexis Richardson at Oct 11, 2011 at 6:05 am
    Adam

    Would this solve your problem?

    http://twitter.com/#!/hylomorphism/status/117667626880221184

    Apologies if I misunderstood.

    a

  • Matthias Radestock at Oct 11, 2011 at 6:19 am
    Adam,
    On 11/10/11 02:48, Adam Rabung wrote:
    > Hello,
    > I am trying to implement a work queue in RabbitMQ. I have several
    > machines that will act as worker pools for consuming my queues. I would
    > like each job to be processed only once. Some jobs
    > have environmental requirements, i.e. must be executed on a machine with
    > an SSD. Not all worker pools will meet these requirements. A first
    > approach would be to have a queue for every permutation of requirements:
    > "No Requirements", "Requires SSD", "Requires Certificate", etc and have
    > each worker pool subscribe to all queues which it can handle.

    That is a sound approach.

    > A majority of the jobs will have no requirements, so many worker pools
    > will be underutilized.

    As you say, the worker pools would subscribe to all the queues with
    combinations of requirements they can handle. That includes the "No
    Requirements" queue. The subscriptions can be active *simultaneously*,
    so work in the "No Requirements" queue would be round-robin routed to
    all workers, jobs in the "Requires SSD" queue will be routed to all
    workers that have SSDs, etc.
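
To make the queue layout concrete, here is a minimal Python sketch of how a worker could work out which queues to subscribe to from its own capabilities. The "jobs." naming scheme and the function names queue_name and queues_for_worker are assumptions made up for this illustration; nothing in the thread or in RabbitMQ prescribes them.

```python
from itertools import combinations
from typing import Iterable, List


def queue_name(requirements: Iterable[str]) -> str:
    """Return the (hypothetical) queue name for one combination of requirements."""
    reqs = sorted(set(requirements))  # sort so that order never matters
    return "jobs." + ".".join(reqs) if reqs else "jobs.none"


def queues_for_worker(capabilities: Iterable[str]) -> List[str]:
    """List every queue whose requirements are a subset of the capabilities."""
    caps = sorted(set(capabilities))
    names = []
    # Enumerate all subsets of the capabilities, smallest first.
    # The empty subset is the "No Requirements" queue.
    for size in range(len(caps) + 1):
        for combo in combinations(caps, size):
            names.append(queue_name(combo))
    return names


if __name__ == "__main__":
    print(queues_for_worker(["ssd"]))
    print(queues_for_worker(["ssd", "certificate"]))
```

A producer would then publish each job to queue_name(job_requirements), for example through the default exchange with the queue name as routing key. Note that the number of queues a worker subscribes to doubles with every capability it has, so this only stays manageable for a small set of requirements.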

    If each worker uses a single channel with a basic.qos prefetch setting
    of 1 and subscribes to all the relevant queues on that channel, it will
    be fed work items one at a time from the subset of all these queues that
    have messages in them. There's some logic at the server that ensures
    this is reasonably fair, though you will most likely find that queues
    which can be handled by many workers drain faster than others.
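
A rough sketch of such a worker in Python follows, assuming the third-party pika client (1.x); the thread does not name a client library, so that choice, the function names subscribe_worker and run_worker, the durable queue declaration and the "localhost" default are all assumptions. One caveat: on RabbitMQ 3.3 and later a plain basic.qos applies to each consumer separately, so the sketch sets the global flag to get a limit shared by the whole channel, as described above.

```python
from typing import Callable, Iterable, List


def subscribe_worker(channel, queue_names: Iterable[str],
                     handle_job: Callable[[bytes], None]) -> List[str]:
    """Subscribe one channel to several queues with a shared prefetch of 1."""
    # At most one unacknowledged message on the whole channel.
    # global_qos=True is an assumption about the broker version (see lead-in).
    channel.basic_qos(prefetch_count=1, global_qos=True)

    def on_message(ch, method, properties, body):
        handle_job(body)
        # Acknowledge only after the job is done; the broker then sends the
        # next job from whichever subscribed queue has messages.
        ch.basic_ack(delivery_tag=method.delivery_tag)

    tags = []
    for name in queue_names:
        # durable=True is an assumption; it must match existing queues.
        channel.queue_declare(queue=name, durable=True)
        tags.append(channel.basic_consume(queue=name,
                                          on_message_callback=on_message))
    return tags


def run_worker(queue_names: Iterable[str],
               handle_job: Callable[[bytes], None],
               host: str = "localhost") -> None:
    """Connect to the broker and consume until interrupted."""
    import pika  # third-party client, imported here so the helper above stays testable

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    subscribe_worker(channel, queue_names, handle_job)
    try:
        channel.start_consuming()
    finally:
        connection.close()
```

A worker on an SSD machine might call run_worker(["jobs.none", "jobs.ssd"], process), where process is whatever function executes a job and the queue names are whatever the producers publish to.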

    Regards,

    Matthias.
  • Adam Rabung at Oct 11, 2011 at 2:52 pm
    That was the big thing I was missing - channels can be subscribed to more
    than one queue! Thank you so much, that was a nice breakthrough.
    Adam


Discussion Overview
group: rabbitmq-discuss
categories: rabbitmq
posted: Oct 11, '11 at 1:48a
active: Oct 11, '11 at 2:52p
posts: 4
users: 3
website: rabbitmq.com
irc: #rabbitmq