Can a message be ack'd by a different machine than the one that read
it off the queue?

I'm working on a distributed stream processing system where it would
be useful to have a downstream node do the acknowledgement.

Thanks in advance,
Nathan

Search Discussions

  • Jerry Kuch at Dec 10, 2010 at 1:11 am
    Hi, Nathan...

    I believe the delivery tag you'd use in ack-ing is only valid within the channel in which the message was received so the scheme you propose wouldn't be viable and you'd have to devise some scheme for signifying done-ness with the message's implied work beyond basic AMQP.

    BTW, Cascalog is a neat piece of work. Would be interested to know more about the system you're building if it's not confidential... :-)

    Jerry

    Sent from my iPhone (Brevity and typos are hopefully the result of 1-fingered typing rather than rudeness or illiteracy).

    On Dec 9, 2010, at 4:47 PM, "nathanmarz" wrote:

    Can a message be ack'd by a different machine than the one that read
    it off the queue?

    I'm working on a distributed stream processing system where it would
    be useful to have a downstream node do the acknowledgement.

    Thanks in advance,
    Nathan
    _______________________________________________
    rabbitmq-discuss mailing list
    rabbitmq-discuss at lists.rabbitmq.com
    https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
  • Nathanmarz at Dec 10, 2010 at 1:21 am
    Let's just say I'm working on the "Cascalog for stream-processing",
    although the semantics of the tool will be quite different of course.
    The goal is to be able to do distributed stream processing with the
    details of fault tolerance and messaging abstracted away. The rest I'm
    keeping secret for now :)

    Alternatively, is it possible to ack messages out of order from how
    you read them off the queue? Also, is it possible to set a timeout on
    acking a message, so that if a message isn't acked within X secs it's
    automatically considered failed and scheduled for redelivery?

    Thanks,
    Nathan


    On Dec 9, 5:11?pm, Jerry Kuch wrote:
    Hi, Nathan...

    I believe the delivery tag you'd use in ack-ing is only valid within the channel in which the message was received so the scheme you propose wouldn't be viable and you'd have to devise some scheme for signifying done-ness with the message's implied work beyond basic AMQP.

    BTW, Cascalog is a neat piece of work. ?Would be interested to know more about the system you're building if it's not confidential... ? :-)

    Jerry

    Sent from my iPhone (Brevity and typos are hopefully the result of 1-fingered typing rather than rudeness or illiteracy).
    On Dec 9, 2010, at 4:47 PM, "nathanmarz" wrote:

    Can a message be ack'd by a different machine than the one that read
    it off the queue?
    I'm working on a distributed stream processing system where it would
    be useful to have a downstream node do the acknowledgement.
    Thanks in advance,
    Nathan
    _______________________________________________
    rabbitmq-discuss mailing list
    rabbitmq-disc... at lists.rabbitmq.com
    https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
    _______________________________________________
    rabbitmq-discuss mailing list
    rabbitmq-disc... at lists.rabbitmq.comhttps://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
  • Jerry Kuch at Dec 10, 2010 at 10:45 pm
    Hi, Nathan...

    (cc-ing rabbitmq-discuss so that other folks who've been around Rabbit longer than I have can chime in as well).

    Thanks for the additional motivation... I think variations on this request have come up in discussion before, so let's keep some other eyes on the topic...

    So in this system I'm developing, a message from a persistent queue like RabbitMQ will trigger a computation across the cluster, creating potentially many messages and involving many machines. I don't want to ack the message until all the computations it has triggered has finished. I have a way of detecting this efficiently and reliably, and it's a different machine than the one that read the message off the queue that will know the message is ready to be acked.

    Out of curiosity what are you using as your distributed coordination mechanism over the other machines and processes involved in your cluster? I wonder if perhaps there's some information available to or through it that your Rabbit-facing processes might be able to use straightforwardly...

    If there's a failure at any point (like an exception or machine going down), the system will never detect that the computations for the message has finished. So I want to handle this by putting a time limit on how long the system has to process the message. This is where setting timeouts on acking would come in, and it would be very useful.

    Gotcha... I follow where you're going...

    What I'm going to have to do to get RabbitMQ viable as a source for this system is handle acking/timeouts on the node that reads from RabbitMQ. When the downstream node detects that the message is finished, it will have to send a message to the reader node to ack the message. The reader node will also have to cache messages in memory and have a separate thread explicitly reject messages that are in the cache for too long. Not terrible, but would be nice if this was native in RabbitMQ.

    Right. As stated your design sketch sounds totally viable although, as you say, it does put work into your application that one might prefer to have done elsewhere... :-)

    You get correctness out of the deal in that your intermediate reader process that owns the channel on which the original do-work message was de-queued will make the message available for later consumption by someone else if it dies, but of course it has to implement its own safekeeping and timeout logic for deciding when to give up on things. Via basic.reject it also gets the option of throwing the message back to be handled again later or condemning it to oblivion.

    Jerry


    On Fri, Dec 10, 2010 at 9:55 AM, Jerry Kuch <jerryk at vmware.comwrote:
    Hi, Nathan...
    On Dec 9, 2010, at 5:21 PM, nathanmarz wrote:

    Let's just say I'm working on the "Cascalog for stream-processing",
    although the semantics of the tool will be quite different of course.
    The goal is to be able to do distributed stream processing with the
    details of fault tolerance and messaging abstracted away. The rest I'm
    keeping secret for now :)
    Very interesting... I look forward to seeing it develop!
    Alternatively, is it possible to ack messages out of order from how
    you read them off the queue?
    It is indeed.

    In AMQP, acking a message tells the broker that you have received it and taken responsibility for its contents, and that the broker no longer has to take any measures to ensure its integrity or existence.

    When you take a message off a queue, say with 'basic.get', it still exists on the server, but is unavailable to other consumers until such time as you 'basic.ack' it. If you basic.ack the message, the message is removed from the server. If your channel or connection dies, the message is again made available for other consumers.

    If you pulled a smattering of messages from a queue and dispatched them to separate workers in your application, and your workers finished and sent acks in some order other than the order in which the messages were enqueued, there's no problem at all.
    Also, is it possible to set a timeout on
    acking a message, so that if a message isn't acked within X secs it's
    automatically considered failed and scheduled for redelivery?
    Such a mechanism doesn't currently exist. In practice, this usually won't be a problem since failure to ack (modulo the case of a consumer that's alive but paralyzed for some reason) is typically the result of network troubles or client death, both of which will result in the connection being closed. If the connection closes, the un-acked message on the server becomes available for other clients to get or consume.

    Are there specific aspects of your desired use case that make such a timeout-waiting-for-ack super important?

    Best regards,
    Jerry
  • Nathanmarz at Dec 10, 2010 at 11:09 pm
    There will be a master machine assigning processes to machines and
    reassigning tasks when necessary in case of failures. It'll play a
    similar role as the JobTracker does in Hadoop. Communication of who's
    supposed to communicate to who will be done using Zookeeper.

    On Dec 10, 2:45?pm, Jerry Kuch wrote:
    Hi, Nathan...

    (cc-ing rabbitmq-discuss so that other folks who've been around Rabbit longer than I have can chime in as well).

    Thanks for the additional motivation... ?I think variations on this request have come up in discussion before, so let's keep some other eyes on the topic...

    So in this system I'm developing, a message from a persistent queue like RabbitMQ will trigger a computation across the cluster, creating potentially many messages and involving many machines. I don't want to ack the message until all the computations it has triggered has finished. ?I have a way of detecting this efficiently and reliably, and it's a different machine than the one that read the message off the queue that will know the message is ready to be acked.

    Out of curiosity what are you using as your distributed coordination mechanism over the other machines and processes involved in your cluster? ?I wonder if perhaps there's some information available to or through it that your Rabbit-facing processes might be able to use straightforwardly...

    If there's a failure at any point (like an exception or machine going down), the system will never detect that the computations for the message has finished. So I want to handle this by putting a time limit on how long the system has to process the message. This is where setting timeouts on acking would come in, and it would be very useful.

    Gotcha... ?I follow where you're going...

    What I'm going to have to do to get RabbitMQ viable as a source for this system is handle acking/timeouts on the node that reads from RabbitMQ. When the downstream node detects that the message is finished, it will have to send a message to the reader node to ack the message. The reader node will also have to cache messages in memory and have a separate thread explicitly reject messages that are in the cache for too long. Not terrible, but would be nice if this was native in RabbitMQ.

    Right. ?As stated your design sketch sounds totally viable although, as you say, it does put work into your application that one might prefer to have done elsewhere... :-)

    You get correctness out of the deal in that your intermediate reader process that owns the channel on which the original do-work message was de-queued will make the message available for later consumption by someone else if it dies, but of course it has to implement its own safekeeping and timeout logic for deciding when to give up on things. ?Via basic.reject it also gets the option of throwing the message back to be handled again later or condemning it to oblivion.

    Jerry

    On Fri, Dec 10, 2010 at 9:55 AM, Jerry Kuch <jer... at vmware.comwrote:

    Hi, Nathan...
    On Dec 9, 2010, at 5:21 PM, nathanmarz wrote:

    Let's just say I'm working on the "Cascalog for stream-processing",
    although the semantics of the tool will be quite different of course.
    The goal is to be able to do distributed stream processing with the
    details of fault tolerance and messaging abstracted away. The rest I'm
    keeping secret for now :)
    Very interesting... ?I look forward to seeing it develop!
    Alternatively, is it possible to ack messages out of order from how
    you read them off the queue?
    It is indeed.

    In AMQP, acking a message tells the broker that you have received it and taken responsibility for its contents, and that the broker no longer has to take any measures to ensure its integrity or existence.

    When you take a message off a queue, say with 'basic.get', it still exists on the server, but is unavailable to other consumers until such time as you 'basic.ack' it. ?If you basic.ack the message, the message is removed from the server. ?If your channel or connection dies, the message is again made available for other consumers.

    If you pulled a smattering of messages from a queue and dispatched them to separate workers in your application, and your workers finished and sent acks in some order other than the order in which the messages were enqueued, there's no problem at all.
    Also, is it possible to set a timeout on
    acking a message, so that if a message isn't acked within X secs it's
    automatically considered failed and scheduled for redelivery?
    Such a mechanism doesn't currently exist. ?In practice, this usually won't be a problem since failure to ack (modulo the case of a consumer that's alive but paralyzed for some reason) is typically the result of network troubles or client death, both of which will result in the connection being closed. ?If the connection closes, the un-acked message on the server becomes available for other clients to get or consume.

    Are there specific aspects of your desired use case that make such a timeout-waiting-for-ack super important?

    Best regards,
    Jerry

    --
    Twitter: @nathanmarzhttp://nathanmarz.com<http://nathanmarz.com/>

    _______________________________________________
    rabbitmq-discuss mailing list
    rabbitmq-disc... at lists.rabbitmq.comhttps://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouprabbitmq-discuss @
categoriesrabbitmq
postedDec 10, '10 at 12:47a
activeDec 10, '10 at 11:09p
posts5
users2
websiterabbitmq.com
irc#rabbitmq

2 users in discussion

Nathanmarz: 3 posts Jerry Kuch: 2 posts

People

Translate

site design / logo © 2022 Grokbase