I have some C# code extending SimpleRpcServer and am getting an
exception when trying to shut down. In the documentation PDF, it said
to call the Close() function on your RpcServer when you are ready to
shutdown. Since the MainLoop() function blocks, the Close() has to be
called from another thread, right? I have code that looks something
like this:

//RpcServer is a member variable
parentObj.ShuttingDown += delegate {
RpcServer.Close();
};
// glossing over the AMQP config a little here
using(var connection = factory.CreateConnection("my-amqp-server"))
using(var model = connection.CreateModel())
using(var subscription = new Subscription(model, "queue", false))
using (RpcServer = new MyRpcServer(subscription, this)) { // MyRpcServer
extends SimpleRpcServer
RpcServer.MainLoop();
}

When I do fire the ShuttingDown event from another thread, we call
Close() with no problems, and then the MainLoop throws a
NullReferenceException:

System.NullReferenceException: Object reference not set to an instance of an object.
at RabbitMQ.Client.Impl.ModelBase.BasicCancel(String consumerTag)
at RabbitMQ.Client.MessagePatterns.Subscription.Close()
at RabbitMQ.Client.MessagePatterns.Subscription.System.IDisposable.Dispose()
at RabbitMQ.Client.MessagePatterns.SimpleRpcServer.MainLoop()

I followed this call stack through the RabbitMQ.Client source and didn't
see anything obvious.

This isn't causing me any harm right now (other than red flags in my
logs), but it would be if I tried to put anything important after the
MainLoop call.

Is there a cleaner way to shut down an RpcServer? Should I just swallow
this nullref when I'm shutting down?

Thanks,

--
Ryan Davis
Acceleration.net
Director of Programming Services
2831 NW 41st street, suite B
Gainesville, FL 32606

Office: 352-335-6500 x 124
Fax: 352-335-6506

Search Discussions

  • Matthias Radestock at Jun 8, 2009 at 8:12 pm
    Ryan,

    Ryan Davis wrote:
    I have some C# code extending SimpleRpcServer and am getting an
    exception when trying to shut down. In the documentation PDF, it said
    to call the Close() function on your RpcServer when you are ready to
    shutdown. Since the MainLoop() function blocks, the Close() has to be
    called from another thread, right?
    Looking at the code, I am pretty sure that Close() is meant to be called
    from inside the HandleCall/Cast functions. Calling it from anywhere else
    is probably unsafe, and the error you are getting looks
    threading-related. It would be useful to get a stack trace with line
    numbers to work out exactly where the NPE occurs.


    Regards,

    Matthias.
  • Ryan Davis at Jun 9, 2009 at 4:02 pm

    Matthias Radestock wrote:
    Looking at the code, I am pretty sure that Close() is meant to be
    called from inside the HandleCall/Cast functions.
    Really? The code comment on Close() says "Shut down the server, causing
    MainLoop() to return to its caller.", which is exactly what I want. I
    dug deeper into SimpleRpcServer, and what I really need is to break from
    MainLoop's foreach after the body of the loop has executed. The
    m_subscription is used after HandleCall / HandleCast / ProcessRequest,
    and I don't see how I could close it in HandleCall without creating
    problems later on down the line.

    If I call Close() at any point in HandleCall/Cast, then the byte[]
    intended for the RPC caller will be lost, and I expect it would throw a
    nullref when it tries to BasicPublish those bytes.
    If I call Close() at the end of ProcessRequest, then the message will be
    processed and the ack will be lost. This isn't so bad as I have other
    logic checking for duplicates, but I expect it would throw a nullref
    when it tries to ack in MainLoop.
    If I call Close() at the beginning of ProcessRequest, then the message
    will not be processed and the ack will be lost. This isn't so bad as we
    should have a redelivery, but I expect it would throw a nullref when it
    tries to ack in MainLoop.
    It would be useful to get a stack trace with line numbers to work out
    exactly where the NPE occurs.
    I'm afraid I might be a minor version behind at this point.
    System.NullReferenceException: Object reference not set to an instance of an object.
    at RabbitMQ.Client.Impl.ModelBase.BasicCancel(String consumerTag) in d:\Projects\rabbitmq-dotnet-client-1.5.3\src\client\impl\ModelBase.cs:line 668
    at RabbitMQ.Client.MessagePatterns.Subscription.Close() in d:\Projects\rabbitmq-dotnet-client-1.5.3\src\client\messagepatterns\Subscription.cs:line 223
    at RabbitMQ.Client.MessagePatterns.Subscription.System.IDisposable.Dispose() in d:\Projects\rabbitmq-dotnet-client-1.5.3\src\client\messagepatterns\Subscription.cs:line 454
    at RabbitMQ.Client.MessagePatterns.SimpleRpcServer.MainLoop() in d:\Projects\rabbitmq-dotnet-client-1.5.3\src\client\messagepatterns\SimpleRpcServer.cs:line 205

    Line 668 is:
    ModelShutdown -= new
    ModelShutdownEventHandler(k.m_consumer.HandleModelShutdown);

    I think I'm calling subscription.Close() multiple times, since my code
    is essentially:

    using(var subscription = new Subscription(model, "queue", false))
    using (RpcServer = new MyRpcServer(subscription, this)) {
    RpcServer.MainLoop();
    }

    So I think I'm calling subscription.Close() 3 times in 2 threads:
    # controller thread: calls RpcServer.Close()
    # worker thread: when the using() invokes subscription.Dispose()
    # worker thread: when the using() invokes RpcServer.Dispose()

    Subscription.Close() is checking if m_consumer is null before calling
    BasicCancel, so it looks like a race condition between the controlling
    thread calling RpcServer.Close() and the worker thread cleaning up with
    Dispose(), but I don't follow how the foreach in MainLoop would
    terminate and start Disposing of things before the controller thread had
    gotten through subscription.Close() to set m_consumer to null.

    I've got a unit test that reproduces this behavior, so I'll be playing
    with this today, stepping through to better understand what is happening.

    As a side note, the docs on
    http://www.rabbitmq.com/build-dotnet-client.html could use mention of
    the local.build file.

    Thanks,

    Ryan Davis
    Acceleration.net
    Director of Programming Services
    2831 NW 41st street, suite B
    Gainesville, FL 32606

    Office: 352-335-6500 x 124
    Fax: 352-335-6506
  • Ryan Davis at Jun 9, 2009 at 6:21 pm

    Ryan Davis wrote:
    I don't follow how the foreach in MainLoop would
    terminate and start Disposing of things before the controller thread had
    gotten through subscription.Close() to set m_consumer to null.

    I've got a unit test that reproduces this behavior, so I'll be playing
    with this today, stepping through to better understand what is happening.
    I think I figured it out. The C# "foreach" loop also acts like a
    "using" statement, and will call Dispose() on my subscription when it
    finishes, which I didn't realize
    (http://mark.michaelis.net/Blog/CForeachWithArrays.aspx). This means
    I'm actually calling Subscription.Close() a total of 4 times.

    Here's the chronology
    # controller thread: calls RpcServer.Close() -> Subscription.Close() ->
    Model.BasicCancel()
    # worker thread: throws and catches exceptions in Subscription.Next(),
    terminating MainLoop's foreach
    # worker thread: calls Subscription.Dispose() -> Subscription.Close() ->
    Model.BasicCancel()
    # controller thread: completes Model.BasicCancel(), sets some variables
    to null
    # worker thread: nullref trying to use those variables

    I added some Console.WriteLine statements to the Close/Dispose methods,
    and that solved the problem, so it's definitely a race condition when
    calling Close() from another thread.

    I'm going to add a flag and test out calling Close() from various points
    in HandleCall/HandleCast/ProcessRequest.

    Thanks,

    Ryan Davis
    Acceleration.net
    Director of Programming Services
    2831 NW 41st street, suite B
    Gainesville, FL 32606

    Office: 352-335-6500 x 124
    Fax: 352-335-6506
  • Ryan Davis at Jun 9, 2009 at 7:05 pm

    Ryan Davis wrote:
    I'm going to add a flag and test out calling Close() from various points
    in HandleCall/HandleCast/ProcessRequest.
    Wow, I really didn't think that one through.
    HandleCall/HandleCast/ProcessRequest are only called when there is a
    message to process. If there is no message to process, the
    SimpleRpcServer is blocked on Subscription.Next(), and will never
    exit. My current test case starts a SimpleRpcServer, then tries to
    stop it, without sending any messages.

    I changed my code so if the SimpleRpcServer sees a certain byte[]
    message (stored as ShutdownKey), it will close itself:

    public override void HandleCast(bool isRedelivered, IBasicProperties
    requestProperties, byte[] body)
    {
    for (var i = 0; i < body.Length; i++) {
    if (body[i] != ShutdownKey[i]) return;
    }
    Close(); //we must have matched, shut down.
    }

    As I expected, this throws a different exception:
    System.InvalidOperationException: Operation is not valid due to the current state of the object.
    at RabbitMQ.Client.MessagePatterns.Subscription.Next() in d:\Projects\rabbitmq-dotnet-client-1.5.3\src\client\messagepatterns\Subscription.cs:line 323
    at RabbitMQ.Client.MessagePatterns.Subscription.System.Collections.IEnumerator.MoveNext() in d:\Projects\rabbitmq-dotnet-client-1.5.3\src\client\messagepatterns\Subscription.cs:line 437
    at RabbitMQ.Client.MessagePatterns.SimpleRpcServer.MainLoop() in d:\Projects\rabbitmq-dotnet-client-1.5.3\src\client\messagepatterns\SimpleRpcServer.cs:line 206

    I'm going to start mucking about in SimpleRpcServer.cs/Subscription.cs
    to see what's the smalled change I can make to fix it.

    Thanks,

    Ryan Davis
    Acceleration.net
    Director of Programming Services
    2831 NW 41st street, suite B
    Gainesville, FL 32606

    Office: 352-335-6500 x 124
    Fax: 352-335-6506
  • Ryan Davis at Jun 9, 2009 at 10:03 pm
    Ok, last one today, I promise.

    I solved this problem my adding some explicit locks in Subscription and
    SimpleRpcServer. I tried to checkout the latest source (on linux,
    cygwin, and tortoiseHG) but kept getting a zlib error, so this patch is
    against the source in rabbitmq-dotnet-client-1.5.3.zip downloaded from
    http://www.rabbitmq.com/dotnet.html.

    This patch makes SimpleRpcServer.Close threadsafe by adding:

    1. locking SimpleRpcServer.m_subscription around the body of
    SimpleRpcServer.MainLoop's foreach (so we can't close while
    processing)
    2. checking for a null Subscription.Consumer before processing a
    message in SimpleRpcServer.MainLoop, but after we've locked
    m_subscription (to ensure we don't process after we've closed)
    3. locking this around the body of Subscription.Close (to ensure we
    don't close that same subscription at the same time)

    I *think* that solves the potential race conditions. It certainly makes
    my tests work, and running "nant unit" on the 1.5.3 source gives 2
    failures in TestSharedQueue.cs that look timing-related.

    Thanks,

    Ryan Davis
    Acceleration.net
    Director of Programming Services
    2831 NW 41st street, suite B
    Gainesville, FL 32606

    Office: 352-335-6500 x 124
    Fax: 352-335-6506



    Ryan Davis wrote:
    Ryan Davis wrote:
    I'm going to add a flag and test out calling Close() from various points
    in HandleCall/HandleCast/ProcessRequest.
    Wow, I really didn't think that one through.
    HandleCall/HandleCast/ProcessRequest are only called when there is a
    message to process. If there is no message to process, the
    SimpleRpcServer is blocked on Subscription.Next(), and will never
    exit. My current test case starts a SimpleRpcServer, then tries to
    stop it, without sending any messages.

    I changed my code so if the SimpleRpcServer sees a certain byte[]
    message (stored as ShutdownKey), it will close itself:

    public override void HandleCast(bool isRedelivered, IBasicProperties
    requestProperties, byte[] body)
    {
    for (var i = 0; i < body.Length; i++) {
    if (body[i] != ShutdownKey[i]) return;
    }
    Close(); //we must have matched, shut down.
    }

    As I expected, this throws a different exception:
    System.InvalidOperationException: Operation is not valid due to the current state of the object.
    at RabbitMQ.Client.MessagePatterns.Subscription.Next() in d:\Projects\rabbitmq-dotnet-client-1.5.3\src\client\messagepatterns\Subscription.cs:line 323
    at RabbitMQ.Client.MessagePatterns.Subscription.System.Collections.IEnumerator.MoveNext() in d:\Projects\rabbitmq-dotnet-client-1.5.3\src\client\messagepatterns\Subscription.cs:line 437
    at RabbitMQ.Client.MessagePatterns.SimpleRpcServer.MainLoop() in d:\Projects\rabbitmq-dotnet-client-1.5.3\src\client\messagepatterns\SimpleRpcServer.cs:line 206

    I'm going to start mucking about in SimpleRpcServer.cs/Subscription.cs
    to see what's the smalled change I can make to fix it.

    Thanks,

    Ryan Davis
    Acceleration.net
    Director of Programming Services
    2831 NW 41st street, suite B
    Gainesville, FL 32606

    Office: 352-335-6500 x 124
    Fax: 352-335-6506


    _______________________________________________
    rabbitmq-discuss mailing list
    rabbitmq-discuss at lists.rabbitmq.com
    http://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
    -------------- next part --------------
    An HTML attachment was scrubbed...
    URL: http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20090609/a5f582cc/attachment.htm
    -------------- next part --------------
    An embedded and charset-unspecified text was scrubbed...
    Name: threadsafe-rpcserver.patch
    Url: http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20090609/a5f582cc/attachment.txt
  • Matthias Radestock at Jun 9, 2009 at 10:30 pm
    Ryan,

    Ryan Davis wrote:
    I solved this problem my adding some explicit locks in Subscription and
    SimpleRpcServer.
    Thanks for the patch. If you don't mind, we'll look into this in a few
    weeks when the person who wrote the original code is back from holidays.

    Some locking is most certainly required if you are going to call Close
    from a separate thread. So the question is whether we should cater for
    this in the first place (based on your findings I am pretty sure we need
    to), and if so whether the locks are in the right place.

    On the latter point, the locks look a bit too coarse-grained to me,
    covering activities that can take an indeterminate amount of time, such
    as interactions with the server. That may be unavoidable, but without
    understanding every single detail of the code it's hard to be sure.


    Regards,

    Matthias.
  • Ryan Davis at Jun 10, 2009 at 1:07 pm

    Matthias Radestock wrote:
    Thanks for the patch. If you don't mind, we'll look into this in a few
    weeks when the person who wrote the original code is back from holidays.
    Certainly. I'll put my patch into production today, and will send you a
    new patch if I run into problems.
    On the latter point, the locks look a bit too coarse-grained to me,
    covering activities that can take an indeterminate amount of time,
    such as interactions with the server. That may be unavoidable, but
    without understanding every single detail of the code it's hard to be
    sure.
    They are a bit heavy-handed. It is very possible that Subscription /
    SimpleRpcServer could be refactored to maintain thread safety with finer
    locks. I was wanting to make minimal changes, so I didn't go that
    route. I walked through it with another co-worker for a sanity check,
    and locking those large blocks seemed the only way to ensure correctness
    in these cases:

    1. calling Close() while the subscription is blocked waiting for a
    message
    2. calling Close() while the server is processing a message
    3. calling Close() after the subscription has fetched a message but
    before the server has started processing it

    I'm looking forward to hearing from the original author next month.

    Thanks,

    Ryan Davis
    Acceleration.net
    Director of Programming Services
    2831 NW 41st street, suite B
    Gainesville, FL 32606

    Office: 352-335-6500 x 124
    Fax: 352-335-6506

    -------------- next part --------------
    An HTML attachment was scrubbed...
    URL: http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20090610/e9d87451/attachment.htm
  • Ryan Davis at Jul 16, 2009 at 6:02 pm

    Matthias Radestock wrote:
    Thanks for the patch. If you don't mind, we'll look into this in a few
    weeks when the person who wrote the original code is back from holidays.
    Has your .NET programmer had a chance to look at this patch yet? I have
    not had any problems running it in production (albeit with a light
    message load).

    Thanks,

    Ryan Davis
    Acceleration.net
    Director of Programming Services
    2831 NW 41st street, suite B
    Gainesville, FL 32606

    Office: 352-335-6500 x 124
    Fax: 352-335-6506
  • Tony Garnock-Jones at Jul 17, 2009 at 12:06 pm
    Hi Ryan,

    Ryan Davis wrote:
    Has your .NET programmer had a chance to look at this patch yet? I have
    not had any problems running it in production (albeit with a light
    message load).
    I've taken a look at the issue this morning; things are moving at last.
    Sorry for the delay. I've just committed a proposed fix with a lighter
    locking pattern to branch bug20945; hopefully it will pass QA soon and
    be merged into default.

    Regards,
    Tony
    --
    [][][] Tony Garnock-Jones | Mob: +44 (0)7905 974 211
    [][] LShift Ltd | Tel: +44 (0)20 7729 7060
    [] [] http://www.lshift.net/ | Email: tonyg at lshift.net
  • Ryan Davis at Jul 20, 2009 at 3:27 pm

    Tony Garnock-Jones wrote:
    I've taken a look at the issue this morning; things are moving at last.
    Sorry for the delay. I've just committed a proposed fix with a lighter
    locking pattern to branch bug20945; hopefully it will pass QA soon and
    be merged into default.
    Great! No worries on the delay, hope you had a great holiday. I'd like
    to clarify where I'm supposed to call SimpleRpcServer.Close(). Looking
    at the Shutdownable examples in branch bug20945, it looks like I'm
    supposed to send the server a special "shut me down" AMQP message, and
    have the server call Close(). Is that correct?

    Thanks,

    Ryan Davis
    Acceleration.net
    Director of Programming Services
    2831 NW 41st street, suite B
    Gainesville, FL 32606

    Office: 352-335-6500 x 124
    Fax: 352-335-6506
  • Tony Garnock-Jones at Jul 20, 2009 at 3:49 pm
    Hi Ryan,

    Ryan Davis wrote:
    Great! No worries on the delay, hope you had a great holiday. I'd like
    to clarify where I'm supposed to call SimpleRpcServer.Close(). Looking
    at the Shutdownable examples in branch bug20945, it looks like I'm
    supposed to send the server a special "shut me down" AMQP message, and
    have the server call Close(). Is that correct?
    The example has two styles of call to Close: one from within the
    server's own mainloop, as you describe, but also one in a detached
    thread sitting outside the mainloop. It seems important to me to be able
    to robustly shut down a Subscription from both within and without. So
    your original example should work fine, too.

    Tony

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouprabbitmq-discuss @
categoriesrabbitmq
postedJun 5, '09 at 2:20p
activeJul 20, '09 at 3:49p
posts12
users3
websiterabbitmq.com
irc#rabbitmq

People

Translate

site design / logo © 2023 Grokbase