We've let loose one of our testing ninjas on RabbitMQ for load testing, and
we're consistently running into issues when the high memory watermark is
hit.



Windows Server 2003 32-bit, Erlang R15B 32-bit, Rabbit 2.7.1



2,000 Consumers each with their own queue bound to a direct exchange

1 Producer, publishing a 2 MB message to the exchange, once every second,
for a total of 50 seconds
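
For a sense of scale, the load described above can be tallied in a few lines. This is a back-of-the-envelope sketch, not a measurement: it assumes every one of the 2,000 queues is bound so that it receives each published message, it takes 1 MB to mean 1,048,576 bytes, and the function and key names are made up for illustration. How much of this the broker actually holds in memory at once depends on how quickly the consumers drain and acknowledge, which is not stated here.

```python
# Back-of-the-envelope tally of the load test described above.
QUEUES = 2000
MESSAGE_BYTES = 2 * 1024 * 1024  # assumes 1 MB = 1,048,576 bytes
PUBLISHES = 50                   # one publish per second for 50 seconds
MIB = 1024 * 1024


def load_summary(queues=QUEUES, message_bytes=MESSAGE_BYTES, publishes=PUBLISHES):
    """Return inbound and outbound volume for the whole run, in MiB."""
    inbound_total = message_bytes * publishes
    # Assumption: each publish is routed to every queue, so it must be
    # delivered once per consumer.
    outbound_per_publish = message_bytes * queues
    outbound_total = outbound_per_publish * publishes
    return {
        "queue_entries": queues * publishes,
        "inbound_total_mib": inbound_total // MIB,
        "outbound_per_publish_mib": outbound_per_publish // MIB,
        "outbound_total_mib": outbound_total // MIB,
    }


if __name__ == "__main__":
    for name, value in load_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions the producer sends only 100 MiB over the whole run, but each publish fans out into 4,000 MiB of deliveries, so whatever the consumers cannot take within a second is still queued in the broker when the next publish arrives.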



Everything behaves as expected, until the memory footprint hits the high
watermark, at which point:

On a physical machine: ERL process crashes and dump file is created

On a Virtual Machine: Blue Screen of Death is shown and server reboots



VM environment = VMware vCenter Lab Manager 4.0 (4.0.3.1318)



One other note is that we see the same problem with ERL R14B04 and Rabbit
2.7.0.



I have looked through the log file and also turned on the console debug
output, and nothing jumps out as an error. If needed, I can upload the
minidump from the Blue Screen and the ERL crash dump file; just point me
to where to put them.



Let me know if there is anything else I can do to try and help get this
fixed.







In the rabbit log, there are no errors, and only a few warnings 20 seconds
before the crash:



=INFO REPORT==== 11-Jan-2012::10:55:53 ===

closing TCP connection <0.4405.0> from 10.6.64.104:57830



=WARNING REPORT==== 11-Jan-2012::10:55:53 ===

exception on TCP connection <0.20552.0> from 10.6.64.104:59521

connection_closed_abruptly





In the console output log file for the physical machine, this is the only
message I see:



starting direct_client
...done

starting notify cluster nodes
...done



broker running

Eshell V5.9 (abort with ^G)

(rabbit@QEDLP082)1>

Crash dump was written to: C:/Documents and Settings/Administrator.QEDLP/Application Data/RabbitMQ/erl_crash.dump

eheap_alloc: Cannot allocate 6731340 bytes of memory (of type "heap").

in message_loop

win32sysinfo:Erlang has closed.






  • Jerry Kuch at Jan 11, 2012 at 6:43 pm
    James: Out of curiosity, have you tried the new 64-bit release of
    Erlang for Windows in your environment? The address space size
    limitations of the 32-bit version have been associated with crashy
    Rabbits in the past (although bringing your memory high watermark
    value down so that the back-pressure mechanisms engage when the
    broker is in less trouble may help). I think you can scare up the
    new Erlang here:

    http://www.erlang.org/download/otp_win64_R15B.exe

    Until recently there was no 64-bit Erlang, so even those running on
    64-bit Windows boxes were still relegated to 32-bit VMs.

    I am curious about the different results between a physical machine
    and a virtualized one, with one showing a "clean" Erlang VM crash and
    the other exhibiting a blue-screen, fatal OS-wrecker...

    Is the traffic you're using to bring these systems down part of a
    large or proprietary app, or can you extract a bare minimum piece
    of code that brings the pain and share it with us? If you could
    do the latter we could more easily investigate the situation within
    VMware since the difference in behavior between bare metal and
    virtualization is disquieting...

    Best regards,
    Jerry
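
For reference, lowering the high watermark as suggested above is a small change in the broker's configuration file. This is a sketch only, assuming the classic Erlang-term rabbitmq.config format used by the 2.x series; the value 0.3 is purely illustrative (the documented default is 0.4, expressed as a fraction of installed RAM) and is not a figure recommended anywhere in this thread.

```erlang
%% rabbitmq.config (on Windows typically %APPDATA%\RabbitMQ\rabbitmq.config)
%% 0.3 is an illustrative value, not one suggested in the thread.
[
  {rabbit, [
    {vm_memory_high_watermark, 0.3}
  ]}
].
```

The broker has to be restarted to pick up a change made in this file.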

  • James Poole at Jan 11, 2012 at 6:51 pm
    Yeah, I should have mentioned that we started out testing with the 64-bit version and found this issue... though the VM probably didn't have very much more memory than a 32-bit address space would provide. Then we backed down to the 32-bit version to see if it went away, but it didn't.

    I will see if we can send out the test program (it's just a simple Java app using the rabbitmq-java-client-2.7.1). If I can send it out, how would I go about this... attach it to the email or upload it to a server somewhere?

    -James

  • Jerry Kuch at Jan 11, 2012 at 7:16 pm
    Hi, James:

    I'm not sure we have a totally great way of handing off the files. If
    they're small enough just emailing me a .tar.gz-file or the like is great.
    Otherwise, we might consider Dropbox or the like...

    Thanks for poking at it further.

    Out of curiosity, are you looking at deploying Rabbit into production on
    Windows, or is this mostly on dev/test machines? Anecdotally it seems like
    the various Linuxes get much wider production use than the other platforms,
    and thus at any moment there are more eyes prying and more hands exercising
    them...

    Best regards,
    Jerry

  • James Poole at Jan 12, 2012 at 1:54 pm
    I am in the process of getting the tool to reproduce and will email it to you when it's available.

    We were planning on deploying in production on Windows, though if this is not something that is recommended, I would definitely like to know about that now :)

    -James

  • Simone Busoli at Jan 11, 2012 at 8:01 pm
    Hi James,

    If you can provide more details about the load you're applying to the
    broker I would be glad to try to reproduce it.
    We've been using RabbitMQ on Windows in production for some months now and
    haven't experienced any weird behavior.
    What I'm interested in is whether entities and messages are durable, if you
    use transactions or publisher confirms and the like.
  • James Poole at Jan 12, 2012 at 5:55 pm
    Simone, that would be great if you could try to reproduce it.



    As mentioned, we are creating 2000 consumers each with their own queue bound
    to a fanout exchange. After the queues have all been created and bound, a
    producer publishes a 2 MB message to this fanout exchange once every second
    for 50 seconds.



    All queues are non-durable. And autoAck was set to false in the Java
    client.



    Everything hums along until the vm_memory_high_watermark is triggered and
    then we see the crash. One interesting thing is that in the log it still
    shows it accepting and starting TCP connections after the memory alarm is
    triggered (for around 15 seconds before the crash). I thought this was
    supposed to block until the memory was under control?



    -James



    From: Simone Busoli [mailto:simone.busoli at gmail.com]
    Sent: Wednesday, January 11, 2012 3:02 PM
    To: Poole, James
    Cc: rabbitmq-discuss at lists.rabbitmq.com; Kuch, Jerry (VMware)
    Subject: Re: [rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens
    under Load



    Hi James,

    If you can provide more details about the load you're applying to the broker
    I would be glad to try to reproduce it.
    We've been using RabbitMQ on Windows in production for some months now and
    didn't experience any weird behavior.
    What I'm interested in is whether entities and messages are durable, if you
    use transactions or publisher confirms and the like.

    On Jan 11, 2012 7:52 PM, wrote:

    Yeah, I should have mentioned that we started out testing with the 64-bit
    version and found this issue... though the VM probably didn't have very much
    more memory than a 32-bit address space would provide. Then we backed down
    to the 32-bit version to see if it went away, but it didn't.

    I will see if we can send out the test program (it's just a simple java app
    using the rabbitmq-java-client-2.7.1). If I can send it out, how would I go
    about this... attach to the email or upload it to a server somewhere?

    -James

    -----Original Message-----
    From: Jerry Kuch [mailto:jerryk at vmware.com]
    Sent: Wednesday, January 11, 2012 1:44 PM
    To: Poole, James
    Cc: rabbitmq-discuss at lists.rabbitmq.com
    Subject: Re: [rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens
    under Load

    James: Out of curiousity have you tried the new 64-bit release of
    Erlang for Windows in your environment? The address space size
    limitations of the 32-bit version have been associated with crashy
    Rabbits in the past (although bringing your memory high watermark
    value down so that the back-pressure mechanisms engage when the
    broker is in less trouble may help). I think you can scare up the
    new Erlang here:

    http://www.erlang.org/download/otp_win64_R15B.exe

    Until recently there was no 64-bit Erlang, so even those running on
    64-bit Windows boxes were still relegated to 32-bit VMs.

    I am curious about the different results between a physical machine
    and a virtualized one, with one showing a "clean" Erlang VM crash and
    the other exhibiting a blue-screen, fatal OS-wrecker...

    Is the traffic you're using to bring these systems down part of a
    large or proprietary app, or can you extract a bare minimum piece
    of code that brings the pain and share it with us? If you could
    do the latter we could more easily investigate the situation within
    VMware since the difference in behavior between bare metal and
    virtualization is disquieting...

    Best regards,
    Jerry
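
The suggestion above to bring the memory high watermark down could be expressed in rabbitmq.config roughly as follows. This is a minimal sketch: the value 0.3 is an illustrative assumption (the thread names no number), chosen only because it sits below the broker's default of 0.4.

```erlang
%% rabbitmq.config (Erlang terms).
%% 0.3 is an assumed, illustrative fraction of installed RAM; it is not a value
%% recommended anywhere in this thread. Tune it for the machine under test.
[
  {rabbit, [
    {vm_memory_high_watermark, 0.3}
  ]}
].
```

The file is read when the broker starts, so the node has to be restarted before a new value takes effect.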

  • Simone Busoli at Jan 12, 2012 at 6:02 pm
    Are you acking messages? If you don't, those messages will have to be
    stored somewhere and 2000 * 2MB * 1/s = 4 GB/s
    On Thu, Jan 12, 2012 at 18:55, wrote:

    Simone, that would be great if you could try to reproduce it.

    As mentioned, we are creating 2000 consumers each with their own queue
    bound to a fanout exchange. After the queues have all been created and
    bound, a producer publishes a 2 MB message to this fanout exchange once
    every second for 50 seconds.

    All queues are non-durable. And autoAck was set to false in the Java
    client.

    Everything hums along until the vm_memory_high_watermark is triggered and
    then we see the crash. One interesting thing is that in the log it still
    shows it accepting and starting tcp connections after the memory alarm is
    triggered (for around 15 seconds before the crash). I thought this was
    supposed to block until the memory was under control?

    -James

  • James Poole at Jan 12, 2012 at 6:17 pm
    We are sending the Ack from the consumer side after the message is
    delivered. I will try setting autoAck to true and see if that makes a
    difference.



    I thought Rabbit was smart enough to just keep one copy of a message for a
    fanout? Does it really create 2000 separate copies of the same message in
    memory, or does it just do this when the message is being sent?



    -James



  • Simone Busoli at Jan 12, 2012 at 6:22 pm
    I'm honestly not sure; please take my last mail with a grain of salt, as I
    might have hit send too fast. I always assumed that each queue would get
    its own copy of a message, but in fact it's totally sensible for RabbitMQ
    to keep only one copy of the message in such a case.
    The devs will be able to shed some more light on this; I just meant to
    point out that big unacked messages may lead to problems, as they need to
    be stored somewhere.
  • Jerry Kuch at Jan 12, 2012 at 11:37 pm
    Simone and James:

    James's assumption is generally right. Fanout does *not* require actual,
    literal copies of a message to exist for all of the receiving queues bound
    to a fanout exchange....

    Best regards,
    Jerry
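
To put rough numbers on the two storage models discussed in this thread (one copy of the message per queue versus one shared copy per published message), here is a minimal back-of-the-envelope sketch. The queue count, message size and publish rate come from the thread. Everything else is an assumption made for illustration: 1 MB is treated as 10**6 bytes, no acks are assumed to arrive during the run, and per-queue bookkeeping overhead is ignored.

```python
# Rough memory estimate for the load test described in this thread.
# From the thread: 2,000 queues, 2 MB messages, 1 publish per second, 50 seconds.
# Assumptions (not from the thread): no acks arrive during the run and
# per-queue bookkeeping overhead is ignored.

QUEUES = 2000
MESSAGE_MB = 2
PUBLISHES = 50  # one publish per second for 50 seconds


def copy_per_queue_mb(queues: int, message_mb: int, publishes: int) -> int:
    """Pessimistic model: every queue holds its own copy of every message body."""
    return queues * message_mb * publishes


def shared_body_mb(message_mb: int, publishes: int) -> int:
    """Optimistic model: one stored body per published message, shared by all queues."""
    return message_mb * publishes


per_second_mb = QUEUES * MESSAGE_MB  # growth per publish in the pessimistic model
worst = copy_per_queue_mb(QUEUES, MESSAGE_MB, PUBLISHES)
best = shared_body_mb(MESSAGE_MB, PUBLISHES)

print(f"copy per queue: {per_second_mb} MB/s, {worst} MB after {PUBLISHES} s")
print(f"shared body:    {MESSAGE_MB} MB/s, {best} MB after {PUBLISHES} s")
```

The pessimistic figure reproduces the 4 GB/s estimate quoted earlier in the thread. Neither figure accounts for buffers used while a message is being written out to each consumer's connection, so both are only rough reference points for reasoning about the test.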

    ----- Original Message -----
    From: "Simone Busoli" <simone.busoli at gmail.com>
    To: "james poole" <james.poole at rsa.com>
    Cc: rabbitmq-discuss at lists.rabbitmq.com, jerryk at vmware.com
    Sent: Thursday, January 12, 2012 10:22:22 AM
    Subject: Re: [rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens under Load

    I'm honestly not sure, please take my last mail with a grain of salt, I might have hit send too fast. I always assumed that each queue would get its own copy of a message, but in fact it's totally sensible for RabbitMQ to keep only one copy of the message in such case.
    The devs will be able to shed some more light on this, I just meant to point out that big unacked messages may lead to problems as they need to be stored somewhere.


    On Thu, Jan 12, 2012 at 19:17, wrote:






    We are sending the Ack from the consumer side after the message is delivered. I will try setting autoAck to true and see if that makes a difference.



    I thought Rabbit was smart enough to just keep one copy of a message for a fanout? Does it really create 2000 separate copies of the same message in memory, or does it just do this when the message is being sent?



    -James




    From: Simone Busoli [mailto: simone.busoli at gmail.com ]
    Sent: Thursday, January 12, 2012 1:03 PM


    To: Poole, James
    Cc: rabbitmq-discuss at lists.rabbitmq.com ; Kuch, Jerry (VMware)
    Subject: Re: [rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens under Load







    Are you acking messages? If you don't, those messages will have to be stored somewhere and 2000 * 2MB * 1/s = 4 GB/s


    On Thu, Jan 12, 2012 at 18:55, wrote:



    Simone, that would be great if you could try to reproduce it.



    As mentioned, we are creating 2000 consumers each with their own queue bound to a fanout exchange. After the queues have all been created and bound, a producer publishes a 2 MB message to this fanout exchange once every second for 50 seconds.



    All queues are non-durable. And autoAck was set to false in the Java client.



    Everything hums along until the vm_memory_high_watermark is triggered and then we see the crash. One interesting thing is that in the log it still shows it accepting and starting tcp connections after the memory alarm is triggered (for around 15 seconds before the crash). I thought this was supposed to block until the memory was under control?



    -James




    From: Simone Busoli [mailto: simone.busoli at gmail.com ]
    Sent: Wednesday, January 11, 2012 3:02 PM
    To: Poole, James
    Cc: rabbitmq-discuss at lists.rabbitmq.com ; Kuch, Jerry (VMware)




    Subject: Re: [rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens under Load





    Hi James,

    If you can provide more details about the load you're applying to the broker I would be glad to try to reproduce it.
    We've been using RabbitMQ on Windows in production for some months now and didn't experience any weird behavior.
    What I'm interested in is whether entities and messages are durable, if you use transactions or publisher confirms and the like.


    On Jan 11, 2012 7:52 PM, wrote:

    Yeah, I should have mentioned that we started out testing with the 64-bit version and found this issue... though the VM probably didn't have very much more memory than a 32-bit address space would provide. Then we backed down to the 32-bit version to see if it went away, but it didn't.

    I will see if we can send out the test program (it's just a simple java app using the rabbitmq-java-client-2.7.1). If I can send it out, how would I go about this... attach to the email or upload it to a server somewhere?

    -James

    -----Original Message-----
    From: Jerry Kuch [mailto: jerryk at vmware.com ]
    Sent: Wednesday, January 11, 2012 1:44 PM
    To: Poole, James
    Cc: rabbitmq-discuss at lists.rabbitmq.com
    Subject: Re: [rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens under Load

    James: Out of curiousity have you tried the new 64-bit release of
    Erlang for Windows in your environment? The address space size
    limitations of the 32-bit version have been associated with crashy
    Rabbits in the past (although bringing your memory high watermark
    value down so that the back-pressure mechanisms engage when the
    broker is in less trouble may help). I think you can scare up the
    new Erlang here:

    http://www.erlang.org/download/otp_win64_R15B.exe

    Until recently there was no 64-bit Erlang, so even those running on
    64-bit Windows boxes were still relegated to 32-bit VMs.

    I am curious about the different results between a physical machine
    and a virtualized one, with one showing a "clean" Erlang VM crash and
    the other exhibiting a blue-screen, fatal OS-wrecker...

    Is the traffic you're using to bring these systems down part of a
    large or proprietary app, or can you extract a bare minimum piece
    of code that brings the pain and share it with us? If you could
    do the latter we could more easily investigate the situation within
    VMware since the difference in behavior between baremetal and
    virtualization is disquieting...

    Best regards,
    Jerry
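Jerry's suggestion above to bring the memory high watermark down can be made concrete with a little arithmetic. The sketch below assumes the watermark is expressed as a fraction of the memory visible to the broker; the 2 GiB total and the candidate fractions are hypothetical values chosen for illustration, not figures from James's machines.

```java
public class WatermarkSketch {
    /** Bytes of memory use at which the alarm would trip, for a given fraction of total memory. */
    public static long thresholdBytes(double fraction, long totalBytes) {
        if (fraction < 0.0 || totalBytes < 0) {
            throw new IllegalArgumentException("fraction and totalBytes must be non-negative");
        }
        return (long) (fraction * totalBytes);
    }

    public static void main(String[] args) {
        long assumedTotal = 2L * 1024 * 1024 * 1024; // hypothetical: 2 GiB visible to the broker
        double[] fractions = {0.4, 0.3, 0.2};        // hypothetical candidate settings
        for (double f : fractions) {
            long mib = thresholdBytes(f, assumedTotal) / (1024 * 1024);
            System.out.println("watermark " + f + " -> alarm at about " + mib + " MiB");
        }
    }
}
```

A lower fraction makes the alarm fire earlier, which leaves more headroom between the point where publishers are throttled and the point where a 32-bit Erlang VM can no longer allocate heap.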

    _______________________________________________
    rabbitmq-discuss mailing list
    rabbitmq-discuss at lists.rabbitmq.com
    https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss

  • Simon MacMullen at Jan 13, 2012 at 11:47 am

    On 12/01/12 23:37, Jerry Kuch wrote:
    > James's assumption is generally right. Fanout does *not* require actual,
    > literal copies of a message to exist for all of the receiving queues bound
    > to a fanout exchange....
    It's not just fanout exchanges. If a message gets delivered to more than
    one queue we will only persist one copy of the body to disc, and will
    only have one copy of the body in RAM (umm, if it's > 64 bytes in length).

    Cheers, Simon

    --
    Simon MacMullen
    RabbitMQ, VMware
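Simon's point about a single in-memory copy of the body can be pictured with a toy model. This is only an illustration of reference sharing in Java, not RabbitMQ's actual implementation (the broker is written in Erlang); the class and method names are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Queue;
import java.util.Set;

public class SharedBodySketch {
    /** Routes one message body to every queue by reference, without copying it. */
    public static void fanOutShared(byte[] body, List<Queue<byte[]>> queues) {
        for (Queue<byte[]> q : queues) {
            q.add(body);
        }
    }

    /** Counts distinct body arrays (compared by identity) at the heads of the queues. */
    public static int distinctBodies(List<Queue<byte[]>> queues) {
        Set<byte[]> seen = Collections.newSetFromMap(new IdentityHashMap<byte[], Boolean>());
        for (Queue<byte[]> q : queues) {
            byte[] head = q.peek();
            if (head != null) {
                seen.add(head);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<Queue<byte[]>> queues = new ArrayList<>();
        for (int i = 0; i < 2000; i++) {
            queues.add(new ArrayDeque<>());
        }
        byte[] body = new byte[2 * 1024 * 1024]; // one 2 MB message body
        fanOutShared(body, queues);
        // prints "queues: 2000, distinct bodies in memory: 1"
        System.out.println("queues: " + queues.size()
                + ", distinct bodies in memory: " + distinctBodies(queues));
    }
}
```

Each queue still needs its own bookkeeping entry for the message, so memory use grows with the number of queues, just far more slowly than it would with 2000 full copies.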
  • James Poole at Jan 13, 2012 at 2:25 pm
    Just to follow up, using autoAck=true did not resolve the crash. I will continue trying a few different scenarios and I am waiting for the test engineer to give me a stripped down java program to reproduce the issue. I'll send it out when I get it.

    -James


    -----Original Message-----
    From: Jerry Kuch [mailto:jerryk at vmware.com]
    Sent: Thursday, January 12, 2012 6:37 PM
    To: Simone Busoli
    Cc: rabbitmq-discuss at lists.rabbitmq.com; Poole, James
    Subject: Re: [rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens under Load

    Simone and James:

    James's assumption is generally right. Fanout does *not* require actual,
    literal copies of a message to exist for all of the receiving queues bound
    to a fanout exchange....

    Best regards,
    Jerry

    ----- Original Message -----
    From: "Simone Busoli" <simone.busoli at gmail.com>
    To: "james poole" <james.poole at rsa.com>
    Cc: rabbitmq-discuss at lists.rabbitmq.com, jerryk at vmware.com
    Sent: Thursday, January 12, 2012 10:22:22 AM
    Subject: Re: [rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens under Load

    I'm honestly not sure, please take my last mail with a grain of salt, I might have hit send too fast. I always assumed that each queue would get its own copy of a message, but in fact it's totally sensible for RabbitMQ to keep only one copy of the message in such case.
    The devs will be able to shed some more light on this, I just meant to point out that big unacked messages may lead to problems as they need to be stored somewhere.


    On Thu, Jan 12, 2012 at 19:17, wrote:

    We are sending the Ack from the consumer side after the message is delivered. I will try setting autoAck to true and see if that makes a difference.

    I thought Rabbit was smart enough to just keep one copy of a message for a fanout? Does it really create 2000 separate copies of the same message in memory, or does it just do this when the message is being sent?

    -James

    From: Simone Busoli [mailto: simone.busoli at gmail.com ]
    Sent: Thursday, January 12, 2012 1:03 PM
    To: Poole, James
    Cc: rabbitmq-discuss at lists.rabbitmq.com ; Kuch, Jerry (VMware)
    Subject: Re: [rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens under Load







    Are you acking messages? If you don't, those messages will have to be stored somewhere and 2000 * 2MB * 1/s = 4 GB/s
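The arithmetic in the question above can be worked through for both cases discussed in this thread: every queue holding its own copy of each unacked body, versus all queues sharing one copy. The sketch counts body bytes only and uses decimal megabytes to match the round numbers quoted; per-queue bookkeeping overhead is ignored, which is an assumption of the example.

```java
public class FanoutMemoryEstimate {
    static final long MB = 1_000_000L; // decimal megabytes, to match the thread's round numbers

    /** Body bytes accumulated if each queue kept its own copy (one message per second). */
    public static long perQueueCopiesBytes(int queues, long messageBytes, int seconds) {
        return (long) queues * messageBytes * seconds;
    }

    /** Body bytes accumulated if all queues shared a single copy (one message per second). */
    public static long sharedBodyBytes(long messageBytes, int seconds) {
        return messageBytes * seconds;
    }

    public static void main(String[] args) {
        int queues = 2000;          // consumers, each with its own queue
        long messageBytes = 2 * MB; // 2 MB message
        int testSeconds = 50;       // length of the load test

        System.out.println("per-queue copies, 1 s:  "
                + perQueueCopiesBytes(queues, messageBytes, 1) / MB + " MB");
        System.out.println("per-queue copies, 50 s: "
                + perQueueCopiesBytes(queues, messageBytes, testSeconds) / MB + " MB");
        System.out.println("shared body, 1 s:       "
                + sharedBodyBytes(messageBytes, 1) / MB + " MB");
        System.out.println("shared body, 50 s:      "
                + sharedBodyBytes(messageBytes, testSeconds) / MB + " MB");
    }
}
```

With these inputs the per-queue figure is 4000 MB for one second, which matches the 4 GB/s above, while the shared figure is 2 MB per second and 100 MB over the whole 50-second run if nothing is acknowledged.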


    On Thu, Jan 12, 2012 at 18:55, wrote:

    Simone, that would be great if you could try to reproduce it.

    As mentioned, we are creating 2000 consumers each with their own queue bound to a fanout exchange. After the queues have all been created and bound, a producer publishes a 2 MB message to this fanout exchange once every second for 50 seconds.

    All queues are non-durable. And autoAck was set to false in the Java client.

    Everything hums along until the vm_memory_high_watermark is triggered and then we see the crash. One interesting thing is that in the log it still shows it accepting and starting tcp connections after the memory alarm is triggered (for around 15 seconds before the crash). I thought this was supposed to block until the memory was under control?

    -James

    From: Simone Busoli [mailto: simone.busoli at gmail.com ]
    Sent: Wednesday, January 11, 2012 3:02 PM
    To: Poole, James
    Cc: rabbitmq-discuss at lists.rabbitmq.com ; Kuch, Jerry (VMware)
    Subject: Re: [rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens under Load





    Hi James,

    If you can provide more details about the load you're applying to the broker I would be glad to try to reproduce it.
    We've been using RabbitMQ on Windows in production for some months now and didn't experience any weird behavior.
    What I'm interested in is whether entities and messages are durable, if you use transactions or publisher confirms and the like.


  • Simon MacMullen at Jan 13, 2012 at 11:49 am

    On 12/01/12 17:55, james.poole at rsa.com wrote:
    > Everything hums along until the vm_memory_high_watermark is triggered
    > and then we see the crash. One interesting thing is that in the log it
    > still shows it accepting and starting tcp connections after the memory
    > alarm is triggered (for around 15 seconds before the crash). I thought
    > this was supposed to block until the memory was under control?
    It won't refuse connections. The idea is that it should accept
    connections, but then block them if they try to publish. If you want to
    connect to consume messages you should be allowed to even after the
    alarm is triggered...

    Cheers, Simon

    --
    Simon MacMullen
    RabbitMQ, VMware
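The behaviour Simon describes (connections accepted, publishers held back, consumers still served) can be sketched as a toy in-memory model. This is a simplification under stated assumptions: the class is hypothetical, and a blocked publish is modelled here as a `false` return value, whereas a real publisher would simply stall on its connection rather than get an answer.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class MemoryAlarmSketch {
    private final long highWatermarkBytes;
    private long usedBytes;
    private final Queue<byte[]> queue = new ArrayDeque<>();

    public MemoryAlarmSketch(long highWatermarkBytes) {
        this.highWatermarkBytes = highWatermarkBytes;
    }

    /** The alarm is active once memory use reaches the watermark. */
    public boolean alarmActive() {
        return usedBytes >= highWatermarkBytes;
    }

    /** New connections are accepted whether or not the alarm is active. */
    public boolean connect() {
        return true;
    }

    /** Stores the message unless the alarm is active, in which case the publisher is held back. */
    public boolean tryPublish(byte[] body) {
        if (alarmActive()) {
            return false;
        }
        queue.add(body);
        usedBytes += body.length;
        return true;
    }

    /** Consuming stays possible during the alarm and is what frees memory again. */
    public byte[] consume() {
        byte[] body = queue.poll();
        if (body != null) {
            usedBytes -= body.length;
        }
        return body;
    }

    public static void main(String[] args) {
        MemoryAlarmSketch broker = new MemoryAlarmSketch(3); // tiny watermark for the demo
        System.out.println("publish 2 bytes: " + broker.tryPublish(new byte[2])); // true
        System.out.println("publish 2 bytes: " + broker.tryPublish(new byte[2])); // true, alarm now active
        System.out.println("connect:         " + broker.connect());               // true
        System.out.println("publish 1 byte:  " + broker.tryPublish(new byte[1])); // false, held back
        broker.consume();                                                          // frees 2 bytes
        System.out.println("publish 1 byte:  " + broker.tryPublish(new byte[1])); // true again
    }
}
```

Seen this way, new TCP connections appearing in the log after the alarm fires is expected; the open question in the thread is why memory keeps growing to the point of a crash even though publishing should be throttled.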
  • James Poole at Jan 13, 2012 at 8:44 pm
    Jerry,

    I have modified the EmitLog.java and ReceiveLogs.java from the tutorials on the website to reproduce the crash (attached). If the mailing list strips these attachments out, just ping me if anyone wants a copy and I'll send them directly.

    Both files will need to be modified to change the address on the factory.setHost() call to your specific broker, and you will need to pass the path to a 2 MB+ file as an argument to the EmitLog process.

    Thanks for looking into this.

    -James
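Since the attachments were scrubbed from the archive, here is a minimal sketch of the one piece of the repro that James describes in words: taking a file path as a program argument and loading it as a payload of at least 2 MB. It is not the attached EmitLog.java; the class name and the size check are assumptions, and the broker connection (the factory.setHost() call mentioned above) is deliberately left out so the example runs with the standard library alone.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PayloadLoader {
    static final long MIN_BYTES = 2L * 1024 * 1024; // the "2 MB+" requirement, read here as 2 MiB

    /** Reads the whole file into memory, rejecting files smaller than MIN_BYTES. */
    public static byte[] loadPayload(String path) {
        try {
            Path p = Paths.get(path);
            long size = Files.size(p);
            if (size < MIN_BYTES) {
                throw new IllegalArgumentException(
                        "payload file is " + size + " bytes, need at least " + MIN_BYTES);
            }
            return Files.readAllBytes(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java PayloadLoader <path-to-2MB-file>");
            return;
        }
        byte[] body = loadPayload(args[0]);
        // In the real repro this byte array would be published to the exchange
        // once per second, after pointing the connection factory at your broker.
        System.out.println("loaded " + body.length + " bytes");
    }
}
```

Reading the file once and reusing the same array for every publish keeps the producer's own memory use flat, so any growth observed during the test can be attributed to the broker.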



    -------------- next part --------------
    A non-text attachment was scrubbed...
    Name: EmitLog.java
    Type: text/java
    Size: 2276 bytes
    Desc: not available
    URL: <http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20120113/4e2d18a2/attachment.bin>
    -------------- next part --------------
    A non-text attachment was scrubbed...
    Name: ReceiveLogs.java
    Type: text/java
    Size: 1630 bytes
    Desc: not available
    URL: <http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20120113/4e2d18a2/attachment-0001.bin>
  • James Poole at Jan 18, 2012 at 8:52 pm
    Has anyone had a chance to investigate this crash? I can re-send the repro source files if needed.

    Thanks,
    James

  • Jerry Kuch at Jan 18, 2012 at 9:08 pm
    Hi, James...

    Sorry, not yet meaningfully but it's on my list to hopefully get
    to in the next couple of days...

    Jerry


Discussion Overview
group: rabbitmq-discuss @
categories: rabbitmq
posted: Jan 11, '12 at 6:32p
active: Jan 18, '12 at 9:08p
posts: 17
users: 4
website: rabbitmq.com
irc: #rabbitmq
