Hi,


After getting a small test cluster to run successfully, I am now
getting an unexpected Mnesia error on one of the nodes. Here is a
snippet of the messages in my RabbitMQ log for the node
rabbit@LAKE-HP165:

Mnesia('rabbit@LAKE-HP165'): ** ERROR ** Mnesia on 'rabbit@LAKE-HP165'
could not connect to node(s) ['rabbit@LAKE-DS140']

This message is apparently being generated about every 30 seconds!


The cluster seemed to work initially: I could publish a message from
any node to a pub/sub exchange, and each queue/node properly received
it. But since that initial test, the RabbitMQ install on
rabbit@LAKE-HP165 does not seem to work. None of the other nodes can
see it when I run rabbitmqctl -n ... status. The queue that I created
for this node does not show up in the management console. When I run
rabbitmqctl status on the problem node, none of the plugins show up,
although they did when I initially installed Rabbit.


Two of the nodes in the cluster are dual-NIC machines, i.e. the PCs
sit on two different networks (one NIC is mapped to 172.x.x.x, the
other to 10.x.x.x). The node/PC that seems to have failed also has two
interfaces, but only one is permanent (a physical NIC mapped to
172.x.x.x); the other (mapped to 10.x.x.x) is a VPN connection that I
connect and disconnect as needed.

So, quick questions:
1) Is the dual NIC setup of the nodes the cause of the problem? (I
realize that a cluster is not designed to work over a partitioned
network.)

2) I know that RabbitMQ/Erlang can be 'pinned' to a specific IP
address and port. Would this be a solution if the dual NICs are the
problem?

OR

Is it just not safe to use a cluster on a dual NIC node (i.e. do I
have to use federation)?

Well, thanks in advance for any help

Regards
David



I've got a small test cluster of three nodes. Each node is a Windows
XP machine. Each node has two NICs: one NIC is mapped to our
development environment network, the other NIC is mapped to our
production network.


  • Emile Joubert at Feb 10, 2012 at 2:33 pm
    Hi,
    On 07/02/12 15:02, davidib wrote:
    > Mnesia('rabbit@LAKE-HP165'): ** ERROR ** Mnesia on 'rabbit@LAKE-HP165'
    > could not connect to node(s) ['rabbit@LAKE-DS140']
    In the other message thread with a similar subject ("rabbitmqctl -n
    ..... status fails one way but not the other") you expressed your
    intention to set up firewalls correctly to allow clustering. When you do
    that can you confirm that this problem is also solved? It seems likely
    that they both share the same root cause.



    -Emile
  • Davidib at Feb 10, 2012 at 2:54 pm
    Hi Emile,

    Yes, I'll let you know if tweaking the firewall works okay.

    1.) I'm a bit concerned regarding the solution; again, I can't find any
    documentation regarding port 1052. I very much appreciate your
    help, but I'm a bit concerned about the amount of time it's taking to
    'research/discover' the nuances/gotchas. I realize that this comes
    with the territory; however, we do have a project that has to be
    completed by Aug 2012 that is currently based on RabbitMQ. My
    allocated time to get RabbitMQ up and running in a reasonably robust
    way is 4-6 weeks. For a RabbitMQ newbie, do you think this timeframe
    is doable?

    2.) Just to make sure, regarding the firewall issue: all the hosts I
    plan to run RabbitMQ on are on the same physical switch, share
    the same 10.8.xxx subnet, etc. The firewall (Windows Firewall) is up
    as a matter of standard configuration/practice. I don't know anyone
    who isn't running a firewall on a production server. If clustering
    really doesn't work well over the firewall, doesn't that imply its
    usage is limited to a very tiny population of production
    infrastructures?

    3.) I'm following up on using a shovel approach vs a cluster, so
    possibly the cluster issues I've been having are moot for me. But in
    the spirit of community I'll follow up on your request and let you
    know.

    Again Emile I really appreciate your help/feedback. Thanks again

    David




    _______________________________________________
    rabbitmq-discuss mailing list
    rabbitmq-disc... at lists.rabbitmq.com
    https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
  • Emile Joubert at Feb 10, 2012 at 3:28 pm
    Hi David,
    On 10/02/12 14:54, davidib wrote:
    > 1.) I'm a bit concerned regarding the solution, again I can't find any
    > documentation regarding port 1052. I very very much appreciate your
    You won't find any documentation on that port number, because it was
    randomly assigned when your Erlang VM started up. If you reboot then it
    will change. At the time when this diagnostic information was produced
    the value was correct:

    Error: unable to connect to node 'rabbit@LAKE-DS140': nodedown
    diagnostics:
    - nodes and their ports on LAKE-DS140: [{rabbit,1052}]

    You can find the current set of ports by running "epmd -names". You will
    find epmd in the erts/bin folder of your Erlang installation.
    > allocated time to get RabbitMQ up and running in a reasonably robust
    > way is 4-6 weeks. As a rabbitmq newbie, do you think this timeframe
    > is doable?
    Our aim is that you should have a single-node broker running in a few
    minutes after downloading. It should not be a complicated affair to set
    up. Clustering takes a bit more time to set up, so you should allow for
    that. I'm in no position to validate estimates though! Please let us
    have your suggestions for improvements to RabbitMQ that will help you
    reach your goal.
    > who isn't running a firewall on a production server. If clustering
    > really doesn't work well over the firewall, doesn't that imply its
    > usage is limited to a very tiny population of production
    > infrastructures?
    Clustering works fine through a firewall, provided that the instructions
    here are followed:
    http://www.rabbitmq.com/clustering.html#firewall
    If you have followed these instructions and run into difficulties then
    please let us know.


    -Emile
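
To illustrate the `epmd -names` suggestion above, here is a minimal Python sketch that turns that command's output into a node-to-port mapping, which may help when checking which port a firewall rule has to cover. The output layout (`name <node> at port <N>`) and the sample text are assumptions based on typical epmd output and are not quoted from this thread; the function names and the `epmd_path` parameter are hypothetical.

```python
import re
import subprocess

# Assumed layout of one registration line printed by `epmd -names`,
# e.g. "name rabbit at port 1052".
NAME_LINE = re.compile(r"^name\s+(\S+)\s+at\s+port\s+(\d+)$")

# Illustrative sample only; the port matches the diagnostics quoted above.
SAMPLE_OUTPUT = (
    "epmd: up and running on port 4369 with data:\n"
    "name rabbit at port 1052\n"
)


def parse_epmd_names(output):
    """Return {node_name: distribution_port} parsed from `epmd -names` text."""
    ports = {}
    for line in output.splitlines():
        match = NAME_LINE.match(line.strip())
        if match:
            ports[match.group(1)] = int(match.group(2))
    return ports


def current_node_ports(epmd_path="epmd"):
    """Run `epmd -names` and parse the result.

    epmd_path is an assumption: point it at the epmd binary in the
    erts/bin folder of the Erlang installation if it is not on PATH.
    """
    result = subprocess.run(
        [epmd_path, "-names"], capture_output=True, text=True, check=True
    )
    return parse_epmd_names(result.stdout)
```

Because the node port is assigned at VM startup, a mapping obtained this way is only valid until the node restarts.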
  • Davidib at Feb 10, 2012 at 3:43 pm
    Hey Emile,

    Now it makes a LOT more sense! I thought I had taken care of this
    issue by explicitly opening port 4369. I didn't realize that the epmd
    port was randomly assigned. Looks like I can avoid the dynamic nature
    of the port assignment by using the ERL_EPMD_PORT environment
    variable!

    Thanks again
    David
  • Emile Joubert at Feb 10, 2012 at 3:57 pm
    Hi,
    On 10/02/12 15:43, davidib wrote:
    > Now it makes ALOT more sense! I thought I had taken care of this
    > issue via explicitly opening port 4369. Didn't realize that the epmd
    > port was randomly assigned. Looks like I can avoid the dynamic nature
    > of the port assignment by using the ERL_EPMD_PORT environment
    > variable!
    That's not the problem. You need to allow connections to the port mapper
    daemon as well as to the Erlang nodes. The former is fixed (defaults to
    4369) but can be changed via ERL_EPMD_PORT. The latter is random and can
    be controlled via the inet_dist_listen_min and inet_dist_listen_max
    Erlang parameters.


    -Emile
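
The distinction drawn above (one fixed port for the port mapper daemon, plus a range for the Erlang nodes themselves) can be sketched in Python as follows. This is only an illustration under stated assumptions: the range 9100-9102 used in the usage note is a made-up example of values one might give to inet_dist_listen_min and inet_dist_listen_max, and the helper names are hypothetical.

```python
import socket

DEFAULT_EPMD_PORT = 4369  # epmd default; overridable via ERL_EPMD_PORT


def ports_to_allow(env, dist_min, dist_max):
    """List the TCP ports a firewall must admit for clustering.

    env      -- mapping of environment variables (e.g. os.environ)
    dist_min -- value chosen for inet_dist_listen_min
    dist_max -- value chosen for inet_dist_listen_max
    """
    if dist_min > dist_max:
        raise ValueError("dist_min must not exceed dist_max")
    epmd_port = int(env.get("ERL_EPMD_PORT", DEFAULT_EPMD_PORT))
    # One port for the port mapper, then every port in the node range.
    return [epmd_port] + list(range(dist_min, dist_max + 1))


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `ports_to_allow({}, 9100, 9102)` lists 4369 followed by 9100, 9101 and 9102; running `is_reachable` against each of those ports from another cluster member would show which ones the firewall still blocks. Note that a port in the node range only accepts connections while a node is actually listening on it.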
  • Davidib at Feb 10, 2012 at 4:05 pm
    Hi Emile,

    OK, got it! It's starting to come together... it's starting to make
    a lot of sense...

    Thanks again
    David

Discussion Overview
group: rabbitmq-discuss
categories: rabbitmq
posted: Feb 7, '12 at 3:02p
active: Feb 10, '12 at 4:05p
posts: 7
users: 2
website: rabbitmq.com
irc: #rabbitmq

2 users in discussion

Davidib: 4 posts Emile Joubert: 3 posts
