As far as I know, this is the expected behavior. I'm pretty sure that the
public network doesn't factor in when resolving a split-brain scenario (which
is what you're testing) - not for OCFS2, clusterware, or the database (any of
which could reboot the node in this situation - whichever times out first).
Really, this just shows how important it is that you protect your
interconnect by bonding multiple physical connections. It also illustrates
why the interconnect and public networks should not share *any* physical
components - NICs or switches. In a proper configuration it should take at
least a double failure to produce this situation (although really everything
should be bonded for even further protection). There was another post on
the list within the past two or three days talking about sharing NICs for
the public net and the interconnect.
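
To make the tie-break concrete, here is a minimal sketch in Python. It
assumes the rule as I understand it for clusterware of this vintage: the
larger sub-cluster survives, and when the sub-clusters are the same size,
the one holding the lowest node number survives. The function name
`surviving_cohort` and the numbering (node1 = 1, node2 = 2) are made up for
illustration; none of this is taken from the clusterware itself.

```python
def surviving_cohort(cohorts):
    """Pick the sub-cluster that stays up after the interconnect splits.

    `cohorts` is a list of sets of node numbers, one set per group of
    nodes that can still reach each other over the interconnect.
    Assumed rule (illustrative, not Oracle's code): the biggest cohort
    wins; on a tie, the cohort with the lowest node number wins.
    """
    return max(cohorts, key=lambda c: (len(c), -min(c)))


# Two-node cluster with the interconnect cut: each node sits alone in
# its own cohort, and nothing in the input says whose cable was pulled.
print(surviving_cohort([{1}, {2}]))  # {1}
```

Under that assumption the input looks the same whichever node lost its
cables, so node1 stays up in both of your tests and node2 is the one that
reboots.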
On 1/17/08, Ujang Jaenudin wrote:
I have a RAC environment consisting of 2 nodes (Linux x86_64).
The voting disk and OCR are on raw devices,
but all datafiles are on OCFS2.
For UAT we have a scenario in which we pull the cables (public & private)
from one node.
- When the node2 cables were pulled, we got the expected behavior:
node2 restarted and node1 stayed running.
- When the node1 cables were pulled, node2 restarted and node1 stayed
running. We expected node1 to restart and node2 to stay running.
It seems that node1 has become the master node (at the clusterware layer).
Is this expected behavior, or is there some configuration we missed?