FAQ
all,

I have a RAC environment consisting of 2 nodes (Linux x86_64), 10.2.0.3.

The voting disk & OCR are on raw devices, but all datafiles are on OCFS2.

For UAT we ran a scenario in which the cables (public & private) were
pulled from one node:
- when node2's cables were pulled, the expected behavior occurred:
node2 restarted and node1 stayed running.
- when node1's cables were pulled, node2 restarted and node1 stayed
running; we expected node1 to restart and node2 to stay running.

It seems that node1 has become the master node (at the clusterware layer).

Is this the expected behavior, or is there some configuration we have missed?


  • Jeremy Schneider at Jan 18, 2008 at 6:52 pm
    As far as I know, this is the expected behavior. I'm pretty sure that the
    public network doesn't factor into resolving a split-brain scenario (which
    is what you're testing) for OCFS, Clusterware, or the database (any of
    which could reboot the node in this situation; whichever times out first).

    Really, this just shows how important it is to protect your
    interconnect by bonding multiple physical connections. It also illustrates
    why the interconnect and public networks should not share *any* physical
    components, NICs or switches. In a proper configuration it should take at
    least a double failure to produce this situation (although really everything
    should have bonding for even further protection). There was another post on
    the list within the past two or three days about sharing NICs between the
    public network and the interconnect.
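    To illustrate the bonding suggestion: on RHEL-era Linux, an active-backup
    bond for the interconnect might look roughly like the fragment below. This
    is a hypothetical sketch; device names, addresses, and bonding options will
    vary by site, and each slave NIC should cable to a different switch.

```
# /etc/sysconfig/network-scripts/ifcfg-bond0  (hypothetical interconnect bond)
DEVICE=bond0
IPADDR=192.168.1.1
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=active-backup miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth1  (one of the slave NICs)
DEVICE=eth1
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
```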

    -Jeremy

    --
    Jeremy Schneider
    Chicago, IL
    http://www.ardentperf.com/category/technical

    --
    http://www.freelists.org/webpage/oracle-l
  • LS Cheng at Jan 18, 2008 at 7:18 pm
    hi

    It's expected behaviour: when the cluster is divided into sub-clusters with
    the same number of nodes, i.e. one node per sub-cluster in a two-node
    cluster configuration, the lowest-numbered node always survives.

    thanks

    --
    LSC
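    The tie-break described above can be sketched as a toy model (my own
    illustration of the rule, not Oracle's actual eviction algorithm): the
    larger sub-cluster survives, and when the sub-clusters are the same size,
    the one containing the lowest node number wins.

```python
def surviving_subcluster(subclusters):
    """Pick which sub-cluster keeps running after a split.

    Prefer the largest sub-cluster; break ties by lowest node number.
    """
    return min(subclusters, key=lambda sc: (-len(sc), min(sc)))

# Two-node cluster, cables pulled on either node: node 1's side survives.
print(surviving_subcluster([{1}, {2}]))        # {1}
# Four-node cluster split 3-vs-1: the majority survives regardless.
print(surviving_subcluster([{2, 3, 4}, {1}]))  # {2, 3, 4}
```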
  • Ujang Jaenudin at Jan 18, 2008 at 11:26 pm
    jeremy & cheng,

    Yup, that behavior is expected; I confirmed it after reading an Oracle
    presentation on RAC internals.
    BTW, what happens if we suddenly shut down node1 by switching off the
    server's power?

    How does that work, and how long before node2 makes itself the master
    node (clusterware layer)?
    As far as I know, the disktimeout is 200s and the network heartbeat
    timeout is around 60s...

    regards
    ujang
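    For reference, the two timeouts mentioned above can be queried from CSS
    with crsctl on 10.2 (run from the clusterware home; exact output format
    varies by patch level, so this is only a sketch):

```shell
# Network heartbeat timeout (misscount, in seconds; default 60 on Linux)
crsctl get css misscount

# Voting disk I/O timeout (disktimeout, in seconds; available from 10.2.0.2)
crsctl get css disktimeout
```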
