Anyone tried kill ASM in 11gR2 RAC?
Hi

I was doing some destructive cluster tests on 11gR2 RAC a few days ago.

One of the tests was to kill ASM and see how that affects Clusterware
operation, since the OCR and Voting Disks are located in ASM (the OCRDATA Disk
Group). After killing ASM nothing happened, as it was quickly started up
again. So far so good. The next test was the same, but this time changing the
ASM disks' ownership so that when ASM is restarted the OCR Disk Group cannot
be accessed. Surprisingly ASM started up, the database Disk Group was mounted,
and the OCR Disk Group obviously did not get mounted, but the cluster kept
working without any problems.

So how is this happening? Doesn't Clusterware need to read and write the
Voting Disk every second? I was expecting a Clusterware failure on the node,
but everything worked as if everything were OK.

Thanks!

  • K Gopalakrishnan at Jan 21, 2010 at 12:38 pm
    Clusterware failure will happen _only_ when it cannot access the
    physical devices (disk timeout in CSS), and shutting down ASM does not
    revoke access to the disks. In your case Clusterware _knows_ the
    location of the OCR/voting information on the ASM disks, so it can
    continue reading and writing even while the ASM instance is down.
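
    You can see this from the OS with crsctl alone (a rough sketch; $GRID_HOME
    stands for the Grid Infrastructure home in your environment):

    # CSS records the voting files by physical device path and reads/writes
    # them directly, without going through the ASM instance.
    $GRID_HOME/bin/crsctl query css votedisk

    # The CSS disk timeout mentioned above: how long the voting disks can be
    # inaccessible before CSS treats them as failed.
    $GRID_HOME/bin/crsctl get css disktimeout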

    -Gopal
  • LS Cheng at Jan 21, 2010 at 12:44 pm
    Hi

    So even if the OCRDATA Disk Group is not mounted and the physical disks have
    root.root ownership instead of grid.oinstall, Clusterware will be up and
    running? So basically you mean Clusterware does not need ASM to be up in
    order to access the OCRDATA disks?

    My test was (roughly the commands sketched below):

    - kill ASM
    - change the ASM disks (OCRDATA) from grid.oinstall to root.root
    - check the Clusterware status, which was up and running
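
    A minimal sketch of those steps as shell commands (the device path below is
    just an example, and $GRID_HOME stands for the Grid Infrastructure home):

    # Kill the ASM instance by killing its pmon process.
    kill -9 $(pgrep -f asm_pmon)

    # Change ownership of the OCRDATA disk(s) so that a restarted ASM can no
    # longer open them (example device path).
    chown root:root /dev/rhdisk10

    # Check whether Clusterware still reports itself as healthy.
    $GRID_HOME/bin/crsctl check crs
    $GRID_HOME/bin/crsctl stat res -t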

    Thanks
  • Bobak, Mark at Jan 21, 2010 at 1:36 pm
    Yep, makes sense, I think.

    Clusterware starts, ASM serves up the OCR and voting disk geometry as it relates to the raw devices that make up your OCRDATA diskgroup, and Clusterware caches that info, so it no longer needs to talk to ASM for it.

    You do the damage, including changing ownership of the devices that make up the OCRDATA diskgroup to root:root. But the Clusterware processes run as root, so they can still read/write those raw devices.

    What happens if you chown the devices to root:root, then also chmod 000 all those devices?
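
    Something like this, for example (the device names are just placeholders
    for whatever raw devices make up OCRDATA, and $GRID_HOME stands for the
    Grid Infrastructure home):

    # Take away both the ownership and all permission bits on the devices.
    chown root:root /dev/raw/raw1 /dev/raw/raw2
    chmod 000       /dev/raw/raw1 /dev/raw/raw2

    # Then watch whether CSS/CRS stay healthy or start reporting errors.
    $GRID_HOME/bin/crsctl check crs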

    -Mark

  • LS Cheng at Jan 21, 2010 at 5:53 pm
    That sounds OK, but is that written anywhere? Is there anything that states
    Clusterware does not really need the ASM instances to be up in order to
    access the disks, and only needs ASM to get the disk information?

    Thanks!

    --
    LSC
  • LS Cheng at Jan 23, 2010 at 8:43 am
    Hi

    I did further tests. The results I posted previously were from a test on AIX
    6.1 with 11gR2; I have now run the same test on Linux x86-64:

    kill asm pmon
    chown root.root asmdisk

    The crsd process dies because it cannot access the OCR (the Disk Group is not mounted):

    2010-01-23 09:38:59.944: [ OCRASM][801350224]proprasmo: The ASM disk group
    OCR is not found or not mounted
    2010-01-23 09:38:59.944: [ OCRRAW][801350224]proprioo: Failed to open
    [+OCR]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
    2010-01-23 09:38:59.944: [ OCRRAW][801350224]proprioo: No OCR/OLR devices
    are usable
    2010-01-23 09:38:59.944: [ OCRASM][801350224]proprasmcl: asmhandle is NULL
    2010-01-23 09:38:59.944: [ OCRRAW][801350224]proprinit: Could not open raw
    device

    2010-01-23 09:38:59.944: [ OCRASM][801350224]proprasmcl: asmhandle is NULL
    2010-01-23 09:38:59.945: [ OCRAPI][801350224]a_init:16!: Backend init
    unsuccessful : [26]
    2010-01-23 09:38:59.945: [ CRSOCR][801350224] OCR context init failure.

    Error: PROC-26: Error while accessing the physical storage ASM error [SLOS:
    cat=8, opn=kgfoOpenFile01, dep=15056, loc=kgfokge

    ORA-17503: ksfdopn:DGOpenFile05 Failed to open file +OCR.255.4294967295
    ORA-17503: ksfdopn:2 Failed to open file +OCR.255.4294967295
    ORA-15001: diskgroup "OCR"

    ] [8]
    2010-01-23 09:38:59.945: [ CRSD][801350224][PANIC] CRSD exiting: Could
    not init OCR, code: 26
    2010-01-23 09:38:59.945: [ CRSD][801350224] Done.

    crsctl gives an error:

    [root@grid1 ~]# /u01/grid/11.2.0/bin/crsctl stat res -t
    CRS-4535: Cannot communicate with Cluster Ready Services
    CRS-4000: Command Status failed, or completed with errors.

    However, ASM and cssd are still up and running.
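
    One way to confirm this from the shell (a sketch, using the same Grid home
    path as in the crsctl command above):

    # Stack-level check: with crsd dead, CRS should be unreachable while the
    # HA services and CSS still respond.
    /u01/grid/11.2.0/bin/crsctl check crs

    # Process-level check: ocssd.bin and the ASM pmon should still be present,
    # while crsd.bin should be gone.
    ps -ef | egrep 'ocssd|crsd|asm_pmon' | grep -v egrep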

    So we have a completely different scenario: the same test gives two different
    results on two operating systems.

    Thanks!
