Hello Riyaj

I'm re-installing the operating system on the machines, and tomorrow I'll
re-install Oracle RAC (with default settings, so I can check the crsd logs)
and try tuning this time.

Thanks again.

PS: In general, how long does it take between the first node stopping and
the second node bringing up the first node's VIP interface?

Good Night All.
Waldirio

2008/6/12 Riyaj Shamsudeen :
Hello Waldirio
Breaking up crsd.log, approximately 30 seconds were spent on CLSC recv/send
failures etc. The CSS misscount parameter is set to 30 on Unix platforms. I
would say misscount is controlling this duration, but that needs to be
validated by enabling further tracing and looking at cssd.log etc., if you want.

2008-06-12 14:19:15.781: [ OCRMSG][1484962144]prom_rpc: CLSC recv failure..ret code 7
2008-06-12 14:19:42.464: [ OCRMSG][1484962144]prom_rpc: CLSC send failure..ret code 6

Another 26 seconds were spent in the cluster reconfiguration below:

2008-06-12 14:19:46.036: [ OCRSRV][2541411904]proath_init: Failed to retrieve pubdata. Expect a rcfg
2008-06-12 14:20:12.283: [ OCRMAS][1210108256]th_master:12: I AM THE NEW OCR MASTER at incar 1. Node Number 1

Changing these parameters has a profound effect on availability, especially
if the network architecture is not good enough.
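
The gaps quoted above can be checked directly from the log timestamps. This is only a small sketch: the four timestamps are copied from the crsd.log excerpts in this message, and the helper name elapsed_seconds is made up for illustration.

```python
from datetime import datetime

# Timestamp layout used by the crsd.log lines quoted in this message.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two crsd.log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# CLSC recv failure -> CLSC send failure
clsc_gap = elapsed_seconds("2008-06-12 14:19:15.781", "2008-06-12 14:19:42.464")
# "Failed to retrieve pubdata" -> "I AM THE NEW OCR MASTER"
rcfg_gap = elapsed_seconds("2008-06-12 14:19:46.036", "2008-06-12 14:20:12.283")
# First quoted message -> last quoted message
total = elapsed_seconds("2008-06-12 14:19:15.781", "2008-06-12 14:20:12.283")

print(f"CLSC recv -> send failure: {clsc_gap:.3f} s")
print(f"reconfiguration:           {rcfg_gap:.3f} s")
print(f"first to last message:     {total:.3f} s")
```

With these four timestamps the two phases come to roughly 26.7 s and 26.2 s, and about 56.5 s from the first message to the last, which is in the neighbourhood of the low end of the 1 to 2 minutes reported.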

Cheers
Riyaj Shamsudeen
The Pythian Group: http://www.pythian.com/
Personal blog: http://orainternals.wordpress.com/

Waldirio Manhães Pinheiro wrote:
Hello Friend
Thank you for the answer; let's check.
2008/6/12, Riyaj Shamsudeen <riyaj.shamsudeen_at_gmail.com>:

Hello Waldirio
the time for the first machine to detect that the second machine is
powered off is very long (between 1 and 2 min),
How are you measuring this time? Are you checking alert log or
are you using DB connections to check it?

I measured this time from the moment I sent the shutdown to the server
until the second VIP interface came up on the second node (backup node).
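
One way to make this measurement repeatable from the client side is to poll the VIP and time the gap between the first failed probe and the next successful one. The following is only a sketch under assumptions: the function names are invented for illustration, the host name in the usage comment is hypothetical, and a TCP connect to the listener port is just one possible probe.

```python
import socket
import time


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def measure_outage(probe, interval=0.5, max_wait=300.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll probe() and return the seconds from the first failure to the next success.

    Returns None if no outage is observed, or if the service does not
    come back, within max_wait seconds.
    """
    start = clock()
    down_since = None
    while clock() - start < max_wait:
        ok = probe()
        now = clock()
        if not ok and down_since is None:
            down_since = now          # first failed probe: outage begins
        elif ok and down_since is not None:
            return now - down_since   # first success after the outage
        sleep(interval)
    return None


# Hypothetical usage: "cangua-vip" is a placeholder host name, and 1521 is
# the customary default listener port. Start this, then power off the node.
# print(measure_outage(lambda: tcp_probe("cangua-vip", 1521)))
```

The result is only accurate to about one polling interval plus the probe timeout, so it complements the crsd.log timestamps rather than replacing them.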

Can you also send crsd.log?

OK, here is a link, because of the size:
http://rafb.net/p/hqE13995.html
When I power off the first node, the second node (crsd log at the link
above) logs on line 1 the message "[ COMMCRS][1147169120]clsc_receive:
(0xc6d180) Error receiving, ns (12535, 12560), transport (505, 110, 0)" and
keeps reporting "Connection not active" until line 2045.
PS: Now the VIP address of the first node no longer migrates to the second
node after a power off ... (maybe it will be necessary to re-install the OS
and Oracle Clusterware, because I've changed the system a lot while testing)
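
For anyone repeating this, the line range of a repeated message can be pulled out of the log with a few lines of Python. A minimal sketch; the function name message_span and the file path in the usage comment are placeholders, not part of any Oracle tooling.

```python
from typing import Iterable, Optional, Tuple


def message_span(lines: Iterable[str], needle: str) -> Optional[Tuple[int, int, int]]:
    """Return (first_line, last_line, count) for lines containing needle.

    Line numbers are 1-based. Returns None if needle never appears.
    """
    first = last = None
    count = 0
    for number, line in enumerate(lines, start=1):
        if needle in line:
            if first is None:
                first = number
            last = number
            count += 1
    if first is None:
        return None
    return first, last, count


# Hypothetical usage; the path is a placeholder for the local crsd.log:
# with open("crsd.log") as log:
#     print(message_span(log, "Connection not active"))
```

Reading the timestamps on the first and last matching lines then shows how long the node kept reporting the message.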

Further, refer to $CRS_HOME/bin/racgvip; there are a few parameters,
such as check interval, restart attempts etc., controlling the behavior
of VIP failover too. Not sure they are applicable when the machine is
rebooted, since the heartbeat will fail before the VIP check.

Yes, I checked this file too, but didn't change it.
Now, looking at the crsd log file, I believe Oracle knows when another
node is out, but who is responsible for making the failover (mounting the
VIP aliases on another machine)? (Script, Daemon, Angel :P )
Thank you, friends, for the help.
Waldirio

Cheers
Riyaj Shamsudeen
The Pythian Group: http://www.pythian.com/
Personal blog: http://orainternals.wordpress.com/

Waldirio Manhães Pinheiro wrote:

Hello Friends
I'd like to ask about Oracle RAC in a Linux environment. I
installed two machines with RedHat AS 4Up5 and Oracle 10.2.0.3
with Clusterware. The installation finished successfully and the
database works fine.
I checked the availability of my environment with the test below:
Station cambeba UP
Station cangua UP
# crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....BA.lsnr application    ONLINE    ONLINE    cambeba
ora....eba.gsd application    ONLINE    ONLINE    cambeba
ora....eba.ons application    ONLINE    ONLINE    cambeba
ora....eba.vip application    ONLINE    ONLINE    cambeba
ora....UA.lsnr application    ONLINE    ONLINE    cangua
ora.cangua.gsd application    ONLINE    ONLINE    cangua
ora.cangua.ons application    ONLINE    ONLINE    cangua
ora.cangua.vip application    ONLINE    ONLINE    cangua
ora.ora10gq.db application    ONLINE    ONLINE    cangua
ora....q1.inst application    ONLINE    ONLINE    cangua
ora....q2.inst application    ONLINE    ONLINE    cambeba
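
To compare this listing before and after a power-off test, the output can be grouped by host. A rough sketch, assuming the five-column layout shown above; the function name resources_by_host is invented for illustration, and the sample rows are taken from the listing.

```python
from collections import defaultdict


def resources_by_host(crs_stat_output: str) -> dict:
    """Group ONLINE resource names by host from `crs_stat -t` style text."""
    hosts = defaultdict(list)
    for line in crs_stat_output.splitlines():
        fields = line.split()
        # Data rows have five whitespace-separated columns and start with
        # "ora"; the header, the separator, and rows without a host are skipped.
        if len(fields) == 5 and fields[0].startswith("ora"):
            name, _type, _target, state, host = fields
            if state == "ONLINE":
                hosts[host].append(name)
    return dict(hosts)


SAMPLE = """\
Name           Type           Target    State     Host
------------------------------------------------------------
ora....eba.vip application    ONLINE    ONLINE    cambeba
ora.cangua.vip application    ONLINE    ONLINE    cangua
ora.ora10gq.db application    ONLINE    ONLINE    cangua
"""

print(resources_by_host(SAMPLE))
```

Running the same grouping on output captured after the test makes it easy to see whether a VIP resource has moved to the surviving host.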
At this point everything is OK, but when I force a power off on
cangua or cambeba (the names of my machines), the time for the
first machine to detect that the second machine is powered off is
very long (between 1 and 2 min), so if my client is working, its
query will be lost to a timeout.
I changed the configuration of the ora.cambeba.vip and
ora.cangua.vip objects, but without success.
Any idea how to fix this problem (decrease the check time
between nodes in the cluster)?
PS: I searched the list archives, but found nothing about
this problem.

Thanks in advance.
--
______________
Regards,
Waldirio
msn: wmp_at_sinope.com.br
Site: www.waldirio.com.br
Blog: blog.waldirio.com.br
PGP: www.waldirio.com.br/public.html





--
http://www.freelists.org/webpage/oracle-l
