The two commands below are generally used to check the status of CRS. The first command reports the status of CRS on the local node, whereas the second reports the CRS status across all the nodes in the cluster.

crsctl check crs        <<-- for the local node
crsctl check cluster    <<-- for all nodes in the cluster
[root@node1-pub ~]# crsctl check crs
Cluster Synchronization Services appears healthy
Cluster Ready Services appears healthy
Event Manager appears healthy
[root@node1-pub ~]#
For the command below to run, CSS must be running on the local node. An "ONLINE" status for a remote node means that CSS is running on that node; when CSS is down on a remote node, "OFFLINE" is displayed for that node instead.

[root@node1-pub ~]# crsctl check cluster
node1-pub ONLINE
node2-pub ONLINE
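If you want the full per-daemon health report from every node, rather than just the CSS status that crsctl check cluster reports, one approach is to loop over the nodes with ssh. A minimal sketch, assuming passwordless root ssh and a CRS home of /u01/app/crs (both are assumptions; adjust to your environment):

for node in `olsnodes`; do
    echo "### $node"
    ssh $node /u01/app/crs/bin/crsctl check crs    # run the local check on each node
done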
I use the commands below to get the name of the cluster. The same information can also be retrieved from an OCR dump or export.

ocrdump -stdout -keyname SYSTEM | grep -A 1 clustername | grep ORATEXT | awk '{print $3}'

OR

ocrconfig -export /tmp/ocr_exp.dat -s online
for i in `strings /tmp/ocr_exp.dat | grep -A 1 clustername` ; do if [ $i != 'SYSTEM.css.clustername' ]; then echo $i; fi; done

OR

Oracle creates a directory with the same name as the cluster under $ORA_CRS_HOME/cdata.
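Alternatively, the cemutlo utility that ships in the Clusterware home can print the cluster name directly (run it from $ORA_CRS_HOME/bin):

cemutlo -n    <<-- prints the cluster name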
The command below can be used to find out the number of nodes registered in the cluster. It also displays each node's public name, private name, and virtual (VIP) name along with its node number.

olsnodes -n -p -i

[root@node1-pub ~]# olsnodes -n -p -i
node1-pub 1 node1-prv node1-vip
node2-pub 2 node2-prv node2-vip
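If you only need the number of nodes, a quick sketch is to count the lines of plain olsnodes output:

olsnodes | wc -l    <<-- number of nodes registered in the cluster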
The command below is used to view the number of voting disks configured in the cluster.

crsctl query css votedisk
The ocrcheck command displays the number of OCR files configured in the cluster. It is primarily used to check the integrity of the OCR files; it also displays the OCR version as well as storage space information. You can have at most two OCR files (the OCR and its mirror).

[root@node1-pub ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3848
         Available space (kbytes) :     258272
         ID                       :  744414276
         Device/File Name         : /u02/ocfs2/ocr/OCRfile_0
                                    Device/File integrity check succeeded
         Device/File Name         : /u02/ocfs2/ocr/OCRfile_1
                                    Device/File integrity check succeeded

         Cluster registry integrity check succeeded
Disktimeout: the maximum disk latency, in seconds, allowed from a node to a voting disk (disk I/O). The default value is 200.

Misscount: the maximum network latency, in seconds, allowed from node to node across the interconnect (network I/O). The default value is 60 seconds on Linux and 30 seconds on other Unix platforms. Misscount must be less than Disktimeout.

IF (Disk IO time > Disktimeout) OR (Network IO time > Misscount)
THEN
    REBOOT NODE
ELSE
    DO NOT REBOOT
END IF;

The current values can be queried as below:

crsctl get css disktimeout
crsctl get css misscount
crsctl get css reboottime
[root@node1-pub ~]# crsctl get css disktimeout
200

[root@node1-pub ~]# crsctl get css misscount
Configuration parameter misscount is not defined.

The above message indicates that misscount has not been set manually, so it takes its default value, which is 60 seconds on Linux. It can be changed as below.

[root@node1-pub ~]# crsctl set css misscount 100
Configuration parameter misscount is now set to 100.
[root@node1-pub ~]# crsctl get css misscount
100

The command below sets the value of misscount back to its default value.

crsctl unset css misscount

[root@node1-pub ~]# crsctl unset css misscount
[root@node1-pub ~]# crsctl get css reboottime
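To collect all three CSS parameters in one pass, they can be queried in a small loop; a minimal sketch:

for p in disktimeout misscount reboottime; do
    echo "== css $p =="
    crsctl get css $p    # prints the current value, or a message if not defined
done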
Removing OCR File

(1) Get the existing OCR file information by running the ocrcheck utility.

[root@node1-pub ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3852
         Available space (kbytes) :     258268
         ID                       :  744414276
         Device/File Name         : /u02/ocfs2/ocr/OCRfile_0  <-- OCR
                                    Device/File integrity check succeeded
         Device/File Name         : /u02/ocfs2/ocr/OCRfile_1  <-- OCR Mirror
                                    Device/File integrity check succeeded

         Cluster registry integrity check succeeded
(2) The first command removes the OCR mirror (/u02/ocfs2/ocr/OCRfile_1). If you want to remove the OCR file itself (/u02/ocfs2/ocr/OCRfile_0), run the second command instead.

ocrconfig -replace ocrmirror
ocrconfig -replace ocr
[root@node1-pub ~]# ocrconfig -replace ocrmirror
[root@node1-pub ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3852
         Available space (kbytes) :     258268
         ID                       :  744414276
         Device/File Name         : /u02/ocfs2/ocr/OCRfile_0  <<-- OCR File
                                    Device/File integrity check succeeded
         Device/File not configured  <-- OCR mirror no longer exists

         Cluster registry integrity check succeeded
Adding OCR

You need to add an OCR or OCR mirror file when you want to move the existing OCR to a different device. The command below adds the OCR mirror file if the OCR file already exists.

(1) Get the current status of the OCR:

[root@node1-pub ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3852
         Available space (kbytes) :     258268
         ID                       :  744414276
         Device/File Name         : /u02/ocfs2/ocr/OCRfile_0  <<-- OCR File
                                    Device/File integrity check succeeded
         Device/File not configured  <-- OCR mirror does not exist

         Cluster registry integrity check succeeded

As can be seen, there is only one OCR file and no second file (OCR mirror). The command below adds the second OCR file.

ocrconfig -replace ocrmirror <file name>
[root@node1-pub ~]# ocrconfig -replace ocrmirror /u02/ocfs2/ocr/OCRfile_1
[root@node1-pub ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3852
         Available space (kbytes) :     258268
         ID                       :  744414276
         Device/File Name         : /u02/ocfs2/ocr/OCRfile_0
                                    Device/File integrity check succeeded
         Device/File Name         : /u02/ocfs2/ocr/OCRfile_1
                                    Device/File integrity check succeeded

         Cluster registry integrity check succeeded
You can have at most two OCR devices (the OCR itself and its single mirror) in a cluster. Adding an extra mirror gives you the error message below.

[root@node1-pub ~]# ocrconfig -replace ocrmirror /u02/ocfs2/ocr/OCRfile_2
PROT-21: Invalid parameter
[root@node1-pub ~]#
Add/Remove Votedisk file in Cluster:

Adding Votedisk:

Get the existing voting disks associated with the cluster. To be safe, bring the CRS stack down on all the nodes except the one from which you are going to add the voting disk.

(1) Stop CRS on all the nodes in the cluster but one.

[root@node2-pub ~]# crsctl stop crs

(2) Get the list of existing voting disks.

crsctl query css votedisk

[root@node1-pub ~]# crsctl query css votedisk
 0.     0    /u02/ocfs2/vote/VDFile_0
 1.     0    /u02/ocfs2/vote/VDFile_1
 2.     0    /u02/ocfs2/vote/VDFile_2
Located 3 voting disk(s).
(3) Back up the existing voting disks as the oracle user:

dd if=/u02/ocfs2/vote/VDFile_0 of=$ORACLE_BASE/bkp/vd/VDFile_0

[root@node1-pub ~]# su - oracle
[oracle@node1-pub ~]$ dd if=/u02/ocfs2/vote/VDFile_0 of=$ORACLE_BASE/bkp/vd/VDFile_0
41024+0 records in
41024+0 records out
[oracle@node1-pub ~]$
(4) Add an extra voting disk into the cluster.

If it is on OCFS, touch the file as the oracle user. On raw devices, initialize the raw device using the "dd" command.

touch /u02/ocfs2/vote/VDFile_3                     <<-- as oracle
crsctl add css votedisk /u02/ocfs2/vote/VDFile_3   <<-- as oracle
crsctl query css votedisk

[root@node1-pub ~]# su - oracle
[oracle@node1-pub ~]$ touch /u02/ocfs2/vote/VDFile_3
[oracle@node1-pub ~]$ crsctl add css votedisk /u02/ocfs2/vote/VDFile_3
Now formatting voting disk: /u02/ocfs2/vote/VDFile_3.
Successful addition of voting disk /u02/ocfs2/vote/VDFile_3.
(5) Confirm that the file has been added successfully (note the parameter is votedisk, not votedisks):

[root@node1-pub ~]# ls -l /u02/ocfs2/vote/VDFile_3
-rw-r----- 1 oracle oinstall 21004288 Oct 6 16:31 /u02/ocfs2/vote/VDFile_3
[root@node1-pub ~]# crsctl query css votedisks
Unknown parameter: votedisks
[root@node1-pub ~]# crsctl query css votedisk
 0.     0    /u02/ocfs2/vote/VDFile_0
 1.     0    /u02/ocfs2/vote/VDFile_1
 2.     0    /u02/ocfs2/vote/VDFile_2
 3.     0    /u02/ocfs2/vote/VDFile_3
Located 4 voting disk(s).
Removing Votedisk:

Removing a voting disk from the cluster is very simple. The command below removes the given voting disk from the cluster configuration.

crsctl delete css votedisk /u02/ocfs2/vote/VDFile_3

[root@node1-pub ~]# crsctl delete css votedisk /u02/ocfs2/vote/VDFile_3
Successful deletion of voting disk /u02/ocfs2/vote/VDFile_3.
[root@node1-pub ~]# crsctl query css votedisk
 0.     0    /u02/ocfs2/vote/VDFile_0
 1.     0    /u02/ocfs2/vote/VDFile_1
 2.     0    /u02/ocfs2/vote/VDFile_2
Located 3 voting disk(s).
[root@node1-pub ~]#
Oracle performs a physical backup of the OCR devices every 4 hours under the default backup directory $ORA_CRS_HOME/cdata/<CLUSTER_NAME>, and then rolls these forward into daily, weekly, and monthly backups. You can get the backup information by executing the command below.

ocrconfig -showbackup

[root@node1-pub ~]# ocrconfig -showbackup
node2-pub     2007/09/03 17:46:47     /u01/app/crs/cdata/test-crs/backup00.ocr
node2-pub     2007/09/03 13:46:45     /u01/app/crs/cdata/test-crs/backup01.ocr
node2-pub     2007/09/03 09:46:44     /u01/app/crs/cdata/test-crs/backup02.ocr
node2-pub     2007/09/03 01:46:39     /u01/app/crs/cdata/test-crs/day.ocr
node2-pub     2007/09/03 01:46:39     /u01/app/crs/cdata/test-crs/week.ocr
[root@node1-pub ~]#
Manually backing up the OCR

ocrconfig -manualbackup                          <<-- physical backup of OCR

The above command backs up the OCR under the default backup directory. You can also export the contents of the OCR using the command below (logical backup).

ocrconfig -export /tmp/ocr_exp.dat -s online     <<-- logical backup of OCR
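To take the logical backup on a schedule, an entry in root's crontab can call the same export; a sketch, where the target directory /backup/ocr is an assumption and must already exist (note that % must be escaped in crontab entries):

# root's crontab: nightly logical OCR export at 2 AM (assumed path)
0 2 * * * /u01/app/crs/bin/ocrconfig -export /backup/ocr/ocr_exp_`date +\%Y\%m\%d`.dat -s online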
Restoring the OCR

The command below is used to restore the OCR from a physical backup. Shut down CRS on all nodes first.

ocrconfig -restore <file name>
Locate the available backups:

[root@node1-pub ~]# ocrconfig -showbackup
node2-pub     2007/09/03 17:46:47     /u01/app/crs/cdata/test-crs/backup00.ocr
node2-pub     2007/09/03 13:46:45     /u01/app/crs/cdata/test-crs/backup01.ocr
node2-pub     2007/09/03 09:46:44     /u01/app/crs/cdata/test-crs/backup02.ocr
node2-pub     2007/09/03 01:46:39     /u01/app/crs/cdata/test-crs/day.ocr
node2-pub     2007/09/03 01:46:39     /u01/app/crs/cdata/test-crs/week.ocr
node1-pub     2007/10/07 13:50:41     /u01/app/crs/cdata/test-crs/backup_20071007_135041.ocr
Perform the restore from a previous backup:

[root@node2-pub ~]# ocrconfig -restore /u01/app/crs/cdata/test-crs/week.ocr

The logical backup of the OCR (taken using the export option) can be imported using the command below.

ocrconfig -import /tmp/ocr_exp.dat
Restoring Votedisks:

· Shut down CRS on all the nodes in the cluster.
· Locate the current location of the votedisks.
· Restore each of the votedisks using the "dd" command from the previous good backup of the votedisk taken with the same "dd" command.
· Start CRS on all the nodes.

crsctl stop crs
crsctl query css votedisk
dd if=<backup of Votedisk> of=<Votedisk file>    <<-- do this for all the votedisks
crsctl start crs
Changing the Public IP and VIP: the current network configuration and the values it is being changed to are shown below.

                      Current Config           Changed to
Node 1:
  Public IP:          216.160.37.154           192.168.10.11
  VIP:                216.160.37.153           192.168.10.111
  Subnet:             216.160.37.159           192.168.10.0
  Netmask:            255.255.255.248          255.255.255.0
  Interface used:     eth0                     eth0
  Hostname:           node1-pub.hingu.net      node1-pub.hingu.net

Node 2:
  Public IP:          216.160.37.156           192.168.10.22
  VIP:                216.160.37.157           192.168.10.222
  Subnet:             216.160.37.159           192.168.10.0
  Netmask:            255.255.255.248          255.255.255.0
  Interface used:     eth0                     eth0
  Hostname:           node2-pub.hingu.net      node2-pub.hingu.net
(A) Take the services, database, ASM instances, and nodeapps down on both the nodes in the cluster. Also disable the nodeapps, ASM, and database instances to prevent them from restarting in case a node gets rebooted during this process.

srvctl stop service -d test
srvctl stop database -d test
srvctl stop asm -n node1-pub
srvctl stop asm -n node2-pub
srvctl stop nodeapps -n node1-pub,node2-pub
srvctl disable instance -d test -i test1,test2
srvctl disable asm -n node1-pub
srvctl disable asm -n node2-pub
srvctl disable nodeapps -n node1-pub
srvctl disable nodeapps -n node2-pub
(B) Modify /etc/hosts and/or DNS, and ifcfg-eth0 (on the local node), with the new IP values on all the nodes; a sample hosts fragment is sketched below.
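The VIP hostnames in this fragment are illustrative assumptions and should match the names your listeners and clients actually use:

# Public
192.168.10.11     node1-pub.hingu.net   node1-pub
192.168.10.22     node2-pub.hingu.net   node2-pub
# Virtual (VIP)
192.168.10.111    node1-vip.hingu.net   node1-vip
192.168.10.222    node2-vip.hingu.net   node2-vip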
(C) Restart the specific network interface in order to use the new IP.
ifconfig eth0 down
ifconfig eth0 up
Or, you can restart the entire network service, as shown below. CAUTION: on NAS, restarting the entire network may cause the node to reboot.
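On RHEL-style systems, that full network restart would typically be:

service network restart    <<-- restarts all interfaces; see the caution above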
(D) Update the OCR with the new public IP information.

In the case of the public IP, you have to delete the interface first and then add it back with the new IP address. As the oracle user, issue the commands below:

oifcfg delif -global eth0
oifcfg setif -global eth0/192.168.10.0:public
(E) Update the OCR with the new virtual IP.

The virtual IP is part of the nodeapps, so you can modify the nodeapps to update the virtual IP information. As the privileged user (root), issue the commands below:

srvctl modify nodeapps -n node1-pub -A 192.168.10.111/255.255.255.0/eth0   <-- for Node 1
srvctl modify nodeapps -n node2-pub -A 192.168.10.222/255.255.255.0/eth0   <-- for Node 2
(F) Enable the nodeapps, ASM, and database instances on all the nodes.

srvctl enable instance -d test -i test1,test2
srvctl enable asm -n node1-pub
srvctl enable asm -n node2-pub
srvctl enable nodeapps -n node1-pub
srvctl enable nodeapps -n node2-pub
(G) Update the listener.ora file on each node with the correct IP addresses in case it uses IP addresses instead of hostnames, for example as sketched below.
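A listener.ora entry pinned to the new addresses might look like the fragment below; the listener name and port are illustrative assumptions:

LISTENER_NODE1 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.10.111)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.10.11)(PORT = 1521))
    )
  )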
(H) Restart the nodeapps, ASM, and database instances.

srvctl start nodeapps -n node1-pub
srvctl start nodeapps -n node2-pub
srvctl start asm -n node1-pub
srvctl start asm -n node2-pub
srvctl start database -d test
srvctl checkup:

Dear Readers,

I have given below some of the important health checks that can be done using SRVCTL (Server Control) in 11gR2 RAC.
[oracle@rac2 ~]$ srvctl status database -d RAC
Instance RAC1 is running on node rac1
Instance RAC2 is running on node rac2

[oracle@rac2 ~]$ srvctl status listener
Listener LISTENER is enabled
Listener LISTENER is running on node(s): rac1,rac2

[oracle@rac2 ~]$ srvctl status vip -i RAC1
VIP rac1-vip is enabled
VIP rac1-vip is running on node: rac1

[oracle@rac2 ~]$ srvctl status vip -i RAC2
VIP rac2-vip is enabled
VIP rac2-vip is running on node: rac2

[oracle@rac2 ~]$ srvctl status scan -n rac1,rac2
PRKO-2002 : Invalid command line option: -n

[oracle@rac2 ~]$ srvctl status scan
SCAN VIP scan1 is enabled
SCAN VIP scan1 is running on node rac2
SCAN VIP scan2 is enabled
SCAN VIP scan2 is running on node rac1
SCAN VIP scan3 is enabled
SCAN VIP scan3 is running on node rac1

[oracle@rac2 ~]$ srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node rac2
SCAN Listener LISTENER_SCAN2 is enabled
SCAN listener LISTENER_SCAN2 is running on node rac1
SCAN Listener LISTENER_SCAN3 is enabled
SCAN listener LISTENER_SCAN3 is running on node rac1

[oracle@rac2 ~]$ srvctl status nodeapps
VIP rac1-vip is enabled
VIP rac1-vip is running on node: rac1
VIP rac2-vip is enabled
VIP rac2-vip is running on node: rac2
Network is enabled
Network is running on node: rac1
Network is running on node: rac2
GSD is disabled
GSD is not running on node: rac1
GSD is not running on node: rac2
ONS is enabled
ONS daemon is running on node: rac1
ONS daemon is running on node: rac2
eONS is enabled
eONS daemon is running on node: rac1
eONS daemon is running on node: rac2

[oracle@rac2 ~]$ srvctl status server -n rac1,rac2
Server name: rac1
Server state: ONLINE
Server name: rac2
Server state: ONLINE

[oracle@rac2 ~]$ srvctl config database -d RAC
Database unique name: RAC
Database name: RAC
Oracle home: /u01/app/oracle/product/11.2.0/db_1
Oracle user: oracle
Spfile: +RAC_DATADG/RAC/spfileRAC.ora
Domain: xxxxx
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: RAC
Database instances: RAC1,RAC2
Disk Groups: RAC_DATADG,RAC_RECODG
Services:
Database is administrator managed

[oracle@rac2 ~]$ srvctl status diskgroup -g RAC_DATADG
Disk Group RAC_DATADG is running on rac1,rac2
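To run these checks in one pass, the individual srvctl commands can be wrapped in a small script. A minimal sketch, assuming the Grid/RDBMS environment (ORACLE_HOME, PATH) is already set for the oracle user and that the database unique name is RAC:

#!/bin/bash
# quick RAC health sweep using srvctl; the DB unique name is an assumption
DB=RAC
srvctl status database -d $DB
srvctl status listener
srvctl status scan
srvctl status scan_listener
srvctl status nodeapps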
cheers,
Vivek