
Monday, 13 September 2021

RESTORE OCR AND VOTEDISK - ORACLE RAC 12c

1) Let’s corrupt the disk using the command below:

[root@rico1 ~]# dd if=/dev/zero of=/dev/asm_grid bs=1M count=256
256+0 records in
256+0 records out
268435456 bytes (268 MB) copied, 0.521837 s, 514 MB/s
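
To rehearse this step without touching a real ASM device, the same kind of dd call can be pointed at a scratch file. This is only a sketch: the helper name zero_head, the mktemp scratch file and the 4 MiB size are assumptions for illustration, and status=none assumes GNU dd.

```shell
#!/bin/bash
# Rehearse the overwrite on a scratch file instead of a real ASM device.
# zero_head is a hypothetical helper: it zeroes the first N MiB of a file in place.
zero_head() {
  local target=$1 mib=$2
  # conv=notrunc keeps the rest of the file, which is what happens on a block device
  dd if=/dev/zero of="$target" bs=1M count="$mib" conv=notrunc status=none
}

scratch=$(mktemp)
# Fill 4 MiB with random bytes to stand in for the disk contents
head -c $((4 * 1024 * 1024)) /dev/urandom > "$scratch"
zero_head "$scratch" 1
# Count the non-zero bytes left in the first MiB; none should remain
head -c $((1024 * 1024)) "$scratch" | tr -d '\0' | wc -c
```

On the real device the command in step 1 overwrites the first 256 MB, which includes the ASM disk header.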

2) After this operation, the resources managed by OHAS end up in a state like this:
[root@rico1 ~]# crsctl stat res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               Instance Shutdown,ST
                                                             ABLE
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE                               STABLE
ora.crf
      1        ONLINE  OFFLINE                               STABLE
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  OFFLINE      rico1                    STARTING
ora.cssdmonitor
      1        ONLINE  ONLINE       rico1                    STABLE
ora.ctssd
      1        ONLINE  OFFLINE                               STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE rico1                    STABLE
ora.gipcd
      1        ONLINE  ONLINE       rico1                    STABLE
ora.gpnpd
      1        ONLINE  ONLINE       rico1                    STABLE
ora.mdnsd
      1        ONLINE  ONLINE       rico1                    STABLE
ora.storage
      1        ONLINE  OFFLINE                               STABLE
--------------------------------------------------------------------------------
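
A long status table like the one above can be reduced to just the resources that should be running but are not. The following is a minimal sketch; offline_resources is a hypothetical helper name, and it assumes the column layout shown above for the -init output (instance number, Target, State).

```shell
#!/bin/bash
# offline_resources is a hypothetical helper: it reads the output of
# `crsctl stat res -t -init` on stdin and prints every resource whose
# Target is ONLINE but whose State is OFFLINE.
offline_resources() {
  awk '
    /^ora\./ { name = $1; next }    # a resource name line
    # a status line: $1 = instance number, $2 = Target, $3 = State
    name != "" && $2 == "ONLINE" && $3 == "OFFLINE" { print name }
  '
}
```

Usage on a cluster node would look like `crsctl stat res -t -init | offline_resources`.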

3) The cssd service cannot start, because it finds no voting files:

[root@rico1 ~]# tail -10 /u01/app/oracle/diag/crs/rico1/crs/trace/ocssd.trc
2016-06-10 10:28:32.331227 :    CSSD:990865152: clssnmvDiskVerify: Successful discovery of 0 disks
2016-06-10 10:28:32.331229 :    CSSD:990865152: clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2016-06-10 10:28:32.331231 :    CSSD:990865152: clssnmvFindInitialConfigs: No voting files found
2016-06-10 10:28:32.331302 :    CSSD:990865152: (:CSSNM00070:)clssnmCompleteInitVFDiscovery
                                                 : Voting file not found. Retrying discovery in 15 seconds
2016-06-10 10:28:33.270863 :    CSSD:1279616768: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2016-06-10 10:28:33.270876 :    CSSD:1279616768: clsssc_CLSFAInit_CB: clsfa fencing not ready yet
2016-06-10 10:28:34.271252 :    CSSD:1279616768: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization


4) OK, so let’s try to stop the cluster services:

[root@rico1 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rico1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rico1'
CRS-2677: Stop of 'ora.mdnsd' on 'rico1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rico1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rico1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rico1'
CRS-2677: Stop of 'ora.gipcd' on 'rico1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rico1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rico1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rico1' has completed
CRS-4133: Oracle High Availability Services has been stopped.

5) Now we have to start CRS in exclusive mode, without starting crsd, and begin the repair:
[root@rico1 ~]# crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.evmd' on 'rico1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'rico1'
CRS-2676: Start of 'ora.mdnsd' on 'rico1' succeeded
CRS-2676: Start of 'ora.evmd' on 'rico1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rico1'
CRS-2676: Start of 'ora.gpnpd' on 'rico1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rico1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rico1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rico1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rico1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rico1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rico1'
CRS-2676: Start of 'ora.diskmon' on 'rico1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rico1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rico1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rico1'
CRS-2676: Start of 'ora.ctssd' on 'rico1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rico1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rico1'
CRS-2676: Start of 'ora.asm' on 'rico1' succeeded
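
The start messages come in pairs: CRS-2672 announces an attempt and CRS-2676 confirms it. A saved copy of the output can be checked for attempts that were never confirmed. This is a sketch only; unconfirmed_starts is a hypothetical helper name, and it relies solely on the two message codes visible in the output above.

```shell
#!/bin/bash
# unconfirmed_starts is a hypothetical helper: it reads `crsctl start crs`
# output on stdin and prints resources that have a CRS-2672 (attempting to
# start) line but no matching CRS-2676 (start succeeded) line.
unconfirmed_starts() {
  # Split on single quotes so that $2 is the resource name
  awk -F"'" '
    /^CRS-2672:/ { pending[$2] = 1 }
    /^CRS-2676:/ { delete pending[$2] }
    END { for (r in pending) print r }
  ' | sort
}
```

For the output in step 5 the helper prints nothing, since every attempted start was confirmed.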

6) Of course KFED will not help in this situation 🙂 The overwrite also wiped the backup copy of the disk header that kfed repair relies on:
[root@rico1 ~]# kfed repair /dev/asm_grid
KFED-00320: Invalid block num1 = [0], num2 = [1], error = [endian_kfbh]

7) So now we have to recreate the GRID diskgroup, the ASM spfile and the password file. First set the ASM parameters and check the disk headers:
SQL> alter system
  2  set asm_diskstring='/dev/asm*';

SQL> ed
Wrote file afiedt.buf

  1  alter system
  2* set asm_diskgroups='GRID','DATA'
SQL> /

System altered.

SQL> ;
  1* select path, header_status from v$asm_disk
SQL> /

PATH			       HEADER_STATU
------------------------------ ------------
/dev/asm_data		       MEMBER
/dev/asm_grid		       CANDIDATE
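
The query shows the wiped disk as CANDIDATE while the intact data disk is still MEMBER. A script could check this before recreating the diskgroup. The helper below is a sketch; disk_header_status is a hypothetical name, and it assumes the two-column PATH / HEADER_STATUS output shown above has been captured as text.

```shell
#!/bin/bash
# disk_header_status is a hypothetical helper: it reads the PATH / HEADER_STATUS
# listing from v$asm_disk on stdin and prints the header status of one path.
disk_header_status() {
  local path=$1
  # $1 is the disk path, $2 the header status
  awk -v p="$path" '$1 == p { print $2 }'
}
```

With the listing above on stdin, `disk_header_status /dev/asm_grid` prints CANDIDATE.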

8) Let’s recreate our GRID diskgroup:

SQL> ed
Wrote file afiedt.buf

  1  create diskgroup grid
  2  external redundancy
  3  disk '/dev/asm_grid'
  4  attribute 'compatible.asm'='12.1.0.2',
  5*		'compatible.rdbms'='12.1.0.2'
SQL> /

Diskgroup created.

9) Now we have to recreate the SPFILE for ASM. The first step is to create a simple pfile:

[oracle@rico1 ~]$ cd $ORACLE_HOME/dbs
[oracle@rico1 dbs]$ vim init+ASM1.ora
[oracle@rico1 dbs]$ cat !$
cat init+ASM1.ora
*.asm_diskgroups='GRID'
*.asm_diskgroups='DATA'
*.asm_diskstring='/dev/asm*'

Next we can create the spfile from it:
SQL> create spfile='+GRID' from pfile;

File created.

SQL> !rm init+ASM1.ora

10) And the password file:
[oracle@rico1 dbs]$ orapwd file=+GRID password=oracle asm=yes

11) We are now ready to restore the OCR file – remember to restore the newest one:

[root@rico1 ~]# ocrconfig -restore /u01/app/12.1.0/grid/cdata/rico-cluster/backup_20160610_101746.ocr
[root@rico1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
	 Version                  :          4
	 Total space (kbytes)     :     409568
	 Used space (kbytes)      :       1460
	 Available space (kbytes) :     408108
	 ID                       :  115130541
	 Device/File Name         :      +GRID
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

	 Cluster registry integrity check succeeded

	 Logical corruption check succeeded
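
Step 11 says to restore the newest backup. One way to pick it is to compare modification times of the .ocr files in the backup directory. This is a minimal sketch; newest_ocr_backup is a hypothetical helper name, and it assumes the files' modification times reflect when each backup was taken.

```shell
#!/bin/bash
# newest_ocr_backup is a hypothetical helper: it prints the most recently
# modified *.ocr file in the given directory and returns non-zero if none exist.
newest_ocr_backup() {
  local dir=$1 newest= f
  for f in "$dir"/*.ocr; do
    [ -e "$f" ] || continue          # the glob did not match anything
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest=$f
    fi
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

For example, `newest_ocr_backup /u01/app/12.1.0/grid/cdata/rico-cluster` prints a path that can then be passed to ocrconfig -restore.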

12) Now we can create a new voting disk:
[root@rico1 ~]# crsctl query css votedisk
Located 0 voting disk(s).
[root@rico1 ~]# crsctl replace votedisk +GRID
Successful addition of voting disk 62a6bea00e4e4f01bf3ed09c345eedba.
Successfully replaced voting disk group with +GRID.
CRS-4266: Voting file(s) successfully replaced
[root@rico1 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   62a6bea00e4e4f01bf3ed09c345eedba (/dev/asm_grid) [GRID]
Located 1 voting disk(s).
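
The before and after check in step 12 can also be scripted by counting the ONLINE rows in the query output. The helper below is only a sketch; online_votedisk_count is a hypothetical name, and it assumes the row layout shown above (row number, state, file id, file name, disk group).

```shell
#!/bin/bash
# online_votedisk_count is a hypothetical helper: it reads the output of
# `crsctl query css votedisk` on stdin and prints the number of ONLINE rows.
online_votedisk_count() {
  # Data rows start with a row number such as "1." followed by the state
  awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}
```

Usage on a cluster node would look like `crsctl query css votedisk | online_votedisk_count`.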

13) Everything looks fine now, so it’s time to stop CRS:
[root@rico1 ~]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rico1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rico1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rico1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rico1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rico1'
CRS-2677: Stop of 'ora.evmd' on 'rico1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rico1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rico1'
CRS-2677: Stop of 'ora.mdnsd' on 'rico1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rico1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rico1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rico1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rico1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rico1'
CRS-2677: Stop of 'ora.cssd' on 'rico1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rico1'
CRS-2677: Stop of 'ora.gipcd' on 'rico1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rico1' has completed
CRS-4133: Oracle High Availability Services has been stopped.

14) And start it in normal mode:
[root@rico1 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

15) And we’re done 🙂

[root@rico1 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       rico1                    STABLE
               ONLINE  ONLINE       rico2                    STABLE
ora.GRID.dg
               ONLINE  ONLINE       rico1                    STABLE
               ONLINE  ONLINE       rico2                    STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       rico1                    STABLE
               ONLINE  ONLINE       rico2                    STABLE
ora.asm
               ONLINE  ONLINE       rico1                    Started,STABLE
               ONLINE  ONLINE       rico2                    Started,STABLE
ora.net1.network
               ONLINE  ONLINE       rico1                    STABLE
               ONLINE  ONLINE       rico2                    STABLE
ora.ons
               ONLINE  ONLINE       rico1                    STABLE
               ONLINE  ONLINE       rico2                    STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rico2                    STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       rico1                    STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       rico1                    STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       rico2                    169.254.26.98 10.0.0
                                                             .12,STABLE
ora.cvu
      1        ONLINE  ONLINE       rico1                    STABLE
ora.dupa.db
      1        ONLINE  ONLINE       rico1                    Open,STABLE
      2        ONLINE  ONLINE       rico2                    Open,STABLE
ora.mgmtdb
      1        ONLINE  OFFLINE                               STABLE
ora.oc4j
      1        ONLINE  ONLINE       rico1                    STABLE
ora.rico1.vip
      1        ONLINE  ONLINE       rico1                    STABLE
ora.rico2.vip
      1        ONLINE  ONLINE       rico2                    STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       rico2                    STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       rico1                    STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       rico1                    STABLE
--------------------------------------------------------------------------------

16) The last step is to recreate -MGMTDB, which is shown OFFLINE above because its files lived in the lost diskgroup:
[root@rico1 ~]# srvctl remove mgmtdb
Remove the database _mgmtdb? (y/[n]) y
[root@rico1 ~]# su - oracle
[oracle@rico1 ~]$ . oraenv
ORACLE_SID = [oracle] ? +ASM1
The Oracle base has been set to /u01/app/oracle
[oracle@rico1 ~]$ export GI_HOME=$ORACLE_HOME

[oracle@rico1 ~]$ dbca -silent -createDatabase -sid -MGMTDB \
-createAsContainerDatabase true -templateName MGMTSeed_Database.dbc \
-gdbName _mgmtdb -storageType ASM -diskGroupName +grid \
-datafileJarLocation $GI_HOME/assistants/dbca/templates \
-characterset AL32UTF8 -autoGeneratePasswords -skipUserTemplateCheck

Registering database with Oracle Grid Infrastructure
5% complete
Copying database files
7% complete
9% complete
16% complete
23% complete
30% complete
41% complete
Creating and starting Oracle instance
43% complete
48% complete
49% complete
50% complete
55% complete
60% complete
61% complete
64% complete
Completing Database Creation
68% complete
79% complete
89% complete
100% complete
Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/_mgmtdb/_mgmtdb3.log" 
for further details.

I hope you won’t have to use this procedure in real life 🙂

