As you know, Oracle Grid Infrastructure needs several internal components to run: the voting disks, the Oracle Cluster Registry (OCR) and, since 12c, the Grid Infrastructure Management Repository (GIMR). All of these are stored together in a single ASM diskgroup, at least if you did not separate them later on.
But what happens if this diskgroup gets lost? The cluster stack will not come up because some very fundamental resources are missing. I tested this case a couple of years ago when 11.2 came out and all the cluster files were moved into an ASM diskgroup. I needed to test it again with 12c because my freshly installed VM lost the shared disk that I used for the OCR diskgroup, so I was forced to recover my installation.
Be careful with the OS users you use to do things. Some steps need root privileges, other steps are executed as the Grid Infrastructure owner (oracle in my case). I included the prompt which reflects the user I used to perform the particular step.
1. Status
The cluster stack does not come up since there are no voting disks anymore.
Cluster alert.log shows:
2015-05-04 10:32:48.255 [OCSSD(31416)]CRS-1714: Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/oracle/diag/crs/oel6u4/crs/trace/ocssd.trc
And the internal processes look like this:
[root@oel6u4 ~]# crsctl stat res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               STABLE
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE                               STABLE
ora.crf
      1        ONLINE  OFFLINE                               STABLE
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  OFFLINE      oel6u4                   STARTING
ora.cssdmonitor
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.ctssd
      1        ONLINE  OFFLINE                               STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE oel6u4                   STABLE
ora.gipcd
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.gpnpd
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.mdnsd
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.storage
      1        ONLINE  OFFLINE                               STABLE
--------------------------------------------------------------------------------
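To get a quick list of what is actually down, this kind of output can be filtered. The following is only a sketch: the function name offline_resources is made up for illustration, and it assumes the tabular layout of “crsctl stat res -t -init”, where each resource name line is followed by a line holding instance number, target and state.

```shell
# List resources whose target is ONLINE but whose state is something else.
# Reads "crsctl stat res -t -init" output on stdin.
# Hypothetical helper, not part of Grid Infrastructure.
offline_resources() {
    awk '
        # a line starting with "ora." carries the resource name
        /^ora\./ { name = $1; next }
        # instance lines: <number> <target> <state> [server] <details>
        $1 ~ /^[0-9]+$/ && $2 == "ONLINE" && $3 != "ONLINE" { print name, $3 }
    '
}

# Usage (as root):
#   crsctl stat res -t -init | offline_resources
```

Resources with target OFFLINE (like ora.diskmon here) are skipped on purpose, since they are not expected to run.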
2. Recreate OCR diskgroup
First I made my disk available again using ASMLib:
[root@oel6u4 ~]# oracleasm listdisks
DATA
[root@oel6u4 ~]# oracleasm createdisk OCR /dev/sdc1
Writing disk header: done
Instantiating disk: done
[root@oel6u4 ~]# oracleasm listdisks
DATA
OCR
Now I need to restore my ASM diskgroup, but that requires a running ASM instance. So I stop the cluster stack and start it again in exclusive mode. By the way, “crsctl stop crs -f” did not finish, so I disabled the cluster stack by issuing “crsctl disable has” and rebooted.
[root@oel6u4 ~]# crsctl enable has
CRS-4622: Oracle High Availability Services autostart is enabled.
[root@oel6u4 ~]# crsctl start crs -excl
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.evmd' on 'oel6u4'
CRS-2672: Attempting to start 'ora.mdnsd' on 'oel6u4'
CRS-2676: Start of 'ora.evmd' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'oel6u4'
CRS-2676: Start of 'ora.gpnpd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'oel6u4'
CRS-2672: Attempting to start 'ora.gipcd' on 'oel6u4'
CRS-2676: Start of 'ora.cssdmonitor' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.gipcd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'oel6u4'
CRS-2672: Attempting to start 'ora.diskmon' on 'oel6u4'
CRS-2676: Start of 'ora.diskmon' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.cssd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.crf' on 'oel6u4'
CRS-2672: Attempting to start 'ora.ctssd' on 'oel6u4'
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'oel6u4'
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'oel6u4'
CRS-2676: Start of 'ora.crf' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.ctssd' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.drivers.acfs' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'oel6u4'
CRS-2676: Start of 'ora.asm' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'oel6u4'
diskgroup OCR not mounted ()
CRS-5017: The resource action "ora.storage start" encountered the following error:
Storage agent start action aborted. For details refer to "(:CLSN00107:)" in "/u01/app/oracle/diag/crs/oel6u4/crs/trace/ohasd_orarootagent_root.trc".
CRS-2674: Start of 'ora.storage' on 'oel6u4' failed
CRS-2679: Attempting to clean 'ora.storage' on 'oel6u4'
CRS-2681: Clean of 'ora.storage' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'oel6u4'
CRS-2677: Stop of 'ora.asm' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'oel6u4'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'oel6u4'
CRS-2677: Stop of 'ora.drivers.acfs' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'oel6u4'
CRS-2677: Stop of 'ora.ctssd' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'oel6u4'
CRS-2677: Stop of 'ora.crf' on 'oel6u4' succeeded
CRS-4000: Command Start failed, or completed with errors.
As you can see, the startup fails because “ora.storage” is not able to locate the OCR diskgroup. In practice this leaves a window of about 10 minutes, while “ora.storage” is starting, in which the diskgroup can be created.
If I had made a backup of my ASM diskgroup, I could have used that. But I have not, so I create my OCR diskgroup from scratch. Start CRS again and then do the following from a second session:
[root@oel6u4 ~]# cat ocr.dg
<dg name="ocr" redundancy="external">
  <dsk string="/dev/oracleasm/disks/OCR" quorum="QUORUM" />
  <a name="compatible.asm" value="12.1.0.2.0" />
  <a name="compatible.rdbms" value="12.1.0.2.0" />
</dg>
[root@oel6u4 ~]# asmcmd mkdg ~/ocr.dg
[root@oel6u4 ~]# asmcmd lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576     12287    12234                0           12234              0             N  OCR/
So far, so good. The OCR diskgroup is there and it is mounted. At this point “ora.storage” manages to come up successfully.
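Because the diskgroup has to appear while “ora.storage” is still trying to start, it can help to poll for it from the second session instead of checking by hand. This is a minimal sketch under my own assumptions: the function name wait_for_diskgroup and the default timeout and interval are invented for illustration, and it relies on “asmcmd lsdg” printing the state in the first column and the diskgroup name with a trailing slash in the last one.

```shell
# Poll "asmcmd lsdg" until the given diskgroup shows up as MOUNTED.
# Usage: wait_for_diskgroup NAME [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 once mounted, 1 if the timeout expires first.
# Hypothetical helper; the defaults (600s, 10s) are assumptions.
wait_for_diskgroup() {
    dg=$1
    timeout=${2:-600}
    interval=${3:-10}
    waited=0
    while [ "$waited" -le "$timeout" ]; do
        # match rows like: MOUNTED EXTERN N ... OCR/
        if asmcmd lsdg 2>/dev/null |
            awk -v dg="$dg/" '$1 == "MOUNTED" && $NF == dg { found = 1 } END { exit !found }'
        then
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 1
}

# Usage (ASM environment set):
#   wait_for_diskgroup OCR 600 10 && echo "OCR is mounted"
```

The interval should be at least 1 second, otherwise the loop never advances towards the timeout.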
3. Restore OCR
The next step is to restore the OCR from a backup. Fortunately, the clusterware creates backups of the OCR by itself right from the beginning.
[root@oel6u4 ~]# ocrconfig -showbackup
PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy

oel6u4     2015/05/02 18:33:41     /u01/app/grid/12.1.0.2/cdata/mycluster/backup00.ocr     0
oel6u4     2015/05/02 14:33:17     /u01/app/grid/12.1.0.2/cdata/mycluster/backup01.ocr     0
oel6u4     2015/05/02 14:33:17     /u01/app/grid/12.1.0.2/cdata/mycluster/day.ocr     0
oel6u4     2015/05/02 14:33:17     /u01/app/grid/12.1.0.2/cdata/mycluster/week.ocr     0
PROT-25: Manual backups for the Oracle Cluster Registry are not available
Just choose the most recent backup and use it to restore the contents of OCR.
[root@oel6u4 ~]# ocrconfig -restore /u01/app/grid/12.1.0.2/cdata/mycluster/backup00.ocr
[root@oel6u4 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1348
         Available space (kbytes) :     408220
         ID                       :  768712202
         Device/File Name         :       +OCR
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded
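Picking the most recent backup can also be done mechanically. The following is a sketch, not an official tool: the function name latest_ocr_backup is made up, and it assumes the “ocrconfig -showbackup” row layout of node name, date, time and path, with no spaces in the path.

```shell
# Print the path of the newest backup listed by "ocrconfig -showbackup".
# Reads the command output on stdin; PROT-* messages and blank lines are ignored.
# Hypothetical helper; assumes rows of the form: node YYYY/MM/DD HH:MM:SS /path ...
latest_ocr_backup() {
    awk '$2 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ && $4 ~ /^\// {
             print $2, $3, $4
         }' |
        sort -r |          # newest timestamp first (fixed-width, so text sort works)
        head -n 1 |
        cut -d' ' -f3      # keep only the path
}

# Usage (as root):
#   ocrconfig -restore "$(ocrconfig -showbackup | latest_ocr_backup)"
```

When several backups share the same timestamp, as backup01.ocr, day.ocr and week.ocr do in my listing, the tie is broken by path name only, which does not matter as long as one entry is strictly newer.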
4. Restore Voting Disk
Since the voting files are placed in ASM together with OCR, the OCR backup contains a copy of the voting file as well. All I need to do is start CRSD and recreate the voting file.
[root@oel6u4 ~]# crsctl start res ora.crsd -init
CRS-2672: Attempting to start 'ora.crsd' on 'oel6u4'
CRS-2676: Start of 'ora.crsd' on 'oel6u4' succeeded
[root@oel6u4 ~]# crsctl replace votedisk +OCR
CRS-4602: Failed 27 to add voting file d1c46046fd004f1abf98e3beb7905baa.
Failed to replace voting disk group with +OCR.
CRS-4000: Command Replace failed, or completed with errors.
Not good. The reason is that ASM does not have ASM_DISKSTRING configured. Actually, ASM does not have a single parameter configured, because it was using an spfile that was stored in the OCR diskgroup as well. That means there is no spfile anymore. As a quick fix I set the parameter in memory.
[oracle@oel6u4 ~]$ sqlplus / as sysasm

SQL*Plus: Release 12.1.0.2.0 Production on Mon May 4 11:20:31 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> show parameter diskstr

NAME                                 TYPE
------------------------------------ ---------------------------------
VALUE
------------------------------
asm_diskstring                       string

SQL> alter system set asm_diskstring='/dev/oracleasm/disks/*' scope=memory;

System altered.
With this small change I am now able to recreate the voting file.
[root@oel6u4 ~]# crsctl replace votedisk +OCR
Successful addition of voting disk c0cb172eb1d34f9abf04b37c883c9ddd.
Successfully replaced voting disk group with +OCR.
CRS-4266: Voting file(s) successfully replaced
[root@oel6u4 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   c0cb172eb1d34f9abf04b37c883c9ddd (/dev/oracleasm/disks/OCR) [OCR]
Located 1 voting disk(s).
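If this check should go into a script, the number of online voting files can be counted from the query output. Again only a sketch: count_online_votedisks is a name I made up, and it assumes the “crsctl query css votedisk” layout where each voting file row starts with a running number followed by the state.

```shell
# Count voting files reported as ONLINE by "crsctl query css votedisk".
# Reads the command output on stdin and prints a single number.
# Hypothetical helper, not part of Grid Infrastructure.
count_online_votedisks() {
    # rows look like: " 1. ONLINE <file universal id> (<path>) [<diskgroup>]"
    awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}

# Usage (as root):
#   [ "$(crsctl query css votedisk | count_online_votedisks)" -ge 1 ] || echo "no voting file online"
```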
5. Restore ASM spfile
This is easy. I don’t have a backup of my ASM spfile, so I recreate it from memory.
[oracle@oel6u4 ~]$ sqlplus / as sysasm

SQL*Plus: Release 12.1.0.2.0 Production on Mon May 4 11:27:16 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> create spfile='+OCR' from memory;

File created.
Doing so also updates the GPnP profile.
6. Restart Grid Infrastructure
I have everything restored that I need to start the clusterware in normal operation mode. Let’s see:
[root@oel6u4 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'oel6u4'
CRS-2673: Attempting to stop 'ora.crsd' on 'oel6u4'
CRS-2677: Stop of 'ora.crsd' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.evmd' on 'oel6u4'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'oel6u4'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'oel6u4'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'oel6u4'
CRS-2677: Stop of 'ora.drivers.acfs' on 'oel6u4' succeeded
CRS-2677: Stop of 'ora.evmd' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'oel6u4'
CRS-2673: Attempting to stop 'ora.ctssd' on 'oel6u4'
CRS-2673: Attempting to stop 'ora.storage' on 'oel6u4'
CRS-2677: Stop of 'ora.mdnsd' on 'oel6u4' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'oel6u4' succeeded
CRS-2677: Stop of 'ora.storage' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'oel6u4'
CRS-2677: Stop of 'ora.crf' on 'oel6u4' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'oel6u4' succeeded
CRS-2677: Stop of 'ora.asm' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'oel6u4'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'oel6u4'
CRS-2677: Stop of 'ora.cssd' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'oel6u4'
CRS-2677: Stop of 'ora.gipcd' on 'oel6u4' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'oel6u4' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@oel6u4 ~]# crsctl start has -wait
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2672: Attempting to start 'ora.mdnsd' on 'oel6u4'
CRS-2672: Attempting to start 'ora.evmd' on 'oel6u4'
CRS-2676: Start of 'ora.evmd' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'oel6u4'
CRS-2676: Start of 'ora.gpnpd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'oel6u4'
CRS-2676: Start of 'ora.gipcd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'oel6u4'
CRS-2676: Start of 'ora.cssdmonitor' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'oel6u4'
CRS-2672: Attempting to start 'ora.diskmon' on 'oel6u4'
CRS-2676: Start of 'ora.diskmon' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.cssd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'oel6u4'
CRS-2672: Attempting to start 'ora.ctssd' on 'oel6u4'
CRS-2676: Start of 'ora.ctssd' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'oel6u4'
CRS-2676: Start of 'ora.asm' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'oel6u4'
CRS-2676: Start of 'ora.storage' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.crf' on 'oel6u4'
CRS-2676: Start of 'ora.crf' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'oel6u4'
CRS-2676: Start of 'ora.crsd' on 'oel6u4' succeeded
CRS-6023: Starting Oracle Cluster Ready Services-managed resources
CRS-2664: Resource 'ora.OCR.dg' is already running on 'oel6u4'
CRS-6017: Processing resource auto-start for servers: oel6u4
CRS-2672: Attempting to start 'ora.net1.network' on 'oel6u4'
CRS-2672: Attempting to start 'ora.MGMTLSNR' on 'oel6u4'
CRS-2672: Attempting to start 'ora.proxy_advm' on 'oel6u4'
CRS-2672: Attempting to start 'ora.oc4j' on 'oel6u4'
CRS-2676: Start of 'ora.net1.network' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.cvu' on 'oel6u4'
CRS-2672: Attempting to start 'ora.oel6u4.vip' on 'oel6u4'
CRS-2672: Attempting to start 'ora.ons' on 'oel6u4'
CRS-2672: Attempting to start 'ora.scan1.vip' on 'oel6u4'
CRS-2676: Start of 'ora.cvu' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.oel6u4.vip' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.LISTENER.lsnr' on 'oel6u4'
CRS-2676: Start of 'ora.scan1.vip' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'oel6u4'
CRS-2676: Start of 'ora.ons' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.MGMTLSNR' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.LISTENER.lsnr' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.mgmtdb' on 'oel6u4'
CRS-2676: Start of 'ora.proxy_advm' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.DATA.dg' on 'oel6u4'
CRS-5017: The resource action "ora.mgmtdb start" encountered the following error:
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+OCR/_mgmtdb/spfile-MGMTDB.ora'
ORA-17503: ksfdopn:2 Failed to open file +OCR/_mgmtdb/spfile-MGMTDB.ora
ORA-15056: additional error message
ORA-17503: ksfdopn:2 Failed to open file +OCR/_mgmtdb/spfile-mgmtdb.ora
ORA-15173: entry '_mgmtdb' does not exist in directory '/'
ORA-06512: at line 4
. For details refer to "(:CLSN00107:)" in "/u01/app/oracle/diag/crs/oel6u4/crs/trace/crsd_oraagent_oracle.trc".
CRS-2674: Start of 'ora.mgmtdb' on 'oel6u4' failed
CRS-2679: Attempting to clean 'ora.mgmtdb' on 'oel6u4'
CRS-2681: Clean of 'ora.mgmtdb' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.DATA.dg' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.DATA.DATA.advm' on 'oel6u4'
CRS-2676: Start of 'ora.oc4j' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.DATA.DATA.advm' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.data.data.acfs' on 'oel6u4'
CRS-2676: Start of 'ora.data.data.acfs' on 'oel6u4' succeeded
===== Summary of resource auto-start failures follows =====
CRS-2807: Resource 'ora.mgmtdb' failed to start automatically.
CRS-6016: Resource auto-start has completed for server oel6u4
CRS-6024: Completed start of Oracle Cluster Ready Services-managed resources
CRS-4123: Oracle High Availability Services has been started.
As you can see, the GIMR (MGMTDB) is gone too. I will come back to that shortly. First, let’s see if all the other resources are running properly.
[root@oel6u4 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       oel6u4                   STABLE
ora.DATA.DATA.advm
               ONLINE  ONLINE       oel6u4                   Volume device /dev/a
                                                             sm/data-347 is onlin
                                                             e,STABLE
ora.DATA.dg
               ONLINE  ONLINE       oel6u4                   STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       oel6u4                   STABLE
ora.OCR.dg
               ONLINE  ONLINE       oel6u4                   STABLE
ora.data.data.acfs
               ONLINE  ONLINE       oel6u4                   mounted on /u02,STAB
                                                             LE
ora.net1.network
               ONLINE  ONLINE       oel6u4                   STABLE
ora.ons
               ONLINE  ONLINE       oel6u4                   STABLE
ora.proxy_advm
               ONLINE  ONLINE       oel6u4                   STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       oel6u4                   169.254.39.205 192.1
                                                             68.1.1,STABLE
ora.asm
      1        ONLINE  ONLINE       oel6u4                   STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.mgmtdb
      1        ONLINE  OFFLINE                               Instance Shutdown,ST
                                                             ABLE
ora.oc4j
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.oel6u4.vip
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       oel6u4                   STABLE
--------------------------------------------------------------------------------
Looks good so far.
7. Restore ASM password file
Since 12c the password file for ASM is stored inside ASM. Again, I have no backup so I need to create it from scratch.
[oracle@oel6u4 ~]$ orapwd file=/tmp/orapwasm password=Oracle-1 force=y
[oracle@oel6u4 ~]$ asmcmd pwcopy --asm /tmp/orapwasm +OCR/pwdasm
copying /tmp/orapwasm -> +OCR/pwdasm
[oracle@oel6u4 ~]$ asmcmd pwget --asm
+OCR/pwdasm
The “pwcopy” command updates the GPnP profile to reflect this.
8. Restore GIMR
There is no way to back up the GIMR; you have to recreate it. The Cluster Health Monitor (CHM) must not run during the recreation, so it has to be stopped and disabled. Then I need to remove the GIMR cluster resource.
[root@oel6u4 ~]# crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'oel6u4'
CRS-2677: Stop of 'ora.crf' on 'oel6u4' succeeded
[root@oel6u4 ~]# crsctl modify res ora.crf -attr ENABLED=0 -init
[oracle@oel6u4 ~]$ srvctl remove mgmtdb
Remove the database _mgmtdb? (y/[n]) y
Then use dbca from the Grid Infrastructure home to create GIMR. First the container database:
[oracle@oel6u4 ~]$ dbca -silent -createDatabase -sid -MGMTDB -createAsContainerDatabase true -templateName MGMTSeed_Database.dbc -gdbName _mgmtdb -storageType ASM -diskGroupName +OCR -datafileJarLocation $ORACLE_HOME/assistants/dbca/templates -characterset AL32UTF8 -autoGeneratePasswords -skipUserTemplateCheck
Registering database with Oracle Grid Infrastructure
5% complete
Copying database files
7% complete
9% complete
16% complete
23% complete
30% complete
37% complete
41% complete
Creating and starting Oracle instance
43% complete
48% complete
49% complete
50% complete
55% complete
60% complete
61% complete
64% complete
Completing Database Creation
68% complete
79% complete
89% complete
100% complete
Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/_mgmtdb/_mgmtdb2.log" for further details.
Second, the pluggable database:
[oracle@oel6u4 ~]$ dbca -silent -createPluggableDatabase -sourceDB -MGMTDB -pdbName mycluster -createPDBFrom RMANBACKUP -PDBBackUpfile $ORACLE_HOME/assistants/dbca/templates/mgmtseed_pdb.dfb -PDBMetadataFile $ORACLE_HOME/assistants/dbca/templates/mgmtseed_pdb.xml -createAsClone true -internalSkipGIHomeCheck
Creating Pluggable Database
4% complete
12% complete
21% complete
38% complete
55% complete
85% complete
Completing Pluggable Database Creation
100% complete
Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/_mgmtdb/mycluster/_mgmtdb1.log" for further details.
[oracle@oel6u4 ~]$ srvctl status mgmtdb
Database is enabled
Instance -MGMTDB is running on node oel6u4
[oracle@oel6u4 ~]$ mgmtca
[root@oel6u4 ~]# crsctl modify res ora.crf -attr ENABLED=1 -init
[root@oel6u4 ~]# crsctl start res ora.crf -init
CRS-2672: Attempting to start 'ora.crf' on 'oel6u4'
CRS-2676: Start of 'ora.crf' on 'oel6u4' succeeded
Let’s see if everything is running fine again:
[oracle@oel6u4 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       oel6u4                   STABLE
ora.DATA.DATA.advm
               ONLINE  ONLINE       oel6u4                   Volume device /dev/a
                                                             sm/data-347 is onlin
                                                             e,STABLE
ora.DATA.dg
               ONLINE  ONLINE       oel6u4                   STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       oel6u4                   STABLE
ora.OCR.dg
               ONLINE  ONLINE       oel6u4                   STABLE
ora.data.data.acfs
               ONLINE  ONLINE       oel6u4                   mounted on /u02,STAB
                                                             LE
ora.net1.network
               ONLINE  ONLINE       oel6u4                   STABLE
ora.ons
               ONLINE  ONLINE       oel6u4                   STABLE
ora.proxy_advm
               ONLINE  ONLINE       oel6u4                   STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       oel6u4                   169.254.39.205 192.1
                                                             68.1.1,STABLE
ora.asm
      1        ONLINE  ONLINE       oel6u4                   STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       oel6u4                   Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.oel6u4.vip
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       oel6u4                   STABLE
--------------------------------------------------------------------------------
Funny, my ASM now has 3 instances again. I had already changed that to “count=all”, but obviously the OCR backup was taken before that was done. I also had two databases in place, and both resources are missing for the same reason. But that’s not a major issue.
[oracle@oel6u4 ~]$ srvctl modify asm -count all
[oracle@oel6u4 ~]$ srvctl add database -db db11g -oraclehome /u01/app/oracle/product/11.2.0.4/db -dbtype RAC \
> -spfile /u02/app/oracle/oradata/db11g/spfiledb11g.ora -pwfile /u02/app/oracle/oradata/db11g/orapwdb11g \
> -serverpool oel6u4 -acfspath /u02
PRCR-1039 : Server pool ora.oel6u4 does not exist
No server pool, obviously. Server pools are also defined in OCR.
[oracle@oel6u4 ~]$ srvctl add srvpool -serverpool oel6u4 -min 1 -max 1 -category "hub"
[oracle@oel6u4 ~]$ srvctl add database -db cdb12c -oraclehome /u01/app/oracle/product/12.1.0.2/db -dbtype RAC \
> -spfile /u02/app/oracle/oradata/cdb12c/spfilecdb12c.ora -pwfile /u02/app/oracle/oradata/cdb12c/orapwcdb12c \
> -serverpool oel6u4 -acfspath /u02
[oracle@oel6u4 ~]$ srvctl add database -d db11g -o /u01/app/oracle/product/11.2.0.4/db -c RAC \
> -p /u02/app/oracle/oradata/db11g/spfiledb11g.ora -g oel6u4 -j /u02
9. Lessons learned -or- What one should back up
Usually I do metadata backups of all the ASM diskgroups as well as backups of the ASM spfile and password file, and of course periodic backups of OCR and OLR. But it was nice to see that it is possible to restore everything without any manual backups in place. What should one back up, and how?
- ASM metadata: asmcmd md_backup
- ASM spfile: asmcmd spcopy -or- create pfile from spfile
- ASM password file: asmcmd pwcopy
- OCR: configure OCR backup to external storage or copy manually
- OCR: do backups every time you add/delete/modify cluster resources
- OLR: not mentioned here since it is stored on disk, but important too
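The ASM part of this list could be wrapped into a small script. This is only a sketch under my own assumptions: the function name backup_asm_metadata and the file names are invented for illustration, it is meant to run as the Grid Infrastructure owner with the ASM environment set, and it assumes that “asmcmd spget” and “asmcmd pwget --asm” print nothing but the current file location. OCR and OLR backups need root and are not covered here.

```shell
# Back up ASM diskgroup metadata, the ASM spfile and the ASM password file
# into a directory on a regular file system.
# Usage: backup_asm_metadata /path/to/backup/dir
# Hypothetical helper; target file names are assumptions.
backup_asm_metadata() {
    dest=$1
    stamp=$(date +%Y%m%d_%H%M%S)
    mkdir -p "$dest" || return 1

    # metadata of all mounted diskgroups
    asmcmd md_backup "$dest/asm_md_$stamp.bkp" || return 1

    # copy the spfile from its current location inside ASM
    spfile=$(asmcmd spget) || return 1
    asmcmd spcopy "$spfile" "$dest/spfileASM_$stamp.ora" || return 1

    # copy the password file; without --asm the cluster configuration is not touched
    pwfile=$(asmcmd pwget --asm) || return 1
    asmcmd pwcopy "$pwfile" "$dest/orapwasm_$stamp" || return 1
}

# Usage:
#   backup_asm_metadata /u02/backup/asm
```

The function stops at the first failing step, so a missing spfile or password file is noticed instead of silently skipped.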
10. References
My Oracle Support
- How to restore ASM based OCR after complete loss of the CRS diskgroup on Linux/Unix systems (Doc ID 1062983.1)
- How to Restore ASM Password File if Lost ( ORA-01017 ORA-15077 ) (Doc ID 1644005.1)
- CRS-4256 CRS-4602 While Replacing Voting Disk (Doc ID 1475588.1)
- How to Move/Recreate GI Management Repository to Different Shared Storage (Diskgroup, CFS or NFS etc) (Doc ID 1589394.1)