As you know, Oracle Grid Infrastructure needs several internal components to run. There are voting disks, the cluster registry (OCR) and, since 12c, the Grid Infrastructure Management Repository (GIMR). All of these are stored together in one ASM diskgroup, at least if you did not separate them later on.
But what happens if this diskgroup gets lost? The cluster stack will not come up since some very fundamental resources are now missing. I tested this case a couple of years ago when 11.2 came out and all the cluster files were moved into an ASM diskgroup. I needed to test it again with 12c because my freshly installed VM lost its shared disk that I used for the OCR diskgroup. So I was forced to recover my installation.
Be careful with the OS users you use for the individual steps. Some steps need root privileges, other steps are executed as the Grid Infrastructure owner (oracle in my case). I included the prompt, which reflects the user I used to perform the particular step.
1. Status
The cluster stack does not come up since there are no voting disks anymore.
Cluster alert.log shows:
2015-05-04 10:32:48.255 [OCSSD(31416)]CRS-1714: Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/oracle/diag/crs/oel6u4/crs/trace/ocssd.trc
And the internal processes look like this:
[root@oel6u4 ~]# crsctl stat res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               STABLE
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE                               STABLE
ora.crf
      1        ONLINE  OFFLINE                               STABLE
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  OFFLINE      oel6u4                   STARTING
ora.cssdmonitor
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.ctssd
      1        ONLINE  OFFLINE                               STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE oel6u4                   STABLE
ora.gipcd
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.gpnpd
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.mdnsd
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.storage
      1        ONLINE  OFFLINE                               STABLE
--------------------------------------------------------------------------------
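If you want to script this check, the non-tabular form of the command (“crsctl stat res -init”, without “-t”) is easier to parse because it prints NAME=, TARGET= and STATE= lines per resource. The following is only a minimal sketch: the function name offline_init_resources is made up for illustration, and it assumes exactly that attribute layout.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Infrastructure).
# Reads "crsctl stat res -init" output (NAME=/TARGET=/STATE= attribute blocks)
# on stdin and prints every resource whose target is ONLINE but whose
# state is something other than ONLINE.
offline_init_resources() {
    awk -F= '
        $1 == "NAME"   { name = $2 }
        $1 == "TARGET" { target = $2 }
        $1 == "STATE"  {
            # STATE may read "ONLINE on <node>", so compare the first word only
            split($2, words, " ")
            if (target == "ONLINE" && words[1] != "ONLINE") print name
        }
    '
}
```

It would be used as root like this: crsctl stat res -init | offline_init_resources. Resources with an OFFLINE target, such as ora.diskmon above, are skipped on purpose.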
2. Recreate OCR diskgroup
First I made my disk available again using ASMLib:
[root@oel6u4 ~]# oracleasm listdisks
DATA
[root@oel6u4 ~]# oracleasm createdisk OCR /dev/sdc1
Writing disk header: done
Instantiating disk: done
[root@oel6u4 ~]# oracleasm listdisks
DATA
OCR
Now I need to restore my ASM diskgroup, but that requires a running ASM instance. So I stop the cluster stack and start it again in exclusive mode. By the way, “crsctl stop crs -f” did not finish, so I disabled the cluster stack by issuing “crsctl disable has” and rebooted.
[root@oel6u4 ~]# crsctl enable has
CRS-4622: Oracle High Availability Services autostart is enabled.
[root@oel6u4 ~]# crsctl start crs -excl
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.evmd' on 'oel6u4'
CRS-2672: Attempting to start 'ora.mdnsd' on 'oel6u4'
CRS-2676: Start of 'ora.evmd' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'oel6u4'
CRS-2676: Start of 'ora.gpnpd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'oel6u4'
CRS-2672: Attempting to start 'ora.gipcd' on 'oel6u4'
CRS-2676: Start of 'ora.cssdmonitor' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.gipcd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'oel6u4'
CRS-2672: Attempting to start 'ora.diskmon' on 'oel6u4'
CRS-2676: Start of 'ora.diskmon' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.cssd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.crf' on 'oel6u4'
CRS-2672: Attempting to start 'ora.ctssd' on 'oel6u4'
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'oel6u4'
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'oel6u4'
CRS-2676: Start of 'ora.crf' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.ctssd' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.drivers.acfs' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'oel6u4'
CRS-2676: Start of 'ora.asm' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'oel6u4'
diskgroup OCR not mounted ()
CRS-5017: The resource action "ora.storage start" encountered the following error:
Storage agent start action aborted. For details refer to "(:CLSN00107:)" in "/u01/app/oracle/diag/crs/oel6u4/crs/trace/ohasd_orarootagent_root.trc".
CRS-2674: Start of 'ora.storage' on 'oel6u4' failed
CRS-2679: Attempting to clean 'ora.storage' on 'oel6u4'
CRS-2681: Clean of 'ora.storage' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'oel6u4'
CRS-2677: Stop of 'ora.asm' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'oel6u4'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'oel6u4'
CRS-2677: Stop of 'ora.drivers.acfs' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'oel6u4'
CRS-2677: Stop of 'ora.ctssd' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'oel6u4'
CRS-2677: Stop of 'ora.crf' on 'oel6u4' succeeded
CRS-4000: Command Start failed, or completed with errors.
As you can see, the startup fails because “ora.storage” is not able to locate the OCR diskgroup. That means there is a time frame of about 10 minutes during the startup of “ora.storage” in which the diskgroup can be created.
If I had made a metadata backup of my ASM diskgroup, I could have used that. But I had not. That’s why I create my OCR diskgroup from scratch. Start the CRS in exclusive mode again and then do the following from a second session:
[root@oel6u4 ~]# cat ocr.dg
<dg name="ocr" redundancy="external">
  <dsk string="/dev/oracleasm/disks/OCR" quorum="QUORUM" />
  <a name="compatible.asm" value="12.1.0.2.0" />
  <a name="compatible.rdbms" value="12.1.0.2.0" />
</dg>
[root@oel6u4 ~]# asmcmd mkdg ~/ocr.dg
[root@oel6u4 ~]# asmcmd lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576     12287    12234                0           12234              0             N  OCR/
So far, so good. The OCR diskgroup is there and it is mounted. At this point “ora.storage” manages to come up successfully.
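Since the time frame is short, it can help to have the diskgroup definition ready before starting the stack. The following sketch just prints a file like the one above for a single disk with external redundancy. The function name make_dg_xml and its three arguments are my own invention, and the attributes are copied one to one from my ocr.dg, so treat it as a template rather than a reference for the mkdg XML format.

```shell
#!/bin/sh
# Hypothetical helper: prints an "asmcmd mkdg" configuration for a single-disk,
# external-redundancy diskgroup, mirroring the ocr.dg file shown above.
# Arguments: diskgroup name, disk path, value for compatible.asm/compatible.rdbms
make_dg_xml() {
    dg_name=$1
    disk_path=$2
    compat=$3
    cat <<EOF
<dg name="${dg_name}" redundancy="external">
  <dsk string="${disk_path}" quorum="QUORUM" />
  <a name="compatible.asm" value="${compat}" />
  <a name="compatible.rdbms" value="${compat}" />
</dg>
EOF
}
```

For example, make_dg_xml ocr /dev/oracleasm/disks/OCR 12.1.0.2.0 > ~/ocr.dg reproduces my file, which can then be passed to “asmcmd mkdg” as soon as ASM is up.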
3. Restore OCR
The next step is restoring the OCR from a backup. Fortunately the clusterware creates backups of the OCR by itself right from the beginning.
[root@oel6u4 ~]# ocrconfig -showbackup
PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy

oel6u4     2015/05/02 18:33:41     /u01/app/grid/12.1.0.2/cdata/mycluster/backup00.ocr     0

oel6u4     2015/05/02 14:33:17     /u01/app/grid/12.1.0.2/cdata/mycluster/backup01.ocr     0

oel6u4     2015/05/02 14:33:17     /u01/app/grid/12.1.0.2/cdata/mycluster/day.ocr     0

oel6u4     2015/05/02 14:33:17     /u01/app/grid/12.1.0.2/cdata/mycluster/week.ocr     0
PROT-25: Manual backups for the Oracle Cluster Registry are not available
Just choose the most recent backup and use it to restore the contents of OCR.
[root@oel6u4 ~]# ocrconfig -restore /u01/app/grid/12.1.0.2/cdata/mycluster/backup00.ocr

[root@oel6u4 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1348
         Available space (kbytes) :     408220
         ID                       :  768712202
         Device/File Name         :       +OCR
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded
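Picking the most recent backup can also be scripted. The sketch below assumes the line layout of “ocrconfig -showbackup” as shown above (node, date, time, path, number) and backup paths without spaces. The function name latest_ocr_backup is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "ocrconfig -showbackup" output on stdin and prints
# the path of the newest backup file.
# Assumed line layout: <node> <YYYY/MM/DD> <HH:MI:SS> <path> <number>
latest_ocr_backup() {
    awk '$2 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ && $4 ~ /\.ocr$/ {
             # date and time first, so a plain text sort orders by timestamp
             print $2, $3, $4
         }' |
    LC_ALL=C sort -r |
    head -n 1 |
    awk '{ print $3 }'
}
```

Run as root, ocrconfig -showbackup | latest_ocr_backup prints the file name to hand over to “ocrconfig -restore”. The PROT- messages are ignored because they do not carry a date in the second column.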
4. Restore Voting Disk
Since the voting files are placed in ASM together with OCR, the OCR backup contains a copy of the voting file as well. All I need to do is start CRSD and recreate the voting file.
[root@oel6u4 ~]# crsctl start res ora.crsd -init
CRS-2672: Attempting to start 'ora.crsd' on 'oel6u4'
CRS-2676: Start of 'ora.crsd' on 'oel6u4' succeeded
[root@oel6u4 ~]# crsctl replace votedisk +OCR
CRS-4602: Failed 27 to add voting file d1c46046fd004f1abf98e3beb7905baa.
Failed to replace voting disk group with +OCR.
CRS-4000: Command Replace failed, or completed with errors.
Not good. The reason is that ASM does not have ASM_DISKSTRING configured. Actually, ASM does not have a single parameter configured because its spfile was stored in the OCR diskgroup as well. That means there is no spfile anymore. As a quick solution I set the parameter in memory.
[oracle@oel6u4 ~]$ sqlplus / as sysasm

SQL*Plus: Release 12.1.0.2.0 Production on Mon May 4 11:20:31 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> show parameter diskstr

NAME                                 TYPE
------------------------------------ ---------------------------------
VALUE
------------------------------
asm_diskstring                       string

SQL> alter system set asm_diskstring='/dev/oracleasm/disks/*' scope=memory;

System altered.
With this small change I am now able to recreate the voting file.
[root@oel6u4 ~]# crsctl replace votedisk +OCR
Successful addition of voting disk c0cb172eb1d34f9abf04b37c883c9ddd.
Successfully replaced voting disk group with +OCR.
CRS-4266: Voting file(s) successfully replaced
[root@oel6u4 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   c0cb172eb1d34f9abf04b37c883c9ddd (/dev/oracleasm/disks/OCR) [OCR]
Located 1 voting disk(s).
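For an automated check at this point, counting the ONLINE entries of “crsctl query css votedisk” is sufficient. This is a small sketch that relies on the output layout shown above. online_votedisks is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: reads "crsctl query css votedisk" output on stdin and
# prints the number of voting files that are reported as ONLINE.
online_votedisks() {
    # data rows start with an index such as "1." followed by the state
    awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}
```

Usage as root: crsctl query css votedisk | online_votedisks. With external redundancy, as in my case, the expected result is 1.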
5. Restore ASM spfile
This is easy. I don’t have a backup of my ASM spfile, so I recreate it from memory.
[oracle@oel6u4 ~]$ sqlplus / as sysasm

SQL*Plus: Release 12.1.0.2.0 Production on Mon May 4 11:27:16 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> create spfile='+OCR' from memory;

File created.
Doing so also updates the GPnP profile.
6. Restart Grid Infrastructure
I have everything restored that I need to start the clusterware in normal operation mode. Let’s see:
[root@oel6u4 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'oel6u4'
CRS-2673: Attempting to stop 'ora.crsd' on 'oel6u4'
CRS-2677: Stop of 'ora.crsd' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.evmd' on 'oel6u4'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'oel6u4'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'oel6u4'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'oel6u4'
CRS-2677: Stop of 'ora.drivers.acfs' on 'oel6u4' succeeded
CRS-2677: Stop of 'ora.evmd' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'oel6u4'
CRS-2673: Attempting to stop 'ora.ctssd' on 'oel6u4'
CRS-2673: Attempting to stop 'ora.storage' on 'oel6u4'
CRS-2677: Stop of 'ora.mdnsd' on 'oel6u4' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'oel6u4' succeeded
CRS-2677: Stop of 'ora.storage' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'oel6u4'
CRS-2677: Stop of 'ora.crf' on 'oel6u4' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'oel6u4' succeeded
CRS-2677: Stop of 'ora.asm' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'oel6u4'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'oel6u4'
CRS-2677: Stop of 'ora.cssd' on 'oel6u4' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'oel6u4'
CRS-2677: Stop of 'ora.gipcd' on 'oel6u4' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'oel6u4' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@oel6u4 ~]# crsctl start has -wait
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2672: Attempting to start 'ora.mdnsd' on 'oel6u4'
CRS-2672: Attempting to start 'ora.evmd' on 'oel6u4'
CRS-2676: Start of 'ora.evmd' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'oel6u4'
CRS-2676: Start of 'ora.gpnpd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'oel6u4'
CRS-2676: Start of 'ora.gipcd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'oel6u4'
CRS-2676: Start of 'ora.cssdmonitor' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'oel6u4'
CRS-2672: Attempting to start 'ora.diskmon' on 'oel6u4'
CRS-2676: Start of 'ora.diskmon' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.cssd' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'oel6u4'
CRS-2672: Attempting to start 'ora.ctssd' on 'oel6u4'
CRS-2676: Start of 'ora.ctssd' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'oel6u4'
CRS-2676: Start of 'ora.asm' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'oel6u4'
CRS-2676: Start of 'ora.storage' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.crf' on 'oel6u4'
CRS-2676: Start of 'ora.crf' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'oel6u4'
CRS-2676: Start of 'ora.crsd' on 'oel6u4' succeeded
CRS-6023: Starting Oracle Cluster Ready Services-managed resources
CRS-2664: Resource 'ora.OCR.dg' is already running on 'oel6u4'
CRS-6017: Processing resource auto-start for servers: oel6u4
CRS-2672: Attempting to start 'ora.net1.network' on 'oel6u4'
CRS-2672: Attempting to start 'ora.MGMTLSNR' on 'oel6u4'
CRS-2672: Attempting to start 'ora.proxy_advm' on 'oel6u4'
CRS-2672: Attempting to start 'ora.oc4j' on 'oel6u4'
CRS-2676: Start of 'ora.net1.network' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.cvu' on 'oel6u4'
CRS-2672: Attempting to start 'ora.oel6u4.vip' on 'oel6u4'
CRS-2672: Attempting to start 'ora.ons' on 'oel6u4'
CRS-2672: Attempting to start 'ora.scan1.vip' on 'oel6u4'
CRS-2676: Start of 'ora.cvu' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.oel6u4.vip' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.LISTENER.lsnr' on 'oel6u4'
CRS-2676: Start of 'ora.scan1.vip' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'oel6u4'
CRS-2676: Start of 'ora.ons' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.MGMTLSNR' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.LISTENER.lsnr' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.mgmtdb' on 'oel6u4'
CRS-2676: Start of 'ora.proxy_advm' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.DATA.dg' on 'oel6u4'
CRS-5017: The resource action "ora.mgmtdb start" encountered the following error:
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+OCR/_mgmtdb/spfile-MGMTDB.ora'
ORA-17503: ksfdopn:2 Failed to open file +OCR/_mgmtdb/spfile-MGMTDB.ora
ORA-15056: additional error message
ORA-17503: ksfdopn:2 Failed to open file +OCR/_mgmtdb/spfile-mgmtdb.ora
ORA-15173: entry '_mgmtdb' does not exist in directory '/'
ORA-06512: at line 4
. For details refer to "(:CLSN00107:)" in "/u01/app/oracle/diag/crs/oel6u4/crs/trace/crsd_oraagent_oracle.trc".
CRS-2674: Start of 'ora.mgmtdb' on 'oel6u4' failed
CRS-2679: Attempting to clean 'ora.mgmtdb' on 'oel6u4'
CRS-2681: Clean of 'ora.mgmtdb' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.DATA.dg' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.DATA.DATA.advm' on 'oel6u4'
CRS-2676: Start of 'ora.oc4j' on 'oel6u4' succeeded
CRS-2676: Start of 'ora.DATA.DATA.advm' on 'oel6u4' succeeded
CRS-2672: Attempting to start 'ora.data.data.acfs' on 'oel6u4'
CRS-2676: Start of 'ora.data.data.acfs' on 'oel6u4' succeeded
===== Summary of resource auto-start failures follows =====
CRS-2807: Resource 'ora.mgmtdb' failed to start automatically.
CRS-6016: Resource auto-start has completed for server oel6u4
CRS-6024: Completed start of Oracle Cluster Ready Services-managed resources
CRS-4123: Oracle High Availability Services has been started.
As you can see, the GIMR (MGMTDB) is gone too. I will come to that soon. First, let’s see if all the other resources are running properly.
[root@oel6u4 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       oel6u4                   STABLE
ora.DATA.DATA.advm
               ONLINE  ONLINE       oel6u4                   Volume device /dev/a
                                                             sm/data-347 is onlin
                                                             e,STABLE
ora.DATA.dg
               ONLINE  ONLINE       oel6u4                   STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       oel6u4                   STABLE
ora.OCR.dg
               ONLINE  ONLINE       oel6u4                   STABLE
ora.data.data.acfs
               ONLINE  ONLINE       oel6u4                   mounted on /u02,STAB
                                                             LE
ora.net1.network
               ONLINE  ONLINE       oel6u4                   STABLE
ora.ons
               ONLINE  ONLINE       oel6u4                   STABLE
ora.proxy_advm
               ONLINE  ONLINE       oel6u4                   STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       oel6u4                   169.254.39.205 192.1
                                                             68.1.1,STABLE
ora.asm
      1        ONLINE  ONLINE       oel6u4                   STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.mgmtdb
      1        ONLINE  OFFLINE                               Instance Shutdown,ST
                                                             ABLE
ora.oc4j
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.oel6u4.vip
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       oel6u4                   STABLE
--------------------------------------------------------------------------------
Looks good so far.
7. Restore ASM password file
Since 12c the password file for ASM is stored inside ASM. Again, I have no backup so I need to create it from scratch.
[oracle@oel6u4 ~]$ orapwd file=/tmp/orapwasm password=Oracle-1 force=y
[oracle@oel6u4 ~]$ asmcmd pwcopy --asm /tmp/orapwasm +OCR/pwdasm
copying /tmp/orapwasm -> +OCR/pwdasm
[oracle@oel6u4 ~]$ asmcmd pwget --asm
+OCR/pwdasm
The “pwcopy” command updates the GPnP profile to reflect this.
8. Restore GIMR
There is no way to back up the GIMR. You have to recreate it. The Cluster Health Monitor (CHM) must not run during this recreation, so it has to be stopped and disabled. Then I need to remove the GIMR cluster resource.
[root@oel6u4 ~]# crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'oel6u4'
CRS-2677: Stop of 'ora.crf' on 'oel6u4' succeeded
[root@oel6u4 ~]# crsctl modify res ora.crf -attr ENABLED=0 -init
[oracle@oel6u4 ~]$ srvctl remove mgmtdb
Remove the database _mgmtdb? (y/[n]) y
Then use dbca from the Grid Infrastructure home to create GIMR. First the container database:
[oracle@oel6u4 ~]$ dbca -silent -createDatabase -sid -MGMTDB -createAsContainerDatabase true -templateName MGMTSeed_Database.dbc -gdbName _mgmtdb -storageType ASM -diskGroupName +OCR -datafileJarLocation $ORACLE_HOME/assistants/dbca/templates -characterset AL32UTF8 -autoGeneratePasswords -skipUserTemplateCheck
Registering database with Oracle Grid Infrastructure
5% complete
Copying database files
7% complete
9% complete
16% complete
23% complete
30% complete
37% complete
41% complete
Creating and starting Oracle instance
43% complete
48% complete
49% complete
50% complete
55% complete
60% complete
61% complete
64% complete
Completing Database Creation
68% complete
79% complete
89% complete
100% complete
Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/_mgmtdb/_mgmtdb2.log" for further details.
Second, the pluggable database:
[oracle@oel6u4 ~]$ dbca -silent -createPluggableDatabase -sourceDB -MGMTDB -pdbName mycluster -createPDBFrom RMANBACKUP -PDBBackUpfile $ORACLE_HOME/assistants/dbca/templates/mgmtseed_pdb.dfb -PDBMetadataFile $ORACLE_HOME/assistants/dbca/templates/mgmtseed_pdb.xml -createAsClone true -internalSkipGIHomeCheck
Creating Pluggable Database
4% complete
12% complete
21% complete
38% complete
55% complete
85% complete
Completing Pluggable Database Creation
100% complete
Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/_mgmtdb/mycluster/_mgmtdb1.log" for further details.
[oracle@oel6u4 ~]$ srvctl status mgmtdb
Database is enabled
Instance -MGMTDB is running on node oel6u4
[oracle@oel6u4 ~]$ mgmtca
[root@oel6u4 ~]# crsctl modify res ora.crf -attr ENABLED=1 -init
[root@oel6u4 ~]# crsctl start res ora.crf -init
CRS-2672: Attempting to start 'ora.crf' on 'oel6u4'
CRS-2676: Start of 'ora.crf' on 'oel6u4' succeeded
Let’s see if everything is running fine again:
[oracle@oel6u4 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       oel6u4                   STABLE
ora.DATA.DATA.advm
               ONLINE  ONLINE       oel6u4                   Volume device /dev/a
                                                             sm/data-347 is onlin
                                                             e,STABLE
ora.DATA.dg
               ONLINE  ONLINE       oel6u4                   STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       oel6u4                   STABLE
ora.OCR.dg
               ONLINE  ONLINE       oel6u4                   STABLE
ora.data.data.acfs
               ONLINE  ONLINE       oel6u4                   mounted on /u02,STAB
                                                             LE
ora.net1.network
               ONLINE  ONLINE       oel6u4                   STABLE
ora.ons
               ONLINE  ONLINE       oel6u4                   STABLE
ora.proxy_advm
               ONLINE  ONLINE       oel6u4                   STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       oel6u4                   169.254.39.205 192.1
                                                             68.1.1,STABLE
ora.asm
      1        ONLINE  ONLINE       oel6u4                   STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       oel6u4                   Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.oel6u4.vip
      1        ONLINE  ONLINE       oel6u4                   STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       oel6u4                   STABLE
--------------------------------------------------------------------------------
Funny, my ASM now has 3 instances again. I had already changed that to “count=all”, but obviously the OCR backup was taken before that was done. I also had two databases in place, and both resources are missing for the same reason. But that’s not a major issue.
[oracle@oel6u4 ~]$ srvctl modify asm -count all
[oracle@oel6u4 ~]$ srvctl add database -db db11g -oraclehome /u01/app/oracle/product/11.2.0.4/db -dbtype RAC \
> -spfile /u02/app/oracle/oradata/db11g/spfiledb11g.ora -pwfile /u02/app/oracle/oradata/db11g/orapwdb11g \
> -serverpool oel6u4 -acfspath /u02
PRCR-1039 : Server pool ora.oel6u4 does not exist
No server pool, obviously. Server pools are also defined in the OCR.
[oracle@oel6u4 ~]$ srvctl add srvpool -serverpool oel6u4 -min 1 -max 1 -category "hub"
[oracle@oel6u4 ~]$ srvctl add database -db cdb12c -oraclehome /u01/app/oracle/product/12.1.0.2/db -dbtype RAC \
> -spfile /u02/app/oracle/oradata/cdb12c/spfilecdb12c.ora -pwfile /u02/app/oracle/oradata/cdb12c/orapwcdb12c \
> -serverpool oel6u4 -acfspath /u02
[oracle@oel6u4 ~]$ srvctl add database -d db11g -o /u01/app/oracle/product/11.2.0.4/db -c RAC \
> -p /u02/app/oracle/oradata/db11g/spfiledb11g.ora -g oel6u4 -j /u02
9. Lessons learned -or- What one should back up
Usually I do metadata backups of all the ASM diskgroups as well as backups of the ASM spfile and password file, and of course periodic backups of OCR and OLR. But it was nice to see that it is possible to restore everything without any manual backups in place. What should one back up, and how?
- ASM metadata: asmcmd md_backup
- ASM spfile: asmcmd spcopy -or- create pfile from spfile
- ASM password file: asmcmd pwcopy
- OCR: configure OCR backup to external storage or copy manually
- OCR: do backups every time you add/delete/modify cluster resources
- OLR: not mentioned here since it is stored on disk, but important too
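The list above could be wrapped into two small shell functions, one for the Grid Infrastructure owner and one for root. This is only a sketch under a few assumptions: the function names, the DRY_RUN switch and the file naming are mine, and the spfile and password file locations have to be passed in (they can be looked up with “asmcmd spget” and “asmcmd pwget --asm”). The OCR and OLR manual backups end up in the configured backup location and still need to be copied to external storage.

```shell
#!/bin/sh
# Hypothetical wrapper: with DRY_RUN=1 the command is only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run as Grid Infrastructure owner.
# Arguments: target directory, ASM spfile location, ASM password file location
backup_asm_metadata() {
    dest=$1
    asm_spfile=$2
    asm_pwfile=$3
    stamp=$(date +%Y%m%d_%H%M%S)

    # metadata of all mounted diskgroups
    run asmcmd md_backup "$dest/asm_md_$stamp.bkp"
    # plain copies of spfile and password file (GPnP profile is not touched)
    run asmcmd spcopy "$asm_spfile" "$dest/spfileasm_$stamp.ora"
    run asmcmd pwcopy "$asm_pwfile" "$dest/orapwasm_$stamp"
}

# Run as root: manual backups of OCR and OLR.
backup_ocr_olr() {
    run ocrconfig -manualbackup
    run ocrconfig -local -manualbackup
}
```

A dry run such as DRY_RUN=1 backup_asm_metadata /backup +OCR/spfileasm.ora +OCR/pwdasm shows the commands before anything is executed. The two paths in this example are placeholders, not the real locations from my system.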
10. References
My Oracle Support
- How to restore ASM based OCR after complete loss of the CRS diskgroup on Linux/Unix systems (Doc ID 1062983.1)
- How to Restore ASM Password File if Lost ( ORA-01017 ORA-15077 ) (Doc ID 1644005.1)
- CRS-4256 CRS-4602 While Replacing Voting Disk (Doc ID 1475588.1)
- How to Move/Recreate GI Management Repository to Different Shared Storage (Diskgroup, CFS or NFS etc) (Doc ID 1589394.1)