Amit's Oracle DBA Blog: Oracle ASM 12c – New Features with examples

This post discusses a few 12C ASM new features with examples:

Failgroup_repair_time
Online disk replace
Estimate Work
V$ASM_OPERATION view

1. FAILGROUP_REPAIR_TIME

For earlier ASM versions, we had disk_repair_time, from 12C onwards we have failgroup_repair_time at the ASM level. Because failure group outages are more likely to be transient in nature and because replacing all the disks in a failure group is a much more expensive operation than replacing a single disk, it would typically make sense for failure groups to have a larger repair time to ensure that all the disks do not get dropped automatically in the event of a failure group outage.

By default, the value of this diskgroup level attribute is 24Hours. To set this attribute at ASM diskgroup level, one must set compatible.asm and compatible.rdbms greater than or equal to 11.1.0.0.

Example

1. Let’s create a new diskgroup and see what will be the value of attributes compatible.asm and compatible.rdbms set by default.

SQL> create diskgroup test_failgroup normal redundancy failgroup A disk '/dev/sde2','/dev/sdh2'
2 failgroup B disk '/dev/sdb4','/dev/sdi1';

Diskgroup created.

2. If you check the values of both the attributes now, it remained same to support backward compatibilities.

SQL> select NAME ,COMPATIBILITY,DATABASE_COMPATIBILITY from v$asm_diskgroup where name='TEST_FAILGROUP';

NAME                           COMPATIBILITY                               DATABASE_COMPATIBILITY
==============                 ===============                             ========================
TEST_FAILGROUP                 10.1.0.0.0                                  10.1.0.0.0

3. If you try setting the failgroup_repair_time attribute on this diskgroup, it would fail with below error.

SQL> alter diskgroup test_failgroup set attribute 'failgroup_repair_time'='3H';
alter diskgroup test_failgroup set attribute 'failgroup_repair_time'='3H'
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15242: could not set attribute failgroup_repair_time
ORA-15221: ASM operation requires compatible.asm of 11.1.0.0.0 or higher

4. Let us now try to change only the attribute compatible.asm.

SQL> ALTER DISKGROUP TEST_FAILGROUP SET ATTRIBUTE 'COMPATIBLE.ASM'='11.1.0.0';

Diskgroup altered.

SQL> select NAME ,COMPATIBILITY,DATABASE_COMPATIBILITY from v$asm_diskgroup where name='TEST_FAILGROUP';

NAME                           COMPATIBILITY                               DATABASE_COMPATIBILITY
==============                 =============                               ======================
TEST_FAILGROUP                 11.1.0.0.0                                  10.1.0.0.0

5. Try setting the attribute value now. You should still not be able to set the attribute value.

SQL> alter diskgroup test_failgroup set attribute 'failgroup_repair_time'='3H';
alter diskgroup test_failgroup set attribute 'failgroup_repair_time'='3H'
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15242: could not set attribute failgroup_repair_time
ORA-15283: ASM operation requires compatible.rdbms of 11.1.0.0.0 or higher

6. Now set compatible.rdbms to 11.1.0.0 and try setting the attribute value again. It should succeed this time.

SQL> ALTER DISKGROUP TEST_FAILGROUP SET ATTRIBUTE 'COMPATIBLE.RDBMS'='11.1.0.0';

Diskgroup altered.

SQL> select NAME ,COMPATIBILITY,DATABASE_COMPATIBILITY from v$asm_diskgroup where name='TEST_FAILGROUP';

NAME                           COMPATIBILITY                               DATABASE_COMPATIBILITY
==============                 =============                               ======================= 
TEST_FAILGROUP                 11.1.0.0.0                                  11.1.0.0.0

SQL> alter diskgroup test_failgroup set attribute 'failgroup_repair_time'='3H';

Diskgroup altered.

### Validate this change ###

SQL> select group_number,name,value from v$asm_attribute where group_number=4 and name like 'failgroup%';

GROUP_NUMBER                         NAME                             VALUE
============                   =====================                  =====
4                              failgroup_repair_time                  3H

2. Online disk replace

If an ASM disk becomes offline and cannot be repaired, administrators need the ability to replace the disk. In versions of ASM prior to 12c, there was no replace command. Rather, administrators had to drop the faulty disk and then add a new one back into the disk group which causes an ASM level rebalance. Depending on multiple internal and external factors, reblance can be time consuming.

In 12c, ASM allows DBAs to simply replace an offline disk using one fast and efficient operation. There is no need for any additional reorganization or rebalancing across the rest of the disk group. The pre-requisites for using these feature are:

ASM diskgroup level attribute compatible.asm should be of 12.1.0.0 for this feature
Replacing disk should bad in true sense , ASM will not replace online disk.
The replacement disk takes the same name as the original disk and becomes part of the same failure group as the original disk.
Then replacing good disk should be sized equal or greater than replacing bad disk.Else replace will fail.

Example

1. If you try to replace a disk with compatible.asm attribute value as 10.1.0.0, it would result in below error.

SQL> alter diskgroup test_failgroup replace disk TEST_FAILGROUP_0002 with '/dev/sdg1';
alter diskgroup test_failgroup replace disk TEST_FAILGROUP_0002 with '/dev/sdg1'
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15221: ASM operation requires compatible.asm of 12.1.0.0.0 or higher

2. Now try with compatible.asm attribute value as 12.1.0.0.

SQL> alter diskgroup test_failgroup replace disk TEST_FAILGROUP_0002 with '/dev/sdg1';
alter diskgroup test_failgroup replace disk TEST_FAILGROUP_0002 with '/dev/sdg1'
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15145: ASM disk 'TEST_FAILGROUP_0002' is online and cannot be replaced.

It failed again but with a different error as the disk we are trying to replace is an online disk. We need to change the permission of that disk prior to replacement.

SQL> !chmod 000 /dev/sdb4

Try the replacement again now.

SQL> alter diskgroup test_failgroup replace disk TEST_FAILGROUP_0002 with '/dev/sdg1';

Diskgroup altered.

3. From alert log file:

Tue Jul 30 12:29:34 2013
GMON updating disk modes for group 4 at 39 for pid 42, osid 9205
NOTE: group TEST_FAILGROUP: updated PST location: disk 0000 (PST copy 0)
NOTE: group TEST_FAILGROUP: updated PST location: disk 0003 (PST copy 1)
Tue Jul 30 12:29:34 2013
NOTE: PST update grp = 4 completed successfully 
NOTE: initiating PST update: grp 4 (TEST_FAILGROUP), dsk = 2/0x0, mask = 0x7d, op = assign
Tue Jul 30 12:29:34 2013
GMON updating disk modes for group 4 at 40 for pid 42, osid 9205
NOTE: group TEST_FAILGROUP: updated PST location: disk 0000 (PST copy 0)
NOTE: group TEST_FAILGROUP: updated PST location: disk 0003 (PST copy 1)
Tue Jul 30 12:29:34 2013
NOTE: PST update grp = 4 completed successfully 
Tue Jul 30 12:29:34 2013
NOTE: Voting File refresh pending for group 4/0x60869f8 (TEST_FAILGROUP)
NOTE: Attempting voting file refresh on diskgroup TEST_FAILGROUP
Tue Jul 30 12:29:34 2013
SUCCESS: alter diskgroup test_failgroup replace disk TEST_FAILGROUP_0002 with '/dev/sdg1'
Tue Jul 30 12:29:34 2013
NOTE: starting rebalance of group 4/0x60869f8 (TEST_FAILGROUP) at power 1
Starting background process ARB0

4. Validate that the disk replacement is completed and the new disk has taken the same name as old disk.

SQL> select name,path,total_mb from v$asm_disk where group_number=4;

NAME                                                 PATH                               TOTAL_MB

TEST_FAILGROUP_0003           /dev/sdi1                               2541
TEST_FAILGROUP_0001           /dev/sdh2                              2565
TEST_FAILGROUP_0002           /dev/sdg1                              2165
TEST_FAILGROUP_0000          /dev/sde2                               2565

5. Added disk was lager in size of original disk. But at ASM level only same size allocated as old bad disk.

SQL> select name,path,total_mb,os_mb from v$asm_disk where group_number=4;

NAME                         PATH                               TOTAL_MB                        OS_MB 
===================          =========                          ========                        =====
TEST_FAILGROUP_0003          /dev/sdi1                          2541                            2541  
TEST_FAILGROUP_0001          /dev/sdh2                          2565                            2565
TEST_FAILGROUP_0002          /dev/sdg1                          2165                            2541
TEST_FAILGROUP_0000          /dev/sde2                          2565                            2565

6. Simulate similar type of issue and try to replace with less sized disk than bad disk. It will not allow replacement and would throw below error:

SQL> alter diskgroup Test_failgroup replace disk TEST_FAILGROUP_0002 with '/dev/sdb4';
alter diskgroup Test_failgroup replace disk TEST_FAILGROUP_0002 with '/dev/sdb4'
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15408: Replacement disk for 'TEST_FAILGROUP_0002' must be at least 2541 M.

3. Estimate Work

Before 12C, the only way to know how long an expensive operation like disk add/drop at diskgroup level was to actually perform the operation, so there was no way to predict before-hand how much time this would take.

In 12c ASM, a more detailed and more accurate work plan is created at the beginning of each rebalance operation. In addition, DBAs can separately generate and view the work plan before performing a rebalance operation. This allows DBA to better plan and execute various changes such as adding storage, removing storage or moving between different storage systems. DBAs can generate the work plan using the ESTIMATE WORK command.

Querying from V$ASM_ESTIMATE view give an idea of required time of that operation based on current workload on the system. This feature allows DBAs to consider system load implications while planning an operation before actually taxing the system.

Note: be sure to give the correct disk name, otherwise this estimation will fail

Example

1. Lets see what happens when you give a non-existing disk for estimation.

SQL> explain work set statement_id='drop_test_failgroup_0002' for alter diskgroup test_failgroup 
2 drop disk test_failgroup_0005;
explain work set statement_id='drop_test_failgroup_0005' for alter diskgroup test_failgroup
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15054: disk "TEST_FAILGROUP_0005" does not exist in diskgroup
"TEST_FAILGROUP"

2. When the disk name is correct, the estimated time is as shown below:

SQL> explain work set statement_id='drop_test_failgroup_0000' for alter diskgroup test_failgroup 
2 drop disk test_failgroup_0000;

Explained.

SQL> select est_work from v$asm_estimate where statement_id='drop_test_failgroup_0000';

EST_WORK
----------
42

4. V$ASM_OPEARATION view

From version 11g onwards, we have a compact phase while rebalancing. While this phase used to run at ASM level for a diskgroup, we cannot estimate compact phase timing approximately.

We used to see EST_MINUTES is 0. At the same time we were also not able to track resync operations separately. In 12C, we can view these phases separately from v$asm_operation, giving a better idea about on-going operation based on current utilization of the system. For example:

SQL> SELECT PASS, STATE, EST_MINUTES FROM V$ASM_OPERATION;

PASS                     STAT             EST_MINUTES
=========                ====             ===========
RESYNC                   RUN                     1
REBALANCE                WAIT                    20
COMPACT                  WAIT                    2

Amit's Oracle DBA Blog

Disclaimer

Wednesday, 19 January 2022

Oracle ASM 12c – New Features with examples

1. FAILGROUP_REPAIR_TIME

Example

2. Online disk replace

Example

3. Estimate Work

Example

4. V$ASM_OPEARATION view

No comments:

Post a Comment

Oracle Exadata

Labels