Oracle Cache Fusion is one of the important component in Oracle RAC environment.
- What is Cache Fusion
- How Cache Fusion works?
- How Oracle RAC read block from disk or buffer of another Instance using Cache Fusion?
- How Oracle RAC maintain blocks in buffer across the cluster for DML?
What is Cache Fusion:
Oracle RAC is composed of two or more instances. Oracle RAC Cache Fusion is mechanism to transfer data blocks from the one instance’s buffer cache to another instance’s buffer cache across the cluster interconnect. When a block of data is read from disk by an instance within the cluster and another instance is in need of the same block, it is easy to get the block image from the instance which has the block in its SGA rather than reading from the disk. Transfers across this private network are in order of magnitude better than disk I/O activity and they help Oracle RAC performance.
Cache Fusion Ships Blocks from Cache to Cache Across the Interconnect:
Buffer Cache, Shared Pool and the Undo Tablespace like single-instance database are required to facilitate the Cache Fusion. But Extra coordination is needed in the cluster to make a collection of instances work together.
Before explained how Cache Fusion works, We will know –
Global Resource Directory (GRD): On RAC, GRD process keeps track of the resources in the cluster. There is no true concept of a master node in Oracle RAC but each instance of cluster can becomes the resources mater.
- Track master instances of all buffers.
- GRD is present on all the instances of the cluster.
How to find the master node for a resource in RAC:
SQL> SELECT DATA_OBJECT_ID, OBJECT_ID FROM DBA_OBJECTS
WHERE OBJECT_NAME = 'TBL_CFUSION';
DATA_OBJECT_ID OBJECT_ID
-------------- ----------
93836 93836
SQL> SELECT b.dbablk, r.kjblmaster master_node
FROM x$le l, x$kjbl r, x$bh b
WHERE b.le_addr = l.le_addr and l.le_kjbl = r.kjbllockp and b.obj = &Data_Object_ID;
Enter value for data_object_id: 93836
old 3: WHERE b.obj = &Data_Object_ID and b.le_addr = l.le_addr and l.le_kjbl = r.kjbllockp
new 3: WHERE b.obj = 93836 and b.le_addr = l.le_addr and l.le_kjbl = r.kjbllockp
DBABLK MASTER_NODE
---------- -----------
225 1
230 1
227 1
224 1
229 1
226 1
231 1
228 1
8 rows selected.
Global Cache Services (GCS): Global Cache Services are responsible to transfer blocks from one instance to another. A single-instance database relies on enqueues (locks) to protect for modifying the same records by two processes simultaneously. Buffer Cache on all instances appear to be global and enqueues on the resources is also global across the cluster.
LMS – LMS is a GCS process. This process used to called the Lock Manager Server.
Global Enqueue Services (GES): Global Enqueue Services is responsible for managing locks across the cluster. GES was previously called the Distributed/Dynamic Lock Manager (DLM).
- Holds the information on the locks on the buffers.
- Each lock has a name shown in V$LOCK_ELEMENT (or X$LE).
- If a buffer is locked, the lock element name is shown in V$BH.LOCK_ELEMENT.
Note::: MODE_HELD : Platform dependent value for lock mode held; often: 3 = share; 5 = exclusive
How GES Workloads Affect Performance:
Calculate the ratio of local-to-remote global enqueue resource operations using this query:
SELECT
r.CONVERT_TYPE,
SUM(r.AVERAGE_CONVERT_TIME),
SUM(l.AVERAGE_CONVERT_TIME),
SUM(r.CONVERT_COUNT),
SUM(l.CONVERT_COUNT)
FROM V$GES_CONVERT_LOCAL l, V$GES_CONVERT_REMOTE r
WHERE r.convert_count <> 0 OR l.convert_count <> 0
GROUP BY r.CONVERT_TYPE;
LMON: LMON process is the GES master process.
LMD – Lock Manager Daemon. This process manages incoming lock requests.
LCK0 – The instance enqueue process. This process manages lock requests for library Cache objects.
How Cache Fusion Works:
How Oracle RAC read block from disk or buffer of other Instance using Cache Fusion?
- Object is not available on Buffer. See the output from V$BH.
- 1st time run select query – SELECT ID FROM SAMAD.TBL_CFUSION
- Collect sql_trace of the query and generate TKPROF report.
Node 1:
When we run any query in database first time or blocks are not available in cache then all blocks read from database file for rac and stand alone database.
TKPROF output. There are 6 IOs from disk of this query.
Node 2:
- Collect total physical and logical IO information.
- Object is not available in Buffer Cache on 2nd Instance.
- Select query – SELECT ID FROM SAMAD.TBL_CFUSION with enabling SQL Trace.
- Before and After execution of the query, Physical read is 1 on this session.
There is no physical IO for the query execution even object was not in Buffer cache. Means all required blocks for this object retrieved from 1st Instance buffer cache instead of Disk.
How Oracle RAC maintain blocks in buffer across the cluster for DML?
When a block is requested by user:
- The buffer cache is searched on local buffer cache
- If not found, there are two options
- Get from the other cache across the cluster
- Get from disk
- If found on buffer on other instance, there are three options:
- Send the buffer to the user
- Examine other caches for the presence of this buffer
- Get from the disk
- If not found, there are two options
- How does it decide which option to take?
- The buffer can be retrieved in two modes
- Consistent Read (CR) -> Block contains uncommitted changes. Session will get a version of the block prior to the changes. Intention to read block
- Current -> If intention to modify
- The buffer can be retrieved in two modes
- There can be several CR copies of a buffer
- There can be only one current mode
- For an instance
- Each current buffer is Shared Current
- Only one buffer in the entire cluster can be Exclusive Current
SQL> show user
USER is "SAMAD"
SQL> Create Table SAMAD.TBL_CFTEST(id number, name char(2000));
insert into SAMAD.TBL_CFTEST Values(1, 'Samad');
insert into SAMAD.TBL_CFTEST Values(2, 'Samad');
insert into SAMAD.TBL_CFTEST Values(3, 'Samad');
insert into SAMAD.TBL_CFTEST Values(4, 'Samad');
Table created.
SQL> SQL>
1 row created.
SQL> SQL>
1 row created.
SQL> SQL>
1 row created.
SQL> SQL>
1 row created.
SQL> commit;
Commit complete.
SQL> SELECT EXTENT_ID, BLOCK_ID, BLOCKS FROM DBA_EXTENTS
WHERE SEGMENT_NAME = 'TBL_CFTEST';
EXTENT_ID BLOCK_ID BLOCKS
---------- ---------- ----------
0 232 8
SQL> SELECT ID,
DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) FILE#,
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) BLOCK#
FROM SAMAD.TBL_CFTEST;
ID FILE# BLOCK#
---------- ---------- ----------
1 6 238
2 6 238
3 6 238
4 6 239
SQL> SELECT DATA_OBJECT_ID, OBJECT_ID FROM DBA_OBJECTS
WHERE OBJECT_NAME = 'TBL_CFTEST';
DATA_OBJECT_ID OBJECT_ID
-------------- ----------
93861 93861
For DML, rows on block status is xcur (exclusive current) and these are most recent current copy of block. xcur status for a rows is possible a copy on an instance across the cluster.
SQL> @obj_buffer.sql
Enter value for data_obj_id: 93861
old 9: where objd = &data_obj_id
new 9: where objd = 93861
FILE# BLOCK# CLASS_TYPE STATUS LOCK_ELEMENT_ADD
---------- ---------- ------------------ ---------- ----------------
6 232 1st level bmb xcur 00000000877DADA0
6 233 2nd level bmb xcur 00000000877E5958
6 234 segment header xcur 0000000087BF7C48
6 235 data block xcur 0000000087BEED00
6 236 data block xcur 00000000877D96A8
6 237 data block xcur 00000000877DBF20
6 238 data block xcur 00000000877E7C58
6 239 data block xcur 00000000877E4490
8 rows selected.
Node 2:
SQL> SELECT DATA_OBJECT_ID, OBJECT_ID FROM DBA_OBJECTS
WHERE OBJECT_NAME = 'TBL_CFTEST';
DATA_OBJECT_ID OBJECT_ID
-------------- ----------
93861 93861
TBL_CFTEST object is not available in buffer cache on local instance. When run the select command of this table,
SQL> @obj_buffer.sql
Enter value for objid: 93861
old 21: WHERE OBJD= &OBJID
new 21: WHERE OBJD= 93861
no rows selected
SQL> SELECT INSTANCE_NUMBER FROM V$INSTANCE;
INSTANCE_NUMBER
---------------
2
SQL> SELECT ID FROM SAMAD.TBL_CFTEST WHERE ID = &ID;
Enter value for id: 1
old 1: SELECT ID FROM SAMAD.TBL_CFTEST WHERE ID = &ID
new 1: SELECT ID FROM SAMAD.TBL_CFTEST WHERE ID = 1
ID
----------
1
These blocks are available in Node 1 which is resource master.
SQL> select b.dbablk, r.kjblmaster master_node
from x$le l, x$kjbl r, x$bh b
where b.le_addr = l.le_addr
and l.le_kjbl = r.kjbllockp and b.obj = &DataObjectId; 2 3 4
Enter value for dataobjectid: 93861
old 4: and l.le_kjbl = r.kjbllockp and b.obj = &DataObjectId
new 4: and l.le_kjbl = r.kjbllockp and b.obj = 93861
DBABLK MASTER_NODE
---------- -----------
238 1
235 1
237 1
234 1
239 1
236 1
6 rows selected.
Selected only one record and it should return the relevant block (block # 238) and it copied all blocks as there was no index and it reads all block for full table scan.
When oracle cache fusion request a block for reading from another instance, block copies as Consistence Read (CR) mode from other instance then cope the same block as scur mode. Oracle remain CR mode block because that point user request to read these block. SCUR (Shared Current) means these rows are up to date for across the cluster and Oracle can rely this row for any further changes in cluster.
In this point on node 2, have two copies of the blocks.
SQL> @obj_buffer.sql
Enter value for objid: 93861
old 21: WHERE OBJD= &OBJID
new 21: WHERE OBJD= 93861
FILE# BLOCK# CLASS TYPE STATUS LOCK_ELEMENT_ADD
---------- ---------- ------------------ ---------- ----------------
6 234 SEGMENT HEADER cr 00
6 234 SEGMENT HEADER scur 00000000803FA178
6 235 DATA BLOCK scur 00000000803F1230
6 235 DATA BLOCK cr 00
6 236 DATA BLOCK scur 0000000087BE01D8
6 236 DATA BLOCK cr 00
6 237 DATA BLOCK scur 0000000087BE4D50
6 237 DATA BLOCK cr 00
6 238 DATA BLOCK cr 00
6 238 DATA BLOCK scur 0000000087BE6448
6 239 DATA BLOCK cr 00
FILE# BLOCK# CLASS TYPE STATUS LOCK_ELEMENT_ADD
---------- ---------- ------------------ ---------- ----------------
6 239 DATA BLOCK scur 0000000087BE00C0
12 rows selected.
Now on Node 1, rows status changed from xcur to scur means these block’s most recent copies are available in another nodes. in this case, we saw that these blocks copies to node 2 as scur. If we have any rows status xcur means these are exclusively current across the cluster.
XCUR and SCUR both mode for a row in buffer is not possible at the same time.
SCUR on both nodes means Oracle will consider any one instance for these rows as these are up to date on both nodes.
SQL> @obj_buffer.sql
Enter value for data_obj_id: 93861
old 9: where objd = &data_obj_id
new 9: where objd = 93861
FILE# BLOCK# CLASS_TYPE STATUS LOCK_ELEMENT_ADD
---------- ---------- ------------------ ---------- ----------------
6 234 segment header scur 0000000087BF7C48
6 235 data block scur 0000000087BEED00
6 236 data block scur 00000000877D96A8
6 237 data block scur 00000000877DBF20
6 238 data block scur 00000000877E7C58
6 239 data block scur 00000000877E4490
8 rows selected.
Now try to update a row on Instance 1, and we see that row status has been changed to xcur and also remain with CR copy for read consistence.
SQL> UPDATE SAMAD.TBL_CFTEST SET NAME='Y' WHERE ID=2;
1 row updated.
SQL> @obj_buffer.sql
Enter value for data_obj_id: 93861
old 9: where objd = &data_obj_id
new 9: where objd = 93861
FILE# BLOCK# CLASS_TYPE STATUS LOCK_ELEMENT_ADD
---------- ---------- ------------------ ---------- ----------------
6 238 data block xcur 00000000877E7C58
6 238 data block cr 00
6 239 data block scur 00000000877E4490
3 rows selected.
Now on Node 2, Now we got two cr copies for block# 238 and no scur copy as we had xcur on Node 1.
SQL> @obj_buffer.sql
Enter value for objid: 93861
old 21: WHERE OBJD= &OBJID
new 21: WHERE OBJD= 93861
FILE# BLOCK# CLASS TYPE STATUS LOCK_ELEMENT_ADD
---------- ---------- ------------------ ---------- ----------------
6 238 DATA BLOCK cr 00
6 238 DATA BLOCK cr 00
6 239 DATA BLOCK cr 00
6 239 DATA BLOCK scur 0000000087BE00C0
4 rows selected.
SQL> SELECT ID,
DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) FILE#,
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) BLOCK#
FROM SAMAD.TBL_CFTEST;
ID FILE# BLOCK#
---------- ---------- ----------
1 6 238
2 6 238
3 6 238
4 6 239
Another row on same block update on Node 2, and we see 4 CR copies and 1 XCUR copy.
SQL> UPDATE SAMAD.TBL_CFTEST SET NAME='Y' WHERE ID=3;
1 row updated.
SQL> @obj_buffer.sql
Enter value for objid: 93861
old 21: WHERE OBJD= &OBJID
new 21: WHERE OBJD= 93861
FILE# BLOCK# CLASS TYPE STATUS LOCK_ELEMENT_ADD
---------- ---------- ------------------ ---------- ----------------
6 238 DATA BLOCK cr 00
6 238 DATA BLOCK cr 00
6 238 DATA BLOCK cr 00
6 238 DATA BLOCK xcur 0000000087BE6448
6 238 DATA BLOCK cr 00
6 239 DATA BLOCK scur 0000000087BE00C0
6 239 DATA BLOCK cr 00
7 rows selected.
On node 1, previously updated record (LOCK_ELEMENT_ADD-> 00000000877E7C58) status changed to Past Image (PI) mode.
SQL> @obj_buffer.sql
Enter value for data_obj_id: 93861
old 9: where objd = &data_obj_id
new 9: where objd = 93861
FILE# BLOCK# CLASS_TYPE STATUS LOCK_ELEMENT_ADD
---------- ---------- ------------------ ---------- ----------------
6 238 data block pi 00000000877E7C58
6 238 data block cr 00
6 238 data block cr 00
6 238 data block cr 00
6 239 data block scur 00000000877E4490
5 rows selected.
Once made checkpoint, PI changed to CR.
SQL> alter system checkpoint;
System altered.
SQL> @obj_buffer.sql
Enter value for data_obj_id: 93861
old 9: where objd = &data_obj_id
new 9: where objd = 93861
FILE# BLOCK# CLASS_TYPE STATUS LOCK_ELEMENT_ADD
---------- ---------- ------------------ ---------- ----------------
6 238 data block cr 00
6 238 data block cr 00
6 238 data block cr 00
6 238 data block cr 00
6 239 data block scur 00000000877E4490
5 rows selected.
If we compare IO statistic, we see that more physical IO on Node 1 than Node 2. And “gc cr blocks received” value is 8.
Node 1: IO details
SQL> SELECT statistic_name, value
FROM v$segstat
WHERE dataobj# = &data_object_id and value > 0;
Enter value for data_object_id: 93861
old 3: WHERE dataobj# = &data_object_id and value > 0
new 3: WHERE dataobj# = 93861 and value > 0
STATISTIC_NAME VALUE
---------------------------------------------------------------- ----------
logical reads 80
db block changes 32
physical writes 8
physical write requests 4
space used 9257
space allocated 65536
6 rows selected.
Node 2: IO details
SQL> SELECT statistic_name, value
FROM v$segstat
WHERE dataobj# = &data_object_id and value > 0;
Enter value for data_object_id: 93861
old 3: WHERE dataobj# = &data_object_id and value > 0
new 3: WHERE dataobj# = 93861 and value > 0
STATISTIC_NAME VALUE
---------------------------------------------------------------- ----------
logical reads 48
db block changes 16
physical reads 6
physical writes 1
physical read requests 2
physical write requests 1
gc cr blocks received 8
gc current blocks received 1
8 rows selected.
Used Scripts:
[oracle@ocmnode2 ~]$ cat obj_buffer.sql
SELECT FILE#, BLOCK#, DECODE(CLASS#,
1, 'DATA BLOCK',
2, 'SORT BLOCK',
3, 'SAVE UNDO BLOCK',
4, 'SEGMENT HEADER',
5, 'SAVE UNDO HEADER',
6, 'FREE LIST',
7, 'EXTENT MAP',
8, '1ST LEVEL BMB',
9, '2ND LEVEL BMB',
10, '3RD LEVEL BMB',
11, 'BITMAP BLOCK',
12, 'BITMAP INDEX BLOCK',
13, 'FILE HEADER BLOCK',
14, 'UNUSED',
15, 'SYSTEM UNDO HEADER',
16, 'SYSTEM UNDO BLOCK',
17, 'UNDO HEADER',
18, 'UNDO BLOCK') "CLASS TYPE", STATUS, LOCK_ELEMENT_ADDR
FROM V$BH
WHERE OBJD= &OBJID
ORDER BY 1,2,3
/
[oracle@ocmnode2 ~]$ cat obj_stat.sql
SELECT statistic_name, value
FROM v$segstat
WHERE dataobj# = &data_object_id and value > 0
/
select s.name, st.value
from v$statname s, v$mystat st
where st.STATISTIC# = s.STATISTIC#
and s.name in ('session logical reads','physical reads');