Disclaimer

Monday 16 November 2020

Bitmap Index vs B-tree Index creation scenario in Oracle

Scenario 

************

1)
Create table priya.test_normal (empno number(10), ename varchar2(30), sal number(10));

2)
Begin
For i in 1..1000000
Loop
   Insert into priya.test_normal 
   values(i, dbms_random.string('U',30), dbms_random.value(1000,7000));
   If mod(i, 10000) = 0 then
   Commit;
  End if;
End loop;
End;
/

3)
Create table priya.test_random 
as 
select /*+ append */ * from priya.test_normal order by dbms_random.random;


4)
select count(*) "Total Rows" from priya.test_normal;

5)
select count(distinct empno) "Distinct Values" from priya.test_normal;

6)
select count(*) "Total Rows" from priya.test_random;


7)
select count(distinct empno) "Distinct Values" from priya.test_random;



Note that the TEST_NORMAL table is organized and that the TEST_RANDOM table is randomly created 
and hence has disorganized data. 
In the above table, column EMPNO has 100-percent distinct values and is a good candidate to become a primary key. 
If you define this column as a primary key, you will create a B-tree index 
and not a bitmap index because Oracle does not support bitmap primary key indexes.



To analyze the behavior of these indexes, we will perform the following steps:

*****************************************************************

Before creating BITMAP index on TEST_NORMAL table explain plan
--------------------------------------------------------
SQL> explain plan for select * from priya.test_normal where empno=1000;
Explained.
SQL> @?/rdbms/admin/utlxpls.sql
PLAN_TABLE_OUTPUT
-------------------------------------------
Plan hash value: 512490529
---------------------------------------------------------------------------------
| Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |             |    68 |  2924 |  1697   (1)| 00:00:21 |
|*  1 |  TABLE ACCESS FULL| TEST_NORMAL |    68 |  2924 |  1697   (1)| 00:00:21 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("EMPNO"=1000)
Note
-----
   - dynamic sampling used for this statement (level=2)
17 rows selected.




SQL> set autotrace on
SQL>
SQL> select * from priya.test_normal where empno=1000;
     EMPNO ENAME                                 SAL
---------- ------------------------------ ----------
      1000 OWKHHLEOZCECWYNKNHBTQNKLCHJSDL       6280
1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 512490529
---------------------------------------------------------------------------------
| Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |             |    68 |  2924 |  1697   (1)| 00:00:21 |
|*  1 |  TABLE ACCESS FULL| TEST_NORMAL |    68 |  2924 |  1697   (1)| 00:00:21 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("EMPNO"=1000)
Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       6218  consistent gets
       6213  physical reads
          0  redo size
        570  bytes sent via SQL*Net to client
        415  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed



After creating BITMAP index on TEST_NORMAL table explain plan
----------------------------------------------------------------------

10:35:02 SQL> create bitmap index normal_empno_bmx on priya.test_normal(empno);
Index created.
Elapsed: 00:00:55.04


10:36:13 SQL> analyze table priya.test_normal compute statistics for table for all indexes for all indexed columns;
Table analyzed.
Elapsed: 00:00:14.95
10:38:23 SQL>

10:47:38 SQL> select sum(bytes/1024/1024) segment_name from dba_segments where segment_name='TEST_NORMAL' and owner='PRIYA';
SEGMENT_NAME
------------
          49
Elapsed: 00:00:00.39

SQL> select substr(segment_name,1,30) segment_name, bytes/1024/1024 "Size in MB"
   from user_segments where segment_name in ('TEST_NORMAL','NORMAL_EMPNO_BMX');  2
SEGMENT_NAME                   Size in MB
------------------------------ ----------
NORMAL_EMPNO_BMX                       28
TEST_NORMAL                            49

SQL> select index_name, clustering_factor from user_indexes;
NORMAL_EMPNO_BMX                    1000000


You can see in the preceding table that the size of the index is 28MB and that the clustering factor is 
equal to the number of rows in the table. Now let's execute the queries with equality predicates for 
different sets of values:


SQL> set autotrace on
SQL> select * from test_normal where empno=&empno;
Enter value for empno: 1000
old   1: select * from test_normal where empno=&empno
new   1: select * from test_normal where empno=1000
     EMPNO ENAME                                 SAL
---------- ------------------------------ ----------
      1000 OWKHHLEOZCECWYNKNHBTQNKLCHJSDL       6280

Execution Plan
----------------------------------------------------------
Plan hash value: 4267925254
-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |     1 |    34 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | TEST_NORMAL      |     1 |    34 |     3   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                  |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | NORMAL_EMPNO_BMX |       |       |            |          |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("EMPNO"=1000)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        574  bytes sent via SQL*Net to client
        415  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed


************************************************************************

SQL> explain plan for
  2  select * from test_normal where empno=100;
Explained.
SQL>
SQL> @?/rdbms/admin/utlxpls.sql
PLAN_TABLE_OUTPUT
----------------------------------------------------------------
Plan hash value: 4267925254
-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |     1 |    34 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | TEST_NORMAL      |     1 |    34 |     3   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                  |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | NORMAL_EMPNO_BMX |       |       |            |          |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("EMPNO"=100)
15 rows selected.


Now creating B-Tree index scenario:
---------------------------------------------------------------------------
Step 1B (on TEST_NORMAL)

Now we will drop this bitmap index and create a B-tree index on the EMPNO column. 
As before, 
we will check for the size of the index and its clustering factor and 
execute the same queries for the same set of values, to compare the I/Os.


SQL>
SQL> drop index NORMAL_EMPNO_BMX;
Index dropped.
SQL> create index normal_empno_idx on test_normal(empno);
Index created.
SQL> analyze table test_normal compute statistics for table for all indexes for all indexed columns;

Table analyzed.
SQL> SQL>
SQL>
SQL> select substr(segment_name,1,30) segment_name, bytes/1024/1024 "Size in MB"
   from user_segments
    where segment_name in ('TEST_NORMAL','NORMAL_EMPNO_IDX');
  2    3
SEGMENT_NAME                   Size in MB
------------------------------ ----------
NORMAL_EMPNO_IDX                       18
TEST_NORMAL                            49

SQL>
SQL>
SQL> select index_name, clustering_factor from user_indexes;
INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
PK_EMP                                         1
PK_DEPT                                        1
CUSTOMER_PK                                    1
IDX_TESTIDX_ID                                 1
PK_TEST                                       93
NORMAL_EMPNO_IDX                            6210
CUST_IDX1                                      1
FK_SUPPLIER                                    0
SUPPLIER_PK                                    1
SALES_PK                                       1
SYS_C003596                                    1
11 rows selected.


It is clear in this table that the B-tree index is smaller than the bitmap index on the EMPNO column. 
The clustering factor of the B-tree index is much nearer to the number of blocks in a table; for that reason, 
the B-tree index is efficient for range predicate queries.

Now we'll run the same queries for the same set of values, using our B-tree index.

SQL> set autotrace on
SQL> select * from test_normal where empno=&empno;
Enter value for empno: 100
old   1: select * from test_normal where empno=&empno
new   1: select * from test_normal where empno=100
     EMPNO ENAME                                 SAL
---------- ------------------------------ ----------
       100 TGIREMWWAZPHEIVUEWSDKMPHECHQXQ       5455

Execution Plan
----------------------------------------------------------
Plan hash value: 1781697849
------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                  |     1 |    34 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TEST_NORMAL      |     1 |    34 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | NORMAL_EMPNO_IDX |     1 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EMPNO"=100)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          5  consistent gets
          2  physical reads
          0  redo size
        574  bytes sent via SQL*Net to client
        415  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed




SQL> @?/rdbms/admin/utlxpls.sql
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------
Plan hash value: 1781697849
------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                  |     1 |    34 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TEST_NORMAL      |     1 |    34 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | NORMAL_EMPNO_IDX |     1 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EMPNO"=100)
14 rows selected.



************************************************

SQL> select index_name, num_rows, leaf_blocks from dba_indexes where index_name ='NORMAL_EMPNO_IDX';
INDEX_NAME                       NUM_ROWS LEAF_BLOCKS
------------------------------ ---------- -----------
NORMAL_EMPNO_IDX                  1000000        2226


The CLUSTERING_FACTOR column in the USER_INDEXES view gives an indication as to how organized the data is 
compared to the indexed columns. If the value of the CLUSTERING_FACTOR 
column value is close to the number of leaf blocks in the index, the data is well ordered in the table.

If the value is not close to the number of leaf blocks in the index, then the data in the table is not well ordered. 
The leaf blocks 
of an index store the indexed values as well as the ROWIDs to which they point.


SQL> select index_name, num_rows,CLUSTERING_FACTOR,leaf_blocks from dba_indexes where index_name ='NORMAL_EMPNO_IDX';
INDEX_NAME                       NUM_ROWS CLUSTERING_FACTOR LEAF_BLOCKS
------------------------------ ---------- ----------------- -----------
NORMAL_EMPNO_IDX                  1000000              6210        2226









No comments:

Post a Comment

100 Oracle DBA Interview Questions and Answers

  Here are 100 tricky interview questions tailored for a Senior Oracle DBA role. These questions span a wide range of topics, including perf...