Disclaimer

Sunday, 5 September 2021

Reading Explain Plan in smarter way: Access vs Filter

 

Reading Explain Plan in smarter way: Access vs Filter





Today I will try to explain “Predicate Information” section of query explain plan. We know that explain plan can be used (or many times used as a primary way) to tune SQL queries. So today, we will discuss very important part of it, “Predicate Information”. Especially, what is Filter, what is Access, which is better and how to use that information in order to tune our slow running query.

Before we start, let’s brush up our knowledge about Query Execution Plan.
Whenever I use explain plan to tune queries I mainly focus on below 2 portions of it (3rd and 4th portion is also important but let’s keep those aside for this post):

1. PLAN Table:
 that cryptic and kind of intimidating thing in explain plan. This table in-general should have below columns:

a. Id: It’s a line number for data present in plan table. Just keep in mind, whenever there’s an asterisk ‘*’ in front of any line, there will be extra information about that line in Predicate Information section.

b. Operation: It is a step in a query execution plan. It is an action which cost based optimizer will take in order to produce result of our query. It can be full table scan, index range scan, hash join etc.

c. Name: It’s a name of an object which CBO will access during executing the query. It can be the name of a table, view, index etc.

d. Rows: Row (or cardinality) is number of rows which CBO thinks that step will return. E.g. If a line in explain plan has operation as full table scan, name as Employees and rows as 40 then we can say that CBO will access Employees table (why table? because operation says full “table” scan) as a full table scan method and it thinks that it will get 40 rows after doing this.

e. Bytes: Amount of data in bytes (data volume in bytes, the we have Kb, MB, GB etc.)

f. Cost: We can say cost is Oracle’s unit to measure efforts required in any step. Cost is a very complex thing to understand but we can say it’s sum of efforts (like IO. network, CPU etc.) CBO thinks it will have to put in particular step. Best way to use cost to our advantage is: lesser the number, better it is ðŸ˜‰

g. Temp: Sometime we see Temp column in our explain plan, it’s a red signalWe want our query’s all operations (like sorting, grouping, joins etc.) to get complete in memory (i..e. in PGA) only but whenever PGA size is less (or intermediate data is more) CBO thinks it will have to use Temp tablespace, i.e. IO, which is not good as IO devices are generally slower.

2. Predicate Information section: It gives additional information about steps marked with asterisk in plan table. Information of this section generally starts with line number, followed by either Access or Filter.

Ohh, how can I forget main thing: Explain plan is read inside out i.e. innermost line (or operation) as per indentation gets executed first (didn’t understand? no worries, we have that example in this post, chill!!!!).

I use exps script to generate query’s explain plan (which you can download from here), It’s nothing but below statement:
select * from table(dbms_xplan.display(‘PLAN_TABLE’,NULL,’ADVANCED +peeked_binds -projection -outline -alias’));

Let’s talk in-depth about Predicate Information section before we execute test cases.


Whenever we see Access in this section, it tells us that, CBO can directly get those rows with the help of index, it won’t have to perform any useless efforts during this process. Access simply means, CBO will directly get required rows with the help of index (i.e. no waste).
Whenever we see Filter in this section, it tells us that, while accessing rows, CBO will have to put some unwanted efforts i.e. it can’t directly go to desired rows. CBO will first have to select bunch of rows (either through index or through table scan) and then it will have to throw away unwanted rows in order to get rows of interest (full of waste).

Best thing to say about Access vs Filter is: Access operation is better than Filter. We should always check that we will get access operation (or try to convert access into filter, I will show that in next few steps).

Long story short: We should avoid filter operation (or there’s an opportunity to make query faster whenever we see filter operation).

Let’s create a table for exercise purpose:

SQL> @date

SYSDATE
-------------------
2018-11-06 12:10:51

SQL> create table amol_dba_objects as select * from dba_objects;

Table created.

SQL> select count(*) from amol_dba_objects;

COUNT(*)
----------
91627

(Remember I’m not using dynamic_sampling nor stats for tables are updated, we will discuss about these topics in future posts. These things are not that much relative in this post)

Case 1: No predicate information section
As you can see, there’s no “where” clause in the query so we are not seeing “predicate information” section in explain plan.
Just to add, no matter how many indexes we have on this table, we won’t see index access in query explain plan as there’s no “where” clause, makes sense right?


explain_plan_without_predicate_information

(By the way, that @epf contains “explain plan for “, in another words you can say Amol is lazy)


Case 2: Explain plan with predicate information section

Now, I have added “where” clause in my query.
Please note below things about this explain plan:
– There’s an asterisk in from of Id 1, which suggests there will be additional information about this step in predicate information section.
– Predicate information section is visible now.
– In predicate information section, we are seeing “filter” operation along with actual condition of the query.

explain_plan_with_predicate_information.png

As I explained earlier, filter operation means, CBO will fetch unwanted rows (either from index,if it’s index scan or from table, if its full table scan). Which is a type of waste/useless efforts because in this case, CBO is taking all rows of a table and then filtering out rows which are not matching the condition “owner=HR”  meaning CBO is first requesting required+non-required rows and then removing non-required rows. What I mean to say here is: whenever CBO asks for extra rows than required, it has to filter out that waste and it results into slowness of a query. Why CBO gets non-desired rows???as it doesn’t have index to go to exact rows, so we should try to turn filter into access.
Way to turn filter operation into access is: creating index (or optimal index).

But remember, creating index is one time activity, it will (or may) improve performance of your (few) select (or DML containing select) statements but will degrade performance of almost every DML (insert, update, delete) operations on that table, which could be a disaster for transaction processing systems. So I always try to avoid creating new index (or use it as a very last resort) but for demonstration purpose we will create index on this table (as it’s an only way to convert filter to access).

So let’s create an index on column referred in “where” condition:







Now, let’s take explain plan for same query and analyze the results:
explain_plan_with_index.png
We can see:
– Our newly created index “AMOL_DBA_OBJECTS_I1” is getting picked up by CBO.
– Filter condition has turned into Access condition. That was the reason behind index creation.
– Cost of the query has reduced from 429 to 98!!!!!

Now we can say, CBO is directly requesting (or going at) the rowids which are required with the help of index. As CBO is directly going at required rowids (thanks to newly created index), after index access there’s table access and no wastage of efforts so no rows filtering. This results into faster query performance.


Case 3: Predicate information section with both Access and Filter:

As you are aware, we have created an index on “owner” column of table AMOL_DBA_OBJECTS, so whenever owner column comes in query (in “where” clause), our index should (yes index may or may not get used, why so? I will write a post about it afterwards) get used. So let’s assume it will get used for ow. But what about a column which is not there in index but which is referred in where clause? Will it come under Access or Filter?
explain_plan_with_access_and_filter.png

As we have added additional condition (object_type=’INDEX’) in query, we can see filter criteria in predicate information and also 2 asterisks, 1 in-front of index and other in-front of index. In this case below things are happening:
CBO will access index first (why index first? because index range scan step is the innermost step in the explain plan). After accessing the index, CBO will get bunch of rowids but those rowids are pointing to rows which are not satisfying criteria object_type=’INDEX’ (i.e. that result set contains rows which are not satisfying this condition, some impurities). So from index CBO will go to table and there CBO will get rows then CBO will filter those rows with condition object_type=’INDEX’ to produce final result.

PFB pictorial representation for the same:

Access+Filter_pictorial_representation.png

So, how to resolve this? Create a new index? or add column in newly created index? or keep this query as it is?
I will answer these questions in my next post.
Till then, take care, happy tuning guys ðŸ™‚



No comments:

Post a Comment

How to recovery PDB when PDB database is dropped in Oracle

  How to recovery PDB when PDB database is dropped :) [oracle@rac01 ~]$ sqlplus '/as sysdba' SQL*Plus: Release 21.0.0.0.0 - Product...