Performance Tuning Levels:
Application Tuning:
- Before diving into Oracle-specific tuning, it’s crucial to check the application code. If the code is poorly written or inefficient, tuning the database will not help much.
- Optimizing the application logic might be the first step, such as optimizing loops, algorithms, or how data is processed and fetched.
Process Tuning:
- This refers to tuning various processes at the operating system or database level. It could involve adjusting memory allocation, CPU usage, and managing background processes efficiently.
- In Oracle, this may include configuring the instance's initialization parameters, managing session memory, and tuning background processes.
SQL Query Tuning:
- This focuses on optimizing individual SQL queries to improve performance.
- Once you've ruled out application code issues, Oracle-specific tuning (such as SQL query optimization) is the next step.
Common SQL Query Pitfalls and How to Avoid Them:
Using NOT EQUAL TO (
<>
):- Issue: The
<>
operator prevents Oracle from using indexes efficiently, as Oracle needs to scan more rows to find the non-matching ones. - Solution: Try to replace
<>
with more optimized comparison operators like=
,<
, or>
, whenever possible. These operators help the database engine utilize indexes more effectively.- Example: Instead of
a.emp_type <> 'P'
, you can rewrite it asa.emp_type = 'T'
if you are looking for specific values that are not equal to 'P'.
- Example: Instead of
- Issue: The
Large IN Lists:
- Issue: Using a large
IN
list in a query can prevent index usage, especially if the list contains a large number of values. - Solution: If possible, consider storing the list in a temporary table and using an
EXISTS
clause instead. This can significantly improve query performance, asEXISTS
allows Oracle to stop searching once it finds a match.- Example: Instead of
WHERE a.emp_id IN (1, 2, 3, ..., N)
, store those values in a temporary table and rewrite the query toWHERE EXISTS (SELECT 1 FROM temp_table t WHERE t.emp_id = a.emp_id)
.
- Example: Instead of
- Issue: Using a large
Using Functions in WHERE Clauses (e.g.,
SUBSTRING
):- Issue: Using functions in the
WHERE
clause (likeSUBSTRING
,TO_CHAR
,UPPER
, etc.) can prevent Oracle from using indexes. This forces a full table scan. - Solution: Avoid using functions in
WHERE
clauses. If you need to filter data based on a function, see if you can preprocess the data or store it in a computed column for better performance.- Example: Instead of
WHERE SUBSTRING(a.emp_name, 1, 3) = 'ABC'
, consider having an indexed column that stores a truncated version ofemp_name
or filter in a different way.
- Example: Instead of
- Issue: Using functions in the
Using UNION Instead of UNION ALL:
- Issue: The
UNION
operator performs an internalDISTINCT
operation to eliminate duplicate rows, which can be expensive, especially with large datasets. - Solution: If you don’t need to eliminate duplicates, use
UNION ALL
instead. It performs much faster because it doesn’t involve the overhead of sorting and deduplication. - Example: Replace
SELECT * FROM table1 UNION SELECT * FROM table2
withSELECT * FROM table1 UNION ALL SELECT * FROM table2
if you’re okay with duplicates.
When to Create a Small Table for IN List:
- If you have a query with a large
IN
list and you cannot avoid it (i.e., the values in the list are dynamic or come from an external source), creating a small table to store those values can be a very efficient solution. - Steps:
- Create a small lookup table to store the values in the
IN
list. - Rewrite your query using
EXISTS
to reference the temporary table, which allows Oracle to use more efficient operations, like a hash join, instead of a linear scan.
- Create a small lookup table to store the values in the
Example:
Then, modify your query:
This avoids scanning a large list in the IN
clause and makes the query more efficient.
Conclusion:
Optimizing SQL queries requires careful attention to common pitfalls, such as inefficient use of operators, functions, or set operations.
By understanding these best practices and rethinking your query structure, you can dramatically improve the performance of your database queries, reducing the load on Oracle and enhancing the overall application experience.
Book-Author example
Consider the two tables above: the book and the author tables.
If the information had to be printed the book table would print on
ten pages, and the author table on three.
In order to tune
queries involving these tables, it’s important to imaging how you would search
for the information if you were given the table contents printed on paper.
Consider the query:
“I need all the author
names who have authored reference books”
We can either start at the book table, or at the author table.
If we start with the author table, we can lookup the first author ID, and then look through the entire book table if he has any reference books.
The other way is to start with the book table, and when we find a reference book we can check the author name against that.
Which one is faster? The second – starting with the book
table? Why – because the author table is order by Author
ID.
Now let’s
have a look at how Oracle interprets indexes.
If we had an
index on the Book table, on the fields Author ID and Book ID, this
is how it would logically look. It can be used if we know both the Book ID and the
Author ID (have
both the fields in the where clause of the query). It can also be used if we
know just the Author
ID, but will be slightly slower (see RANGE SCAN below). However, it cannot be used if we
know just the Book
ID.
Before executing a query, Oracle creates a plan - the order in which it will
search the tables, the indexes it will use, the way it will use the indexes
etc. Step 1 in tuning is to check the plan for the SQL in question:
In order
to generate the plan, on SQL Plus use:
set autotrace on;
or if we want to just see the trace without
executing the query use:
set
autotrace traceonly explain;
Now looking at the explain plan shown above, what does it mean? The
point to note is that explain plan is actually executed from more indented to
less indented (inside to outside) and from bottom to top.
Table
access full means that the complete table needs to
be searched in order to find the information. Index range scan means that index will be used, however the filters
provided in the WHERE clause may not be ‘complete’.
Index full scan means the entire index will be searched.
Table access by index rowid means the rows are being located using an index.
Hash, Nested loops and Merge are all different types of joins. These are the most commonly seen operations in an explain plan.
Now that we know how Oracle is executing our query, the next step is
to influence the way Oracle runs the query. How to do that?
=> By creating
indexes and optimizing the join order of tables
=> By creating statistics
=> By adding hints to the query
I
have already discussed indexes. An example of how to create an index:
CREATE INDEX book_author_index ON book (book_id, author_id);
Statistics refers to the information that Oracle can store about tables and fields on your request that will help it to make more informed decisions regarding the plan.
For example, let us say that purchase order table P has a field STATUS which can be ‘O’ (Open) or ‘C’ (Close).
If you give Oracle a query with this field in the WHERE clause – it will assume that O & C values are equally distributed: 50% each. In truth this may not be the case: O values maybe present only in 10% of the rows.
Such information being calculated offline, stored and used during
plan making is called statistics.
To create statistics use something like:
EXEC DBMS_STATS.gather_table_stats('SYSADM', 'BOOK');
A hint is written
as part of the query, but doesn’t
affect the output.
It only affects the plan.
For example:
SELECT /*+ INDEX(A IDX_PURCH_ORD) */ * FROM PURCH_ORD A
WHERE STATUS=’O’
The hint is written within comments, but with a plus sign to
indicate that it’s a hint, not a comment. This
hint tells Oracle to use the index IDX_PURCH_ORD to lookup the
table PURCH_ORD. Most of the time Oracle
automatically uses the correct index, however is certain ‘tough’ situations it
needs help.
One of the most useful hints that I have found is the LEADING hint. This hint tells
Oracle to start with a particular table from the tables used in the query.
For example:
SELECT /*+ LEADING(i) */ po_id, status FROM PURCH_ORD p, INVOICE i
WHERE invoice=’123456’ AND i.po_id=p.po_id
In this query,
the LEADING hint tells Oracle to start with the INVOICE table, find the PO_ID for the invoice, and use that to
lookup the STATUS in the p table.
Now coming to process tuning.
We already know how to tune an SQL query.
The problem that remains now is to find out which query is doing badly, from among all the queries being executed by the process.
If the process is small / simple – a simple inspection of the code will give you an idea of the problem SQL.
In a slightly larger SQL you may be able to add some code to display the current time before and after each query.
However, if that is not possible, or the SQL is very large – we need some automated tracing to find the SQL.
One good way to do this is through the 10046 trace.
To perform this trace the process should execute the following SQL statement as the first step:
ALTER SESSION SET EVENTS '10046 trace name context forever, level 12'
Once the process completes, the DBA should be able to check the trace output, and run it through tkprof to give you an HTML report.
This report gives you a list of the worst queries, the corresponding plans and the waiting times for those queries.
It’s a wealth of information to pinpoint the problem
SQLs and start tuning them.
If the DBA gives you a trace file (from the udump folder on the
database server) and not a report,
you can use the trace file to generate the report yourself as below:
tkprof
<trace_file> <output_file> sort=fchela
A quick cheatsheet:
*************************
When you get a query to tune, what are the things we can quickly look at to get going?
Here is a quick cheatsheet:
Try to limit the output.
*****************************
For example, if we want a list of open orders, we may not be interested in orders from more than 2 years back.
Hence, if the order_date is an indexed column,
we can add:
order_date >= SYSDATE-700
=> Have a look at the indexes being used. The index usage may not be optimal.
=> Have a look at the join order of the tables.
As a rule, Oracle finds it hard to come up with a good join order where the number of tables is 4 or more.
Use the LEADING hint if needed, or the USE_NL hint.
=> If the query uses multiple business units such as:
business_unit IN (‘NL100’,’NL200’,’NL300’)
it may be faster to run the query separately, once for each business unit.
If it must run as a single query, explore connecting the three separate queries through UNION ALL
=> If its being run for multiple FISCAL_YEARs, the same applies as business unit above
=> It we are joining the tables (or self join), it is important to use all the keys in the join.
For example if a table t1 has three keys:
t1.key1=t2.key1 AND t1.key2=t2.key2
This may solve your purpose for the moment
(the two keys may be unique in themselves given the scenario), but it will be inefficient.
Additional suggestions:
-
Start out with
low hanging fruit
-
Go after
the FULL SCANs
If you have a system that doesn’t perform very well, and the previous step has already been completed, then go after the FULL SCANs.
Pick up queries that do FULL SCAN one by one, and tune them.
The timing different may not be much, even after removing the FULL SCAN, but the overall impact on the system will be huge, due to the much reduced IO operations.
The
disk is the slowest part of the chain, remember. To check the FULL SCANS going
on now, run the SQL
SELECT target, a.sid, time_remaining, b.username, b.program, opname FROM v$session_longops a, active b WHERE time_remaining>120 AND a.sid=b.sid(+)
(you may need to request access to these tables)
-
Reschedule activity
This is a workaround in cases where performance of the query itself
is difficult to improve. If a job with a bad impact on performance can wait
till the weekend, better go that route. If it can run during
a time when the system is underutilized go for it. If two IO intensive jobs can run sequentially rather
than parallely go for it.
On those same lines, if there are two jobs that run almost
simultaneously, it may be prudent to move one of them five minutes above or
below the other.
-
Deadlocks
Deadlocks are the last bane of performance tuning in a production
system. Oracle takes some time in detecting deadlocks, and killing the
offending process. During this timeframe your database queue can spiral.
These are quite easy to fix: the offending queries are present in the database log. Hence, pick up the queries, search the processes to find where these are running and redesign suitably. Deadlock will go away if the processes do not try to process the same set of rows at the same time.
No comments:
Post a Comment