Disclaimer

Sunday, 4 May 2025

SQL Query tuning in Oracle database

 

Here are practical SQL query tuning and rewriting examples for Oracle Database. These cover common small improvements that can significantly boost performance.



🔸 1. Avoid SELECT *

Bad:


SELECT * FROM employees WHERE department_id = 10;

Better:


SELECT first_name, last_name, salary FROM employees WHERE department_id = 10;

Reduces I/O and memory usage by fetching only needed columns.




🔸 2. Use EXISTS Instead of IN (for subqueries)

Bad:


SELECT * FROM employees WHERE department_id IN (SELECT department_id FROM departments WHERE location_id = 1700);

Better:


SELECT * FROM employees e WHERE EXISTS (SELECT 1 FROM departments d WHERE d.department_id = e.department_id AND d.location_id = 1700);

EXISTS stops at first match, IN evaluates all rows.




🔸 3. Use JOIN Instead of Subqueries (where appropriate)

Bad:

SELECT employee_id, (SELECT department_name FROM departments d WHERE d.department_id = e.department_id)
FROM employees e;


Better:


SELECT e.employee_id, d.department_name FROM employees e JOIN departments d ON e.department_id = d.department_id;

Joins are usually faster and optimizer-friendly.




🔸 4. Use BIND Variables

Bad:


SELECT * FROM employees WHERE department_id = 10;

Better (PL/SQL or apps):


SELECT * FROM employees WHERE department_id = :dept_id;

Enables SQL reuse and avoids hard parsing.




🔸 5. Create Indexes on Filter Columns


CREATE INDEX emp_dept_idx ON employees(department_id);

Improves performance for queries filtering on department_id.




🔸 6. Use ANALYZE or DBMS_STATS to Keep Stats Updated


EXEC DBMS_STATS.GATHER_TABLE_STATS('HR', 'EMPLOYEES');

Helps optimizer choose the best plan.




🔸 7. Use WHERE Clause to Limit Rows

Bad:


SELECT first_name FROM employees;


SELECT first_name FROM employees WHERE department_id = 10;

Reduces full table scans.




🔸 8. Avoid Functions on Indexed Columns

Bad:


SELECT * FROM employees WHERE TO_CHAR(hire_date, 'YYYY') = '2020';

Better:


SELECT * FROM employees WHERE hire_date BETWEEN TO_DATE('2020-01-01','YYYY-MM-DD') AND TO_DATE('2020-12-31','YYYY-MM-DD');

Allows use of index on hire_date.




🔸 9. Use UNION ALL Instead of UNION (if duplicates don’t matter)


SELECT first_name FROM employees UNION SELECT first_name FROM contractors;

Better:

SELECT first_name FROM employees
UNION ALL SELECT first_name FROM contractors;

UNION sorts and removes duplicates, UNION ALL is faster.




🔸 10. Use WITH Clause for Reused Subqueries


WITH dept_data AS ( SELECT department_id FROM departments WHERE location_id = 1700 ) SELECT * FROM employees WHERE department_id IN (SELECT department_id FROM dept_data);

Improves readability and can boost performance for complex queries.






🔹 11. Avoid Starting with Wildcard in LIKE

Bad (causes full table scan):


SELECT * FROM employees WHERE first_name LIKE '%john';

Better:


SELECT * FROM employees WHERE first_name LIKE 'john%';

Index can be used if the pattern doesn’t start with %.




🔹 12. Use IN for Static Filters Instead of Multiple ORs

Bad:


SELECT * FROM employees WHERE department_id = 10 OR department_id = 20 OR department_id = 30;

Better:


SELECT * FROM employees WHERE department_id IN (10, 20, 30);

Simplifies plan and improves performance.




🔹 13. Avoid Sorting Unless Needed (Reduce ORDER BY)

Bad:


SELECT * FROM employees ORDER BY hire_date;

Better (only if sorting is not required):


SELECT * FROM employees;

ORDER BY requires sorting and extra CPU/Temp space.




🔹 14. Avoid Full Table Scans by Using Indexes

Bad (no filter):

SELECT * FROM orders;

Better:


SELECT * FROM orders WHERE order_status = 'OPEN';

Use filters to take advantage of indexes and reduce I/O.




🔹 15. Use GROUP BY with Selective Filtering

Bad:

SELECT department_id, COUNT(*) FROM employees GROUP BY department_id;

Better:

SELECT department_id, COUNT(*)
FROM employees WHERE status = 'ACTIVE' GROUP BY department_id;

Filtering before grouping reduces rows and resource usage.





🔹 16. Avoid Functions on Indexed Columns (Again)

Bad:

SELECT * FROM employees WHERE UPPER(last_name) = 'SMITH';

Better:

  • Create a function-based index:


CREATE INDEX emp_lastname_upper_idx ON employees(UPPER(last_name));

Then:

SELECT * FROM employees WHERE UPPER(last_name) = 'SMITH';

Allows index usage on derived column.




🔹 17. Use Inline Views to Pre-Filter


SELECT dept_name, emp_count FROM ( SELECT d.department_name AS dept_name, COUNT(e.employee_id) AS emp_count FROM employees e JOIN departments d ON e.department_id = d.department_id WHERE e.status = 'ACTIVE' GROUP BY d.department_name ) WHERE emp_count > 5;

Reduces processing in outer query.




🔹18. Use Optimizer Hints Where Needed


SELECT /*+ INDEX(employees emp_status_idx) */ * FROM employees WHERE status = 'ACTIVE';

Forces use of desired index when optimizer chooses wrong path.



🔹 19. Avoid NOT IN with NULLs

Bad:

SELECT * FROM employees WHERE department_id NOT IN (10, 20, NULL);

Better:

SELECT * FROM employees WHERE department_id NOT IN (10, 20);

NULL in NOT IN returns no rows due to unknown comparison.




🔹 20. Use Analytical Functions Instead of Correlated Subqueries

Bad:

SELECT e.*,
(SELECT MAX(salary) FROM employees WHERE department_id = e.department_id) AS max_sal FROM employees e;

Better:

SELECT e.*, MAX(salary) OVER (PARTITION BY department_id) AS max_sal
FROM employees e;

Reduces repetitive subquery execution using analytic window functions.




🔹 21. Use MERGE Instead of Separate INSERT and UPDATE


MERGE INTO target_table t USING source_table s ON (t.id = s.id) WHEN MATCHED THEN UPDATE SET t.name = s.name WHEN NOT MATCHED THEN INSERT (id, name) VALUES (s.id, s.name);

Efficient for upserts.




🔹 22. Use FETCH FIRST Instead of ROWNUM (12c+)

Old way:


SELECT * FROM employees WHERE department_id = 10 AND ROWNUM <= 5;

New way:

SELECT * FROM employees WHERE department_id = 10 FETCH FIRST 5 ROWS ONLY;

More readable and optimizer-aware.




🔹 23. Avoid DISTINCT Unless Needed

Bad:

SELECT DISTINCT department_id FROM employees;

Better:

SELECT department_id FROM employees GROUP BY department_id;

GROUP BY can be more efficient in certain queries.





🔸 24. Avoid SELECT *, Specify Columns

Bad:


SELECT * FROM employees WHERE department_id = 10;

Better:


SELECT first_name, last_name FROM employees WHERE department_id = 10;

Reduces I/O by fetching only needed columns.




🔸 25. Join on Indexed Columns

Bad (no index):


SELECT * FROM orders o JOIN customers c ON o.customer_name = c.customer_name;

Better:


SELECT * FROM orders o JOIN customers c ON o.customer_id = c.customer_id;

Use join keys with indexes for faster lookups.




🔸 26. Push Filters Inside Views

Bad:


SELECT * FROM (SELECT * FROM employees) WHERE department_id = 10;

Better:


SELECT * FROM (SELECT * FROM employees WHERE department_id = 10);

Push predicates down into views or inline subqueries.



🔸 27. Use EXISTS Instead of IN for Subqueries

Bad:


SELECT * FROM employees WHERE department_id IN (SELECT department_id FROM departments);

Better:


SELECT * FROM employees e WHERE EXISTS (SELECT 1 FROM departments d WHERE d.department_id = e.department_id);

EXISTS stops scanning once a match is found, unlike IN.




🔸 28. Avoid Scalar Subqueries in SELECT

Bad:

SELECT e.name,
(SELECT COUNT(*) FROM tasks t WHERE t.employee_id = e.id) AS task_count FROM employees e;

Better:


SELECT e.name, COUNT(t.task_id) AS task_count FROM employees e LEFT JOIN tasks t ON e.id = t.employee_id GROUP BY e.name;

Avoids repeated subquery execution for each row.




🔸 29. Use WITH Clause for Repeated Subqueries

Without WITH:


SELECT COUNT(*) FROM (SELECT * FROM orders WHERE status = 'OPEN'); SELECT MAX(order_date) FROM (SELECT * FROM orders WHERE status = 'OPEN');

Better (WITH clause):


WITH open_orders AS ( SELECT * FROM orders WHERE status = 'OPEN' ) SELECT COUNT(*) FROM open_orders; SELECT MAX(order_date) FROM open_orders;

Prevents re-evaluation of subquery.




🔸 30. Use Bitmap Indexes for Low Cardinality Columns

If status has few values:


CREATE BITMAP INDEX emp_status_bix ON employees(status);

Bitmap indexes work well for values like 'ACTIVE', 'INACTIVE'.




🔸 31. Avoid Implicit Data Type Conversion

Bad:

SELECT * FROM employees WHERE employee_id = '100'; -- employee_id is NUMBER

Better:


SELECT * FROM employees WHERE employee_id = 100;

Avoids implicit conversion and allows index usage.




🔸 32. Use Partition Pruning

If sales is partitioned by month:


SELECT * FROM sales WHERE sale_month = 'JAN-2024';

Ensures Oracle scans only that partition.




🔸 33. Avoid Nested Loops on Large Datasets

Check execution plan. If both tables are large, prefer hash join:


SELECT /*+ USE_HASH(e d) */ * FROM employees e JOIN departments d ON e.dept_id = d.dept_id;

Hash joins scale better for large sets.




🔸 34. Use DECODE or CASE for Conditional Aggregation


SELECT department_id, COUNT(CASE WHEN gender = 'M' THEN 1 END) AS male_count, COUNT(CASE WHEN gender = 'F' THEN 1 END) AS female_count FROM employees GROUP BY department_id;

Efficiently replaces multiple queries with a single grouped result.



🔸 35. Use TRUNC() Smartly for Date Comparisons

Bad:


SELECT * FROM sales WHERE TRUNC(sale_date) = TO_DATE('2024-01-01', 'YYYY-MM-DD');

Better:


SELECT * FROM sales WHERE sale_date BETWEEN TO_DATE('2024-01-01','YYYY-MM-DD') AND TO_DATE('2024-01-01 23:59:59','YYYY-MM-DD HH24:MI:SS');

Avoids function on column and allows index use.








No comments:

Post a Comment

Understanding SQL Plan Baselines in Oracle Database

  Understanding SQL Plan Baselines in Oracle Database SQL Plan Baseline is the feature in Oracle started from Database 11g that helps to pre...