Disclaimer

Sunday, 4 May 2025

SQL Query tuning in Oracle database

 

Here are practical SQL query tuning and rewriting examples for Oracle Database. These cover common small improvements that can significantly boost performance.



🔸 1. Avoid SELECT *

Bad:


SELECT * FROM employees WHERE department_id = 10;

Better:


SELECT first_name, last_name, salary FROM employees WHERE department_id = 10;

Reduces I/O and memory usage by fetching only needed columns.




🔸 2. Use EXISTS Instead of IN (for subqueries)

Bad:


SELECT * FROM employees WHERE department_id IN (SELECT department_id FROM departments WHERE location_id = 1700);

Better:


SELECT * FROM employees e WHERE EXISTS (SELECT 1 FROM departments d WHERE d.department_id = e.department_id AND d.location_id = 1700);

EXISTS stops at first match, IN evaluates all rows.




🔸 3. Use JOIN Instead of Subqueries (where appropriate)

Bad:

SELECT employee_id, (SELECT department_name FROM departments d WHERE d.department_id = e.department_id)
FROM employees e;


Better:


SELECT e.employee_id, d.department_name FROM employees e JOIN departments d ON e.department_id = d.department_id;

Joins are usually faster and optimizer-friendly.




🔸 4. Use BIND Variables

Bad:


SELECT * FROM employees WHERE department_id = 10;

Better (PL/SQL or apps):


SELECT * FROM employees WHERE department_id = :dept_id;

Enables SQL reuse and avoids hard parsing.




🔸 5. Create Indexes on Filter Columns


CREATE INDEX emp_dept_idx ON employees(department_id);

Improves performance for queries filtering on department_id.




🔸 6. Use ANALYZE or DBMS_STATS to Keep Stats Updated


EXEC DBMS_STATS.GATHER_TABLE_STATS('HR', 'EMPLOYEES');

Helps optimizer choose the best plan.




🔸 7. Use WHERE Clause to Limit Rows

Bad:


SELECT first_name FROM employees;


SELECT first_name FROM employees WHERE department_id = 10;

Reduces full table scans.




🔸 8. Avoid Functions on Indexed Columns

Bad:


SELECT * FROM employees WHERE TO_CHAR(hire_date, 'YYYY') = '2020';

Better:


SELECT * FROM employees WHERE hire_date BETWEEN TO_DATE('2020-01-01','YYYY-MM-DD') AND TO_DATE('2020-12-31','YYYY-MM-DD');

Allows use of index on hire_date.




🔸 9. Use UNION ALL Instead of UNION (if duplicates don’t matter)


SELECT first_name FROM employees UNION SELECT first_name FROM contractors;

Better:

SELECT first_name FROM employees
UNION ALL SELECT first_name FROM contractors;

UNION sorts and removes duplicates, UNION ALL is faster.




🔸 10. Use WITH Clause for Reused Subqueries


WITH dept_data AS ( SELECT department_id FROM departments WHERE location_id = 1700 ) SELECT * FROM employees WHERE department_id IN (SELECT department_id FROM dept_data);

Improves readability and can boost performance for complex queries.






🔹 11. Avoid Starting with Wildcard in LIKE

Bad (causes full table scan):


SELECT * FROM employees WHERE first_name LIKE '%john';

Better:


SELECT * FROM employees WHERE first_name LIKE 'john%';

Index can be used if the pattern doesn’t start with %.




🔹 12. Use IN for Static Filters Instead of Multiple ORs

Bad:


SELECT * FROM employees WHERE department_id = 10 OR department_id = 20 OR department_id = 30;

Better:


SELECT * FROM employees WHERE department_id IN (10, 20, 30);

Simplifies plan and improves performance.




🔹 13. Avoid Sorting Unless Needed (Reduce ORDER BY)

Bad:


SELECT * FROM employees ORDER BY hire_date;

Better (only if sorting is not required):


SELECT * FROM employees;

ORDER BY requires sorting and extra CPU/Temp space.




🔹 14. Avoid Full Table Scans by Using Indexes

Bad (no filter):

SELECT * FROM orders;

Better:


SELECT * FROM orders WHERE order_status = 'OPEN';

Use filters to take advantage of indexes and reduce I/O.




🔹 15. Use GROUP BY with Selective Filtering

Bad:

SELECT department_id, COUNT(*) FROM employees GROUP BY department_id;

Better:

SELECT department_id, COUNT(*)
FROM employees WHERE status = 'ACTIVE' GROUP BY department_id;

Filtering before grouping reduces rows and resource usage.





🔹 16. Avoid Functions on Indexed Columns (Again)

Bad:

SELECT * FROM employees WHERE UPPER(last_name) = 'SMITH';

Better:

  • Create a function-based index:


CREATE INDEX emp_lastname_upper_idx ON employees(UPPER(last_name));

Then:

SELECT * FROM employees WHERE UPPER(last_name) = 'SMITH';

Allows index usage on derived column.




🔹 17. Use Inline Views to Pre-Filter


SELECT dept_name, emp_count FROM ( SELECT d.department_name AS dept_name, COUNT(e.employee_id) AS emp_count FROM employees e JOIN departments d ON e.department_id = d.department_id WHERE e.status = 'ACTIVE' GROUP BY d.department_name ) WHERE emp_count > 5;

Reduces processing in outer query.




🔹18. Use Optimizer Hints Where Needed


SELECT /*+ INDEX(employees emp_status_idx) */ * FROM employees WHERE status = 'ACTIVE';

Forces use of desired index when optimizer chooses wrong path.



🔹 19. Avoid NOT IN with NULLs

Bad:

SELECT * FROM employees WHERE department_id NOT IN (10, 20, NULL);

Better:

SELECT * FROM employees WHERE department_id NOT IN (10, 20);

NULL in NOT IN returns no rows due to unknown comparison.




🔹 20. Use Analytical Functions Instead of Correlated Subqueries

Bad:

SELECT e.*,
(SELECT MAX(salary) FROM employees WHERE department_id = e.department_id) AS max_sal FROM employees e;

Better:

SELECT e.*, MAX(salary) OVER (PARTITION BY department_id) AS max_sal
FROM employees e;

Reduces repetitive subquery execution using analytic window functions.




🔹 21. Use MERGE Instead of Separate INSERT and UPDATE


MERGE INTO target_table t USING source_table s ON (t.id = s.id) WHEN MATCHED THEN UPDATE SET t.name = s.name WHEN NOT MATCHED THEN INSERT (id, name) VALUES (s.id, s.name);

Efficient for upserts.




🔹 22. Use FETCH FIRST Instead of ROWNUM (12c+)

Old way:


SELECT * FROM employees WHERE department_id = 10 AND ROWNUM <= 5;

New way:

SELECT * FROM employees WHERE department_id = 10 FETCH FIRST 5 ROWS ONLY;

More readable and optimizer-aware.




🔹 23. Avoid DISTINCT Unless Needed

Bad:

SELECT DISTINCT department_id FROM employees;

Better:

SELECT department_id FROM employees GROUP BY department_id;

GROUP BY can be more efficient in certain queries.





🔸 24. Avoid SELECT *, Specify Columns

Bad:


SELECT * FROM employees WHERE department_id = 10;

Better:


SELECT first_name, last_name FROM employees WHERE department_id = 10;

Reduces I/O by fetching only needed columns.




🔸 25. Join on Indexed Columns

Bad (no index):


SELECT * FROM orders o JOIN customers c ON o.customer_name = c.customer_name;

Better:


SELECT * FROM orders o JOIN customers c ON o.customer_id = c.customer_id;

Use join keys with indexes for faster lookups.




🔸 26. Push Filters Inside Views

Bad:


SELECT * FROM (SELECT * FROM employees) WHERE department_id = 10;

Better:


SELECT * FROM (SELECT * FROM employees WHERE department_id = 10);

Push predicates down into views or inline subqueries.



🔸 27. Use EXISTS Instead of IN for Subqueries

Bad:


SELECT * FROM employees WHERE department_id IN (SELECT department_id FROM departments);

Better:


SELECT * FROM employees e WHERE EXISTS (SELECT 1 FROM departments d WHERE d.department_id = e.department_id);

EXISTS stops scanning once a match is found, unlike IN.




🔸 28. Avoid Scalar Subqueries in SELECT

Bad:

SELECT e.name,
(SELECT COUNT(*) FROM tasks t WHERE t.employee_id = e.id) AS task_count FROM employees e;

Better:


SELECT e.name, COUNT(t.task_id) AS task_count FROM employees e LEFT JOIN tasks t ON e.id = t.employee_id GROUP BY e.name;

Avoids repeated subquery execution for each row.




🔸 29. Use WITH Clause for Repeated Subqueries

Without WITH:


SELECT COUNT(*) FROM (SELECT * FROM orders WHERE status = 'OPEN'); SELECT MAX(order_date) FROM (SELECT * FROM orders WHERE status = 'OPEN');

Better (WITH clause):


WITH open_orders AS ( SELECT * FROM orders WHERE status = 'OPEN' ) SELECT COUNT(*) FROM open_orders; SELECT MAX(order_date) FROM open_orders;

Prevents re-evaluation of subquery.




🔸 30. Use Bitmap Indexes for Low Cardinality Columns

If status has few values:


CREATE BITMAP INDEX emp_status_bix ON employees(status);

Bitmap indexes work well for values like 'ACTIVE', 'INACTIVE'.




🔸 31. Avoid Implicit Data Type Conversion

Bad:

SELECT * FROM employees WHERE employee_id = '100'; -- employee_id is NUMBER

Better:


SELECT * FROM employees WHERE employee_id = 100;

Avoids implicit conversion and allows index usage.




🔸 32. Use Partition Pruning

If sales is partitioned by month:


SELECT * FROM sales WHERE sale_month = 'JAN-2024';

Ensures Oracle scans only that partition.




🔸 33. Avoid Nested Loops on Large Datasets

Check execution plan. If both tables are large, prefer hash join:


SELECT /*+ USE_HASH(e d) */ * FROM employees e JOIN departments d ON e.dept_id = d.dept_id;

Hash joins scale better for large sets.




🔸 34. Use DECODE or CASE for Conditional Aggregation


SELECT department_id, COUNT(CASE WHEN gender = 'M' THEN 1 END) AS male_count, COUNT(CASE WHEN gender = 'F' THEN 1 END) AS female_count FROM employees GROUP BY department_id;

Efficiently replaces multiple queries with a single grouped result.



🔸 35. Use TRUNC() Smartly for Date Comparisons

Bad:


SELECT * FROM sales WHERE TRUNC(sale_date) = TO_DATE('2024-01-01', 'YYYY-MM-DD');

Better:


SELECT * FROM sales WHERE sale_date BETWEEN TO_DATE('2024-01-01','YYYY-MM-DD') AND TO_DATE('2024-01-01 23:59:59','YYYY-MM-DD HH24:MI:SS');

Avoids function on column and allows index use.








No comments:

Post a Comment

Oracle Exadata

  Hardware Architecture:- Oracle Exadata Database Machine employs a modern scale-out architecture encompassing database compute, storage, ne...