Disclaimer

Thursday, 28 November 2024

Understand METHOD_OPT Parameter for Stats in Oracle

 

In Oracle Database 19c, the METHOD_OPT parameter plays a crucial role in determining how histograms and extended statistics are generated during the collection of table and column statistics. 

This parameter specifies the behavior for collecting column statistics, including whether and how histograms are created for columns.




Key Points about METHOD_OPT in Oracle 19c:

  1. Default Value:

    • METHOD_OPT defaults to 'FOR ALL COLUMNS SIZE AUTO' in Oracle 19c. This allows Oracle to decide which columns require histograms based on their data distribution and usage patterns.
  2. Histograms:

    • Histograms provide detailed data distribution statistics for a column, which the optimizer uses to create better execution plans, especially for skewed data.
  3. Extended Statistics:

    • Extended statistics capture relationships between columns in a table, allowing the optimizer to better estimate cardinality when multiple columns are used in a query's predicates.


METHOD_OPT Syntax Options:

  1. FOR ALL COLUMNS SIZE AUTO (Default):

    • Oracle automatically determines which columns need histograms based on data distribution and workload.
    • Recommended for most environments.

    Example:

    EXEC DBMS_STATS.GATHER_TABLE_STATS('HR', 'EMPLOYEES', METHOD_OPT => 'FOR ALL COLUMNS SIZE AUTO');
  2. FOR ALL COLUMNS SIZE REPEAT:

    • Recreates histograms only for columns that already have them. No new histograms are created.
    • Useful when updating statistics without altering existing histogram settings.

    Example:

    EXEC DBMS_STATS.GATHER_TABLE_STATS('HR', 'EMPLOYEES', METHOD_OPT => 'FOR ALL COLUMNS SIZE REPEAT');
  3. FOR ALL COLUMNS SIZE SKEWONLY:

    • Creates histograms only for columns with skewed data distributions.
    • Suitable for tables with some skewed columns and some uniformly distributed ones.

    Example:

    EXEC DBMS_STATS.GATHER_TABLE_STATS('HR', 'EMPLOYEES', METHOD_OPT => 'FOR ALL COLUMNS SIZE SKEWONLY');
  4. FOR ALL COLUMNS SIZE integer:

    • Specifies the exact number of buckets (1 to 254 in 11g, up to 2048 in 12c and later).
    • Setting SIZE 1 disables histograms for the columns.

    Example:

    EXEC DBMS_STATS.GATHER_TABLE_STATS('HR', 'EMPLOYEES', METHOD_OPT => 'FOR ALL COLUMNS SIZE 254');
  5. FOR COLUMNS ...:

    • Collects statistics for specific columns with customized bucket settings.

    Example:

    EXEC DBMS_STATS.GATHER_TABLE_STATS('HR','EMPLOYEES', METHOD_OPT => 'FOR COLUMNS (SALARY SIZE 10, DEPARTMENT_ID SIZE 20)');




Extended Statistics:

  • Usage: When there is a correlation between two or more columns, extended statistics help Oracle generate better query plans.
  • Creation:

    EXEC DBMS_STATS.CREATE_EXTENDED_STATS('HR', 'EMPLOYEES', '(SALARY, DEPARTMENT_ID)'); EXEC DBMS_STATS.GATHER_TABLE_STATS('HR', 'EMPLOYEES');
  • Viewing Extended Statistics:

    SELECT EXTENSION_NAME, EXTENSION FROM DBA_STAT_EXTENSIONS WHERE TABLE_NAME = 'EMPLOYEES';




Modifying the Default METHOD_OPT:

  • Change the global preference for all future statistics gathering:

    EXEC DBMS_STATS.SET_GLOBAL_PREFS('METHOD_OPT', 'FOR ALL COLUMNS SIZE REPEAT'); SELECT DBMS_STATS.GET_PREFS('METHOD_OPT') FROM DUAL;




When to Use/Not Use Histograms:

Use Histograms When:

  • Columns are frequently used in WHERE clauses.
  • Data distribution is skewed.
  • Selectivity of the column has a significant impact on query performance.

Avoid Histograms When:

  • Data is uniformly distributed.
  • Columns are rarely used in queries.
  • Column data changes frequently, as histograms need frequent recomputation.


The appropriate use of the METHOD_OPT parameter ensures that statistics collection optimally supports Oracle's query optimizer, improving the performance and reliability of database applications.






Oracle Database 19c introduces two advanced statistics features that enhance the optimizer's ability to generate accurate execution plans in real-time. 

These features are particularly beneficial for dynamic or highly volatile environments but are currently exclusive to Oracle Engineered Systems, such as Exadata. Here's an overview of these features:


• Real-Time Statistics 

• High-Frequency Automatic Optimizer Statistics Collection



1. Real-Time Statistics

What it Does:

  • Real-time statistics capture dynamic changes in table data during DML operations (INSERT, UPDATE, DELETE). These statistics are immediately available to the optimizer without waiting for scheduled or manual statistics collection.


Key Benefits:

  • Ensures the optimizer has up-to-date statistics, reducing the risk of suboptimal execution plans due to stale statistics.
  • Particularly useful for environments with rapidly changing data, where traditional statistics collection may lag behind.


How it Works:

  • Statistics are updated directly in the dictionary when DML statements occur. These updates include information such as row counts, column statistics, and high-water marks.
  • Real-time statistics are used in conjunction with traditional statistics, providing a hybrid model for the optimizer.


Example Scenario:

  • A table receiving thousands of rows through real-time inserts: Real-time statistics ensure the optimizer is aware of the updated row count and can adjust execution plans accordingly.


Enabling Real-Time Statistics:

  • Enabled by default on Engineered Systems.
  • Controlled via the OPTIMIZER_REAL_TIME_STATISTICS initialization parameter.

ALTER SYSTEM SET OPTIMIZER_REAL_TIME_STATISTICS = TRUE;




2. High-Frequency Automatic Optimizer Statistics Collection


What it Does:

  • Enhances the traditional automatic statistics gathering mechanism by running at a higher frequency to keep statistics more current.
  • Targets volatile tables with frequent changes to ensure they always have fresh statistics.


Key Benefits:

  • Reduces the need for manual intervention in collecting statistics for volatile tables.
  • Improves query performance by ensuring that the optimizer has up-to-date information.


How it Works:

  • Automatically identifies frequently modified tables.
  • Collects statistics more frequently than the standard nightly maintenance window.
  • Integrates seamlessly with other optimizer features, such as dynamic sampling and real-time statistics.


Example Scenario:

  • A table used in high-frequency transaction processing, where rows are continuously inserted and updated. High-frequency statistics collection ensures accurate cardinality estimates for query execution.


Enabling High-Frequency Statistics Collection:

  • Activated automatically on Engineered Systems.
  • Controlled via the HIGHFREQ_STATS_AUTOCOLLECTION preference.
EXEC DBMS_STATS.SET_GLOBAL_PREFS('HIGHFREQ_STATS_AUTOCOLLECTION', 'TRUE');



Key Limitations:

  • Platform Restriction: Currently available only on Oracle Engineered Systems.
  • Resource Usage: These features may increase system overhead, but Engineered Systems are optimized to handle this.


No comments:

Post a Comment

How to recovery PDB when PDB database is dropped in Oracle

  How to recovery PDB when PDB database is dropped :) [oracle@rac01 ~]$ sqlplus '/as sysdba' SQL*Plus: Release 21.0.0.0.0 - Product...