Disclaimer

Thursday, 28 November 2024

What Are Histograms?

 


In Oracle Database 19c, histograms are created during statistics collection to help the optimizer make better decisions about query plans, especially when column data is skewed (some values occur much more often than others). 


Here's a explanation:




What Are Histograms?

  • Histograms are like charts that show how data is spread in a column.
  • They help Oracle figure out if some values are more common than others, which is important for estimating how many rows a query will return.



Types of Histograms in 19c

  1. Frequency Histogram:

    • Created when there are a few distinct values in the column (e.g., "Male" and "Female").
    • Shows exact counts for each value.
    • Best for categorical data (e.g., statuses like "Active," "Inactive").
  2. Height-Balanced Histogram:

    • Created when there are many distinct values (e.g., customer IDs).
    • Splits the data into equal-sized buckets (ranges) and shows how many values fall into each bucket.
    • Useful for numeric or range-based data.
  3. Top-Frequency Histogram (New in 19c):

    • Focuses on the most common values and groups the rest into a single "others" bucket.
    • Ideal for skewed data where only a few values occur frequently.
  4. Hybrid Histogram:

    • Combines features of frequency and height-balanced histograms.
    • Used when the column has both skewed data and many distinct values.
    • Gives the optimizer a more accurate picture for complex scenarios.



When Are Histograms Created?

Histograms are generated during statistics collection, either:

  • Automatically during the nightly maintenance window if AUTO_SAMPLE_SIZE is used.
  • Manually, when you collect statistics on a table or column.




What Controls Histogram Creation?

The METHOD_OPT parameter in DBMS_STATS determines:

  1. Whether histograms are created.
  2. How they are created.

Example Settings for METHOD_OPT:

  • FOR ALL COLUMNS SIZE AUTO (default):
    • Oracle decides which columns need histograms and the type (Frequency, Height-Balanced, Hybrid, etc.).
  • FOR ALL COLUMNS SIZE 254:
    • Creates histograms for all columns with up to 254 buckets.
  • FOR COLUMNS SIZE AUTO (col_name):
    • Only create histograms for specific columns.



How to Check or Set METHOD_OPT?

You can check or set the parameter during statistics collection:


-- Check the current default SELECT DBMS_STATS.GET_PREFS('METHOD_OPT') FROM DUAL; -- Set for specific columns BEGIN DBMS_STATS.GATHER_TABLE_STATS( ownname => 'SCHEMA_NAME', tabname => 'TABLE_NAME', method_opt => 'FOR COLUMNS SIZE AUTO (COLUMN_NAME)' ); END; / -- Set globally EXEC DBMS_STATS.SET_GLOBAL_PREFS('METHOD_OPT', 'FOR ALL COLUMNS SIZE AUTO');




Key Enhancements in Oracle 19c

  • Hybrid Histograms:
    • More accurate for complex, skewed data.
  • Top-Frequency Histograms:
    • Efficient for handling columns where only a few values are very frequent.
  • Works seamlessly with Real-Time Statistics (if enabled on Engineered Systems).




Control this behavior using the METHOD_OPT parameter, allowing Oracle to decide or setting specific rules for your data.




No comments:

Post a Comment

How to recovery PDB when PDB database is dropped in Oracle

  How to recovery PDB when PDB database is dropped :) [oracle@rac01 ~]$ sqlplus '/as sysdba' SQL*Plus: Release 21.0.0.0.0 - Product...