Nice to know!

Maybe these papers are of interest for you as well, as much as they have

been for me (they have been in this list for at least one time):

http://method-r.com/downloads/doc_download/11-managing-statistics-for-optimal-query-performance-karen-mortonhttp://richardfoote.wordpress.com/2008/01/04/dbms_stats-method_opt-default-behaviour-changed-in-10g-be-careful/HTH

--

Mit freundlichem GruÃŸ

Martin Klier

Senior Oracle Database Administrator

Klug GmbH integrierte Systeme

Lindenweg 13, D-92552 Teunz

Tel.: +49 9671/9216-245

Fax.: +49 9671/9216-112

mailto: martin.klier_at_klug-is.de

www.klug-is.de

GeschÃ¤ftsfÃ¼hrer: Johann Klug, Roman Sorgenfrei

Sitz der Gesellschaft: Teunz, USt-ID-Nr. DE175481608,

HRB Nr. 2037, Amtsgericht Amberg

Neeraj Bhatia schrieb am 06.04.2009 15:56:10:

Von:

Neeraj Bhatia

An:

Martin Klier

Kopie:

oracle-l_at_freelists.org, oracle-l-bounce_at_freelists.org

Datum:

06.04.2009 15:57

Betreff:

Re: Density statistic calculation in case of histogram

Hi Martin,

Version: 10.2.0.1.0

After reading an excellent paper "Histograms - Myths and Facts" by

Wolfgang Breitling, some points get cleared.

1)Â In case of width-based histograms, density= 1/ (2*number of

distinct values)

Thus come the figure 0.00005

2) Without histograms, density should be equal to 1/number of

distinct values. But in my case, it is null. I am not using

method_opt while collecting statistics. The global default setting

is FOR ALL INDEXED COLUMNS SIZE 1.

3) In case of height-based histograms, I didn't get these calculations.

density =

Î£ cnt2 / ( num_rowsËœ * Î£ cnt )

â€œthe sum of the squared frequencies of all non-popular values

divided by the sum of the frequencies of all non-popular values

times the count of rows with not null values of the histogram columnâ€.

Any comments will be appreciated.

Neeraj.

On Mon, Apr 6, 2009 at 5:34 PM, Martin Klier wrote:

Hi Neeraj,

no straight answer to your question, but on which DB version are you

testing?

Keep the estimate_percent default values in mind. Maybe if you force

dbms_stats to compute stats, the result will differ.

--

Mit freundlichem GruÃŸ

Martin Klier

Senior Oracle Database Administrator

Klug GmbH integrierte Systeme

Lindenweg 13, D-92552 Teunz

Tel.: Â +49 9671/9216-245

Fax.: +49 9671/9216-112

mailto: martin.klier_at_klug-is.de

www.klug-is.de

GeschÃ¤ftsfÃ¼hrer: Johann Klug, Roman Sorgenfrei

Sitz der Gesellschaft: Teunz, USt-ID-Nr. DE175481608,

HRB Nr. 2037, Amtsgericht Amberg

oracle-l-bounce_at_freelists.org schrieb am 06.04.2009 08:47:30:

Von:

Neeraj Bhatia

An:

oracle-l@freelists.org

Datum:

06.04.2009 08:48

Betreff:

Density statistic calculation in case of histogram

Gesendet von:

oracle-l-bounce_at_freelists.org

Hi,

Please tell me how density is calculated in case of histograms.

I've a table with very skewed data:

scott_at_ORADB10G> select col_skew, count(*) from tab_skew group by

col_skew order by 1;

Â COL_SKEWÂ Â COUNT(*)

---------- ----------

Â Â Â Â Â Â Â Â 1Â Â Â Â Â Â Â Â 10

Â Â Â Â Â Â Â Â 2Â Â Â Â Â Â Â Â 10

Â Â Â Â Â Â Â Â 3Â Â Â Â Â Â Â Â 10

Â Â Â Â Â Â Â Â 4Â Â Â Â Â Â Â Â 10

Â Â Â Â Â Â Â Â 5Â Â Â Â Â Â Â Â 10

Â Â Â Â Â Â Â Â 6Â Â Â Â Â Â Â Â 10

Â Â Â Â Â Â Â Â 7Â Â Â Â Â Â Â Â 10

Â Â Â Â Â Â Â Â 8Â Â Â Â Â Â Â Â 10

Â Â Â Â Â Â Â Â 9Â Â Â Â Â Â Â Â 10

Â Â Â Â Â Â Â 10Â Â Â Â Â Â 9910

10 rows selected.

scott_at_ORADB10G> exec dbms_stats.delete_table_stats(user,'TAB_SKEW');

PL/SQL procedure successfully completed.

-- Creating width-based histograms, #buckets = #distinct values

scott_at_ORADB10G> exec dbms_stats.gather_table_stats(user, 'TAB_SKEW',

method_opt=>'FOR COLUMNS COL_SKEW size 10');

PL/SQL procedure successfully completed.

scott_at_ORADB10G> select * from dba_tab_col_statistics where

table_name='TAB_SKEW' and column_name='COL_SKEW';

OWNERÂ TABLE_NA COLUMN_NAMEÂ Â Â Â Â Â Â Â Â NUM_DISTINCT LOW_VA HIGH_V

DENSITYÂ NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE

------ -------- -------------------- ------------ ------ ------

---------- ---------- ----------- --------- ----------- --- ---

----------- ---------------

SCOTTÂ TAB_SKEW COL_SKEWÂ Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â 10 C102Â Â C10B

.00005Â Â Â Â Â Â Â Â Â 0Â Â Â Â Â Â Â Â Â 10 06-APR-09Â Â Â Â Â Â 10000 YES

NOÂ Â Â Â Â Â Â Â Â Â Â 3 FREQUENCY

Question: How density is calculated here? Also it's equal to

Selectivity of non-popular values.

scott_at_ORADB10G> exec dbms_stats.gather_table_stats(user, 'TAB_SKEW',

method_opt=>'FOR COLUMNS COL_SKEW size 5');

PL/SQL procedure successfully completed.

-- Creating Height-based histograms, #buckets < #distinct values

scott_at_ORADB10G> select * from dba_tab_col_statistics where

table_name='TAB_SKEW' and column_name='COL_SKEW';

OWNERÂ TABLE_NA COLUMN_NAMEÂ Â Â Â Â Â Â Â Â NUM_DISTINCT LOW_VA HIGH_V

DENSITYÂ NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE

------ -------- -------------------- ------------ ------ ------

---------- ---------- ----------- --------- ----------- --- ---

----------- ---------------

SCOTTÂ TAB_SKEW COL_SKEWÂ Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â 10 C102Â Â C10B

.98209Â Â Â Â Â Â Â Â Â 0Â Â Â Â Â Â Â Â Â Â 5 06-APR-09Â Â Â Â Â Â 10000 YES

NOÂ Â Â Â Â Â Â Â Â Â Â 3 HEIGHT BALANCED

Question: How density is calculated here and how selectivity will be

estimated for popolar as well as non-popular values?

Any points/advice will be appreciated.

Neeraj.

†Ûiÿü0ÁúÞzX¬¶Ê+ƒün– {ú+iÉ^