Re: RFC: planner statistics in 7.2 - Mailing list pgsql-hackers

From Philip Warner
Subject Re: RFC: planner statistics in 7.2
Date
Msg-id 3.0.5.32.20010420104405.02b2ce60@mail.rhyme.com.au
In response to RFC: planner statistics in 7.2  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
At 18:37 19/04/01 -0400, Tom Lane wrote:
>(2) Statistics should be computed on the basis of a random sample of the
>target table, rather than a complete scan.  According to the literature
>I've looked at, sampling a few thousand tuples is sufficient to give good
>statistics even for extremely large tables; so it should be possible to
>run ANALYZE in a short amount of time regardless of the table size.

This sounds great. Can the same be done for clustering, i.e. pick a random
sample of index nodes, look at the record pointers, and so determine how
well clustered the table is?
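
To make the idea concrete, here is a rough sketch in Python, not PostgreSQL code. It assumes a simplified model in which each index leaf page is just a list of heap block numbers in key order; the function name estimate_clustering, the parameter names, and the "same or next block" test are all illustrative assumptions rather than anything the planner actually does.

```python
import random


def estimate_clustering(leaf_pages, sample_pages, seed=None):
    """Estimate how well a table is clustered on an index.

    leaf_pages   -- hypothetical model of the index: a list of leaf pages,
                    each a list of heap block numbers in index key order.
    sample_pages -- how many leaf pages to sample at random.

    Returns the fraction of adjacent index entries (within the sampled
    pages) whose heap pointers land on the same or the next heap block,
    or None if the sample contains no adjacent pairs.
    """
    rng = random.Random(seed)
    k = min(sample_pages, len(leaf_pages))
    chosen = rng.sample(leaf_pages, k)

    pairs = 0
    ordered = 0
    for page in chosen:
        # Compare each record pointer with its successor on the same page.
        for prev, cur in zip(page, page[1:]):
            pairs += 1
            if 0 <= cur - prev <= 1:
                ordered += 1

    return ordered / pairs if pairs else None
```

A result near 1 would suggest that an index scan touches heap blocks mostly in physical order; a result near 0 would suggest the heap accesses are scattered. Only the sampled index pages are read, so the cost does not grow with table size.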


>A simple approach would be a SET
>variable or explicit parameter for ANALYZE.  But I am inclined to think
>that it'd be better to create a persistent per-column state for this,
>set by say
>    ALTER TABLE tab SET COLUMN col STATS COUNT n

Sounds fine; user-selectability at the column level seems a good idea.
Would there be any value in not making it part of a normal SQLxx statement,
and adding an 'ALTER STATISTICS' command instead? E.g.
   ALTER STATISTICS FOR tab[.column] COLLECT n
   ALTER STATISTICS FOR tab SAMPLE m

etc.









----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

