On Thu, 07 Aug 2003 19:31:52 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>The correlation is between index order and heap order --- that is, are
>the tuples in the table physically in the same order as the index?
>The better the correlation, the fewer heap-page reads it will take to do
>an index scan.
This is true for a column that is the first column of a btree index.
Correlation doesn't help with additional index columns and with
functional indices.
>Note it is possible to measure correlation without regard to whether
>there actually is any index;
But there is no need to, because the correlation is only used for
index access cost estimation.
>One problem we have is extrapolating from the single-column correlation
>stats computed by ANALYZE to appropriate info for multi-column indexes.
>It might be that the only reasonable fix for this is for ANALYZE to
>compute multi-column stats too when multi-column indexes are present.
I wonder whether it would be better to drop column correlation and
calculate index correlation instead, i.e. correlation of index tuples
to heap tuple positions. This would solve both the multi-column index
and the functional index cost estimation problem.
>People are used to the assumption that you don't need to re-ANALYZE
>after creating a new index, but maybe we'll have to give that up.
Index correlation would be computed on CREATE INDEX and whenever the
heap relation is analysed ...
Servus
Manfred