Index correlation (was: Moving postgresql.conf tunables into 2003... ) - Mailing list pgsql-performance

From Manfred Koizar
Subject Index correlation (was: Moving postgresql.conf tunables into 2003... )
Date
Msg-id 24k7jv4c62amkgq42ouh4csrjuu5c6s1q4@4ax.com
Whole thread Raw
In response to Re: Moving postgresql.conf tunables into 2003...  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On Thu, 07 Aug 2003 19:31:52 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>The correlation is between index order and heap order --- that is, are
>the tuples in the table physically in the same order as the index?
>The better the correlation, the fewer heap-page reads it will take to do
>an index scan.

This is true for a column that is the first column of a btree index.
Correlation doesn't help with additional index columns and with
functional indices.

>Note it is possible to measure correlation without regard to whether
>there actually is any index;

But there is no need to, because the correlation is only used for
index access cost estimation.

>One problem we have is extrapolating from the single-column correlation
>stats computed by ANALYZE to appropriate info for multi-column indexes.
>It might be that the only reasonable fix for this is for ANALYZE to
>compute multi-column stats too when multi-column indexes are present.

I wonder whether it would be better to drop column correlation and
calculate index correlation instead, i.e. correlation of index tuples
to heap tuple positions.  This would solve both the multi-column index
and the functional index cost estimation problem.

>People are used to the assumption that you don't need to re-ANALYZE
>after creating a new index, but maybe we'll have to give that up.

Index correlation would be computed on CREATE INDEX and whenever the
heap relation is analysed ...

Servus
 Manfred

pgsql-performance by date:

Previous
From: Jonathan Gardner
Date:
Subject: Re: Perfomance Tuning
Next
From: Andrew Sullivan
Date:
Subject: Re: Perfomance Tuning