On Wed, 11 May 2005 16:15:16 -0500, "Jim C. Nasby" <decibel@decibel.org>
wrote:
>> This is divided by the number of index columns, so the index correlation
>> is estimated to be 0.219.
>
>That seems like a pretty bad assumption to make.
Any assumption we make without looking at entire index tuples has to be
bad. The new GUC variable secondary_correlation, introduced by my patch,
at least gives you a chance to manually control the effects of
additional index columns.
>> In my tests I got much more plausible results with
>>
>> 1 - (1 - abs(correlation))^2
>
>What's the theory behind that?
The same as for csquared -- pure intuition. But the numbers presented
in http://archives.postgresql.org/pgsql-hackers/2002-10/msg00072.php
seem to imply that in this case my intuition is better ;-)
Actually, the above formula was not proposed in that mail. AFAIR it gives
results between p2 and p3.
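
To get a feel for how differently the two formulas treat a weak
correlation, here is a minimal sketch in Python. The function names are
hypothetical and not taken from the patch or the PostgreSQL source; the
sketch only evaluates the two expressions and says nothing about how the
planner applies the result.

```python
def csquared(correlation: float) -> float:
    """Weight used by the existing quick hack: correlation squared."""
    return correlation * correlation


def one_minus_square(correlation: float) -> float:
    """Alternative weight from this thread: 1 - (1 - abs(correlation))^2."""
    return 1.0 - (1.0 - abs(correlation)) ** 2


if __name__ == "__main__":
    # Sample correlations, including the 0.219 estimate quoted above.
    print("corr    csquared  1-(1-|c|)^2")
    for c in (0.0, 0.1, 0.219, 0.5, 0.9, 1.0):
        print(f"{c:5.3f}   {csquared(c):8.4f}  {one_minus_square(c):11.4f}")
```

Both expressions agree at 0 and 1, but for intermediate values the
second one stays well above correlation squared, so a weakly correlated
index is not treated as almost completely uncorrelated.
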
>And I'd still like to know why correlation squared is used.
On Wed, 02 Oct 2002 18:48:49 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
|The indexCorrelation^2 algorithm was only a quick hack with no theory
|behind it :-(.
>It depends on the patches, since this is a production machine. Currently
>it's running 7.4.*mumble*,
The patch referenced in
http://archives.postgresql.org/pgsql-hackers/2003-08/msg00931.php is
still available. It doesn't touch too many places and should be easy to
review. I have been using it and its predecessors in production for more
than two years. Let me know if the 74b1 version does not apply cleanly
to your source tree.
Servus
Manfred