Home > mailing lists

Re: Sort and index - Mailing list pgsql-performance

From	Manfred Koizar
Subject	Re: Sort and index
Date	May 12, 2005 18:51:59
Msg-id	45678156iqld6sd5n63baovuvqj73palmr@email.aon.at Whole thread Raw
In response to	Re: Sort and index ("Jim C. Nasby" <decibel@decibel.org>)
Responses	Re: Sort and index
List	pgsql-performance

Tree view

On Wed, 11 May 2005 16:15:16 -0500, "Jim C. Nasby" <decibel@decibel.org>
wrote:
>> This is divided by the number of index columns, so the index correlation
>> is estimated to be 0.219.
>
>That seems like a pretty bad assumption to make.

Any assumption we make without looking at entire index tuples has to be
bad.  A new GUC variable secondary_correlation introduced by my patch at
least gives you a chance to manually control the effects of additional
index columns.

>> In my tests I got much more plausible results with
>>
>>     1 - (1 - abs(correlation))^2
>
>What's the theory behind that?

The same as for csquared -- pure intuition.  But the numbers presented
in http://archives.postgresql.org/pgsql-hackers/2002-10/msg00072.php
seem to imply that in this case my intiution is better ;-)

Actually above formula was not proposed in that mail.  AFAIR it gives
results between p2 and p3.

>And I'd still like to know why correlation squared is used.

On Wed, 02 Oct 2002 18:48:49 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
|The indexCorrelation^2 algorithm was only a quick hack with no theory
|behind it :-(.

>It depends on the patches, since this is a production machine. Currently
>it's running 7.4.*mumble*,

The patch referenced in
http://archives.postgresql.org/pgsql-hackers/2003-08/msg00931.php is
still available.  It doesn't touch too many places and should be easy to
review.  I'm using it and its predecessors in production for more than
two years.  Let me know, if the 74b1 version does not apply cleanly to
your source tree.

Servus
 Manfred

pgsql-performance by date:

From: Josh Berkus
Date: 12 May 2005, 18:50:35
Subject: Re: Partitioning / Clustering

From: Sebastian Hennebrueder
Date: 12 May 2005, 22:31:28
Subject: Optimize complex join to use where condition before join

Re: Sort and index - Mailing list pgsql-performance

Previous

Next