Re: An unresolved performance problem. - Mailing list pgsql-performance

From Tom Lane
Subject Re: An unresolved performance problem.
Date
Msg-id 249.1052487012@sss.pgh.pa.us
Whole thread Raw
In response to Re: An unresolved performance problem.  (Achilleus Mantzios <achill@matrix.gatewaynet.com>)
Responses Re: An unresolved performance problem.  (Achilleus Mantzios <achill@matrix.gatewaynet.com>)
List pgsql-performance
Achilleus Mantzios <achill@matrix.gatewaynet.com> writes:
> On Fri, 9 May 2003, Tom Lane wrote:
>> Well, that's suggestive, isn't it?  What about the remaining columns?

> The index is defined as:

> status_all btree (assettable, assetidval, appname, apptblname, status,
> isvalid)

> And correlations are:

>   attname   | correlation
> -------------+-------------
>  assettable  |           1
>  assetidval  |    0.125902
>  appname     |    0.942771
>  apptblname  |    0.928761
>  status      |    0.443405
>  isvalid     |    0.970531

Actually, thinking twice about it, I'm not sure if the correlations of
the righthand columns mean anything.  If the table were perfectly
ordered by the index, you'd expect righthand values to cycle through
their range for each lefthand value, and so they'd show low
correlations.

The fact that most of the columns show high correlation makes me think
that they are not independent --- is that right?

But anyway, I'd say that yes this table is probably quite well ordered
by the index.  You could just visually compare the results of

select * from tab

select * from tab
  order by assettable, assetidval, appname, apptblname, status, isvalid

to confirm this.

And that tells us where the problem is: the code is estimating a low
index correlation where it should be estimating a high one.  If you
don't mind running a nonstandard version of Postgres, you could try
making btcostestimate() in src/backend/utils/adt/selfuncs.c estimate
the indexCorrelation as just varCorrelation, instead of
varCorrelation / nKeys.  This is doubtless an overcorrection in the
other direction (which is why it hasn't been done in the official
sources) but it's probably better than what's there, at least for
your purposes.

            regards, tom lane


pgsql-performance by date:

Previous
From: Achilleus Mantzios
Date:
Subject: Re: An unresolved performance problem.
Next
From: Achilleus Mantzios
Date:
Subject: Re: An unresolved performance problem.