Re: An unresolved performance problem. - Mailing list pgsql-performance

From Achilleus Mantzios
Subject Re: An unresolved performance problem.
Date
Msg-id Pine.LNX.4.44.0305091644500.7304-100000@matrix.gatewaynet.com
Whole thread Raw
In response to Re: An unresolved performance problem.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: An unresolved performance problem.
List pgsql-performance
On Fri, 9 May 2003, Tom Lane wrote:

> Achilleus Mantzios <achill@matrix.gatewaynet.com> writes:
> > On Fri, 9 May 2003, Tom Lane wrote:
> >> Well, that's suggestive, isn't it?  What about the remaining columns?
>
> > The index is defined as:
>
> > status_all btree (assettable, assetidval, appname, apptblname, status,
> > isvalid)
>
> > And correlations are:
>
> >   attname   | correlation
> > -------------+-------------
> >  assettable  |           1
> >  assetidval  |    0.125902
> >  appname     |    0.942771
> >  apptblname  |    0.928761
> >  status      |    0.443405
> >  isvalid     |    0.970531
>
> Actually, thinking twice about it, I'm not sure if the correlations of
> the righthand columns mean anything.  If the table were perfectly
> ordered by the index, you'd expect righthand values to cycle through
> their range for each lefthand value, and so they'd show low
> correlations.

When i clustered (on onother system no to spoil the situation)
CLUSTER status_all on status;
i got identical results on the order (see below),
also i got quite high correlations.


>
> The fact that most of the columns show high correlation makes me think
> that they are not independent --- is that right?

Well, assettable,appname,apptblname
have high frequencies on one value, so they can be
regarded as constants.
assetidval, status and isvalid play the most part of the
selectivity.
(i have included the first 3 columns in the status_all index for future
usage)

>
> But anyway, I'd say that yes this table is probably quite well ordered
> by the index.  You could just visually compare the results of
>
> select * from tab
>
> select * from tab
>   order by assettable, assetidval, appname, apptblname, status, isvalid
>
> to confirm this.
>

If the table was ordered by status_all index i would show something like
   attname   | correlation
-------------+-------------
 assettable  |           1
 assetidval  |           1
 appname     |    0.927842
 apptblname  |    0.895155
 status      |    0.539183
 isvalid     |    0.722838

In the current (production system) situation, visually, i dont see any
correlation between the two.

> And that tells us where the problem is: the code is estimating a low
> index correlation where it should be estimating a high one.  If you
> don't mind running a nonstandard version of Postgres, you could try
> making btcostestimate() in src/backend/utils/adt/selfuncs.c estimate
> the indexCorrelation as just varCorrelation, instead of
> varCorrelation / nKeys.  This is doubtless an overcorrection in the
> other direction (which is why it hasn't been done in the official
> sources) but it's probably better than what's there, at least for
> your purposes.
>

On the test system,
if i cluster the table according to assetidval the optimiser
uses the index on that column which does a pretty good job.
Even better, if i revert the table to an ordering according
to its id (to spoil the previous effect of the CLUSTER command)
and i set random_page_cost = 2 i get the usage of the better
status_all index.

This way the correlations seem low, but the expected selectivity
is either way 83 rows.

Are you suggesting to try the change in src/backend/utils/adt/selfuncs.c
at this exact situation i am on my test system?? (its linux too)

Thanx a lot!

>             regards, tom lane
>

--
==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:    +30-210-8981112
fax:    +30-210-8981877
email:  achill@matrix.gatewaynet.com
        mantzios@softlab.ece.ntua.gr


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: An unresolved performance problem.
Next
From: Achilleus Mantzios
Date:
Subject: Re: An unresolved performance problem.