Re: join selectivity - Mailing list pgsql-hackers

From Tom Lane
Subject Re: join selectivity
Msg-id 26274.1103219784@sss.pgh.pa.us
In response to Re: join selectivity  ("Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk>)
List pgsql-hackers
"Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk> writes:
> OK I think I've misunderstood something more fundamental than that; I
> understood from what you said that the RESTRICT clause is used to evaluate
> the cost of table1.geom && table2.geom against table2.geom && table1.geom
> (i.e. it is used to help decide which one should be seq scanned and which
> should be index scanned in a nested loop node). So is the trick here for a
> commutative operator to simply return the same value for both cases, as
> other factors such as index size costs are considered elsewhere?

If the operator is commutative then the result should be too.  Really
you should not be thinking about costs at all when coding a selectivity
estimator: its charter is to estimate how many rows will match the
condition, not to estimate costs per se.

Note however that these aren't really the "same case", as you'd be
referencing two different columns with presumably different statistics.
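To make the "rows, not costs" point concrete, here is a minimal standalone sketch. It is not PostgreSQL code: the Interval type, the function names, and the brute-force sampling approach are all assumptions made for illustration. It computes a join selectivity as the fraction of row pairs that satisfy a symmetric overlap test, so swapping the two inputs cannot change the answer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 1-D stand-in for a geometry's bounding box. */
typedef struct {
    double lo;
    double hi;
} Interval;

/* Overlap test in the spirit of "&&"; symmetric in its arguments. */
int overlaps(Interval a, Interval b)
{
    return a.lo <= b.hi && b.lo <= a.hi;
}

/*
 * Estimated join selectivity: the fraction of (outer, inner) row pairs
 * that satisfy the overlap condition, counted over two samples.
 * Nothing here models I/O, index size, or scan strategy; those belong
 * to the cost model, not to the selectivity estimator.
 */
double overlap_join_selectivity(const Interval *outer, size_t n_outer,
                                const Interval *inner, size_t n_inner)
{
    size_t matches = 0;
    size_t i, j;

    assert(n_outer > 0 && n_inner > 0);

    for (i = 0; i < n_outer; i++)
        for (j = 0; j < n_inner; j++)
            if (overlaps(outer[i], inner[j]))
                matches++;

    return (double) matches / ((double) n_outer * (double) n_inner);
}
```

Calling overlap_join_selectivity(t1, n1, t2, n2) and overlap_join_selectivity(t2, n2, t1, n1) counts the same matching pairs over the same total, so both orders give one value. The two samples can still look very different from each other, which is the sense in which the two columns carry different statistics.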

> My final question would be how can we detect the difference between
> RESTRICT being called in this manner (as part of <column> <op> <column> with
> an unknown constant) as opposed to <column> <op> <constant> with a known
> constant?

You should probably read the existing selectivity estimators in
utils/adt/selfuncs.c.  There's a fair amount of infrastructure code in
that file that you could borrow.  (It's not currently exported because
it tends to change from version to version, but maybe we could think
about making some of the routines global.)
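As a rough illustration of the pattern those estimators follow, here is a standalone model, not the actual selfuncs.c code. The Node type, the NODE_VAR/NODE_CONST tags, the DEFAULT_SEL value, and the uniform-distribution assumption are all hypothetical. The real estimators, as far as I know, make the equivalent check with IsA(other, Const) on the non-column operand and fall back to a default estimate when the value is not known at plan time.

```c
#include <assert.h>

/* Hypothetical node tags; PostgreSQL's real parse nodes are richer. */
typedef enum {
    NODE_VAR,    /* operand is another column: value unknown at plan time */
    NODE_CONST   /* operand is a literal: value known at plan time */
} NodeTag;

typedef struct {
    NodeTag tag;
    double  value;   /* meaningful only when tag == NODE_CONST */
} Node;

/* Hypothetical fallback used when nothing better can be estimated. */
#define DEFAULT_SEL 0.005

/*
 * Toy restriction estimator for "column < other", assuming the column's
 * values are spread uniformly over [col_min, col_max].
 */
double restrict_selectivity(const Node *other, double col_min, double col_max)
{
    double sel;

    assert(other != 0);

    /* Unknown comparison value, or unusable statistics: use the default. */
    if (other->tag != NODE_CONST || col_max <= col_min)
        return DEFAULT_SEL;

    /* Known constant: fraction of the column's range lying below it. */
    sel = (other->value - col_min) / (col_max - col_min);

    /* Clamp to a valid probability. */
    if (sel < 0.0)
        sel = 0.0;
    if (sel > 1.0)
        sel = 1.0;
    return sel;
}
```

With a constant operand of 25 over a column range of 0 to 100, this sketch returns 0.25; with a column operand it returns DEFAULT_SEL regardless of the range.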
        regards, tom lane

