On Wed, 2005-04-06 at 18:09 -0400, Tom Lane wrote:
> I wrote:
> > Arjen van der Meijden <acmmailing@vulcanus.its.tudelft.nl> writes:
> >> SELECT COUNT(*) FROM
> >> data_main AS dm,
> >> postcodes AS p
> >> WHERE dm.range BETWEEN p.range_from AND p.range_till
>
> > Planner error ... because it doesn't have any good way to estimate the
> > number of matching rows, it thinks that way is a bit more expensive than
> > data_main as the outside, but in reality it seems a good deal cheaper:
>
> BTW, it would get the right answer if it had recognized the WHERE clause
> as a range restriction --- it still doesn't know exactly what fraction
> of rows will match, but its default estimate is a great deal tighter for
> "WHERE x > something AND x < somethingelse" than it is for two unrelated
> inequality constraints. Enough tighter that it would have gone for the
> correct plan.
>
> The problem is that it doesn't recognize the WHERE as a range constraint
> on dm.range.
> Can anyone suggest a more general rule? Do we need for example to
> consider whether the relation membership is the same in two clauses
> that might be opposite sides of a range restriction? It seems like
>
> a.x > b.y AND a.x < b.z
Not sure we need a more general rule. There's only three ways to view
this pair of clauses:
i) its a range constraint i.e. BETWEEN
ii) its the complement of that i.e. NOT BETWEEN
iii) its a mistake, but we're not allowed to take that path
Arjen's query and your generalisation of it above is a common type of
query - using a lookup of a reference data table with begin/end
effective dates. It would be very useful if this was supported.
> probably can be treated as a range restriction on a.x for this purpose,
> but I'm much less sure that the same is true of
>
> a.x > b.y AND a.x < c.z
I can't think of a query that would use such a construct, and might even
conclude that it was very poorly normalised model. I would suggest that
this is much less common in practical use.
Best Regards, Simon Riggs