Re: On Differing Optimizer Choices ( Again) - Mailing list pgsql-sql

From Tom Lane
Subject Re: On Differing Optimizer Choices ( Again)
Date
Msg-id 6848.999556384@sss.pgh.pa.us
Whole thread Raw
List pgsql-sql
Mark kirkwood <markir@slingshot.co.nz> writes (heavily edited):
> SELECT
> ...
> WHERE d0.d0key = f.d0key
> AND   f.d0key BETWEEN 270 AND 350

> So far this is all as one would expect. However suppose we substitute 
> 'd0.d0key' in the 'AND' clause instead of 'f.d0key' to obtain :

> SELECT
> ...
> WHERE d0.d0key = f.d0key
> AND   d0.d0key BETWEEN 270 AND 350

> [ produces a different plan because of differing row-count estimates ]

This surprises me not at all.  While the planner has some rudimentary
grasp of the notion that equality is transitive, that grasp does not
extend as far as recognizing that the above queries are really
equivalent.  You'd probably get a better plan if you wrote out the
entire WHERE condition that you are thinking is intuitively obvious:

SELECT
...
WHERE d0.d0key = f.d0key
AND   d0.d0key BETWEEN 270 AND 350
AND   f0.d0key BETWEEN 270 AND 350

so that the planner could see that there is a range restriction on each
of the tables.

While it'd be possible to teach the planner to deduce the third clause
from the first two, I'm unconvinced that adding such logic would be a
good idea.  It would slow down all queries (probably by quite a bit)
for a benefit that I suspect arises relatively seldom.  Might be worth
looking at this sometime in the future, but...
        regards, tom lane


pgsql-sql by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: More on the TO DO wishlist
Next
From: Tom Lane
Date:
Subject: Re: Index Scan Backward vs. Sort/Sequential Scan when using ORDER BY