On 24 May 2018 at 04:25, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> That's doable no doubt, but I wonder whether that leaves you in a place
> that's any better than the plan-time-decorrelation approach you proposed
> in the earlier thread. I liked that better TBH; this one seems like
> a very ad-hoc reinvention of a hash join. I don't especially like the
> unpredictable number of executions of the subquery that it results in,
> either.
Decorrelation is not always going to be the answer. There are going to
be plenty of cases where it makes the plan worse.
Consider:
SELECT * FROM sometable s
WHERE rarelytrue AND y = (SELECT MAX(x) FROM bigtable b WHERE b.z = s.z);
If the planner went and rewrote that to execute as the following does:
SELECT s.* FROM sometable s
LEFT JOIN (SELECT z, MAX(x) AS max FROM bigtable GROUP BY z) b ON b.z = s.z
WHERE rarelytrue AND y = b.max;
then we've probably gone and built most of the groups for nothing.
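
To make the difference concrete, suppose rarelytrue filters out all
but a handful of sometable rows, and there's an index such as (index
name here is just for illustration):

CREATE INDEX bigtable_z_x_idx ON bigtable (z, x);

-- each execution of the correlated subplan can then be satisfied by
-- a single backwards index scan, since the planner turns the MAX()
-- into roughly:
SELECT x FROM bigtable
WHERE z = $1 AND x IS NOT NULL
ORDER BY x DESC LIMIT 1;

In the correlated form we pay one such index probe per qualifying
outer row, so a handful of surviving rows means a handful of probes.
In the decorrelated form the aggregate must scan and group all of
bigtable before the outer filter discards nearly everything.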
The planner would have to choose between these based on estimated
costs. Having the ability to apply either of these optimisations would
be useful, provided the planner applies them in the right cases.
However, I don't think Heikki should be touching decorrelation as part
of this effort.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services