Richard Guo <guofenglinux@gmail.com> writes:
> On Fri, Feb 28, 2020 at 3:02 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>> Glad to see this. I think the hard part is this transform is not *always*
>> good. for example foo.a only has 1 rows, but bar has a lot of rows, if
>> so the original would be the better one.
> Yes exactly. TBH I'm not sure how to achieve that.
Yeah, I was about to make the same objection when I saw Andy already had.
Without some moderately-reliable way of estimating whether the change
is actually a win, I think we're better off leaving it out. The user
can always rewrite the query for themselves if the grouped implementation
would be better -- but if the planner just does it blindly, there's no
recourse when it's worse.
> Any ideas on this part?
I wonder whether it'd be possible to rewrite the query, but then
consider two implementations, one where the equality clause is
pushed down into the aggregating subquery as though it were LATERAL.
You'd want to be able to figure out that the presence of that clause
made it unnecessary to do the GROUP BY ... but having done so, a
plan treating the aggregating subquery as LATERAL ought to be pretty
nearly performance-equivalent to the current way. So this could be
mechanized in the current planner structure by treating that as a
parameterized path for the subquery, and comparing it to unparameterized
paths that calculate the full grouped output.
Obviously it'd be a long slog from here to there, but it seems like
maybe that could be made to work. There's a separate question about
whether it's really worth the trouble, seeing that the optimization
is available today to people who rewrite their queries.
regards, tom lane