On Fri, 9 Dec 2022 at 03:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It's true that the cost attributed to the Agg node won't impact any
> live decisions in the plan level in which it appears. However, if
> that's a subquery, then the total cost attributed to the subplan
> could in principle affect plan choices in the outer query. So there
> is a valid argument for wanting to try to get it right.

I guess the JIT thresholds are another reason to try to make the costs
a reflection of the expected run-time too.

> Having said that, I think that the actual impact on outer-level choices
> is usually minimal. So it didn't bother me that we swept this under
> the rug before --- and I'm pretty sure that we're sweeping similar
> things under the rug elsewhere in top-of-query planning. However,
> given 1349d279 it should surely be true that the planner knows how many
> sorts it's left to be done at runtime (a number we did not have at hand
> before). So it seems like it ought to be simple enough to account for
> this effect more accurately. I'd be in favor of doing so if it's
> simple and cheap to get the numbers.

OK, Heikki's work in 0a2bc5d61 is probably the more useful piece of
work to get us closer to that goal. I think all that's required to make
it work is adding on the sort costs in the final foreach loop in
get_agg_clause_costs(). The Aggrefs have already been marked as
aggpresorted by that time, so it should be a matter of:

    if ((aggref->aggorder != NIL || aggref->aggdistinct != NIL) &&
        !aggref->aggpresorted)
        /* add costs for sort */
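
To spell the idea out a bit, here's a rough standalone sketch of that
check. It's only an illustration, not planner code: MockAggref,
needs_runtime_sort() and extra_sort_cost() are made-up names, NULL
stands in for NIL, and the flat per_sort_cost is a placeholder for
whatever a proper sort cost estimate would give.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the three Aggref fields the check looks at. */
typedef struct MockAggref
{
    const void *aggorder;       /* non-NULL if the aggregate has ORDER BY */
    const void *aggdistinct;    /* non-NULL if the aggregate has DISTINCT */
    bool        aggpresorted;   /* true if the input already arrives sorted */
} MockAggref;

/* True when the aggregate still has to sort its input at execution time. */
static bool
needs_runtime_sort(const MockAggref *aggref)
{
    return (aggref->aggorder != NULL || aggref->aggdistinct != NULL) &&
        !aggref->aggpresorted;
}

/*
 * Sum a placeholder cost for every aggregate that still needs a sort.
 * per_sort_cost is an assumed flat figure used only for illustration.
 */
static double
extra_sort_cost(const MockAggref *aggrefs, size_t naggs, double per_sort_cost)
{
    double      total = 0.0;

    for (size_t i = 0; i < naggs; i++)
    {
        if (needs_runtime_sort(&aggrefs[i]))
            total += per_sort_cost;
    }
    return total;
}
```

With three aggregates where only one has an ORDER BY that isn't
presorted, extra_sort_cost() charges for exactly one sort; presorted
aggregates and plain aggregates add nothing.
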
David