On 27/10/2023 21:10, Richard Guo wrote:
>
> On Fri, Oct 27, 2023 at 7:00 PM Andrei Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>
> So, I can propose two options. The first is to clean up not only the
> current root structure, but also its parent. Although it looks safe,
> I am not happy with this approach - it seems too simple: we should have
> a genuine reason for such a cleanup because it potentially adds
> overhead.
> The second option is to add a flag that stops remove_nulling_relids()
> from copying queries - having two different query trees in the root
> and its parent looks like a mistake. It also reduces memory usage a
> bit.
> So, if my analysis is correct, it is better to use the second way (see
> attachment).
>
>
> Alternatively, can we look at subroot->parse->targetList instead of
> subquery->targetList where we call estimate_num_groups on the output of
> the subquery?
It is a solution. But does it mask the real problem? As I see it, we copy
node trees either to use them somewhere else or to test a conjecture.
Here, we have two different representations of the same subquery. Leaving
aside the memory consumption issue, is that correct?
It makes sense to apply both options: switch the group estimation to the
subroot targetList and keep a single version of the subquery.
Attached is the second (combined) version of the change. Here I added
assertions to check that root->parse and the incoming query tree are
identical.
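
To illustrate the point about two representations, here is a toy sketch
in C. It is not PostgreSQL code: ToyQuery, ToyRoot, TOY_DONT_COPY_QUERY
and the toy_* functions are made-up stand-ins, and the real mutator and
planner structures are far more involved. It only shows how a
copy-on-mutate step leaves the caller's tree and the root's tree
different, and how a "don't copy" flag keeps a single version that an
identity check can then verify.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for illustration only; these are NOT the PostgreSQL structs. */
typedef struct ToyQuery
{
    int ntargets;        /* number of target list entries (at most 4) */
    int nullingrels[4];  /* one made-up "nulling relid" per entry */
} ToyQuery;

typedef struct ToyRoot
{
    ToyQuery *parse;     /* the tree the planner actually worked on */
} ToyRoot;

/* Hypothetical flag, loosely modelled on the idea of QTW_DONT_COPY_QUERY. */
#define TOY_DONT_COPY_QUERY 0x01

/*
 * Clear the nulling marks. Without the flag the input is copied first,
 * so the caller keeps a stale tree; with the flag it is changed in place.
 */
static ToyQuery *
toy_strip_nulling(ToyQuery *query, int flags)
{
    ToyQuery *result = query;

    if (!(flags & TOY_DONT_COPY_QUERY))
    {
        result = malloc(sizeof(ToyQuery));
        assert(result != NULL);
        memcpy(result, query, sizeof(ToyQuery));
    }
    for (int i = 0; i < result->ntargets; i++)
        result->nullingrels[i] = 0;
    return result;
}

/* Stand-in for planning a subquery: the root remembers the tree it used. */
static void
toy_plan_subquery(ToyRoot *subroot, ToyQuery *subquery, int flags)
{
    subroot->parse = toy_strip_nulling(subquery, flags);
}

/* The kind of identity check an assertion could perform. */
static int
toy_same_tree(const ToyRoot *subroot, const ToyQuery *subquery)
{
    return subroot->parse == subquery;
}
```

With flags = 0, toy_same_tree() reports a mismatch and the caller's
tree still carries the old marks; with TOY_DONT_COPY_QUERY both
pointers refer to one tree.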
>
> --- a/src/backend/optimizer/prep/prepunion.c
> +++ b/src/backend/optimizer/prep/prepunion.c
> @@ -341,7 +341,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
>              *pNumGroups = subpath->rows;
>          else
>              *pNumGroups = estimate_num_groups(subroot,
> -                                get_tlist_exprs(subquery->targetList, false),
> +                                get_tlist_exprs(subroot->parse->targetList, false),
>                                                subpath->rows,
>                                                NULL,
>                                                NULL);
>
> BTW, I'm a little surprised that QTW_DONT_COPY_QUERY doesn't seem to be
> used anywhere currently.
Me too. But we use this flag in the enterprise fork to reduce memory
consumption. It could be proposed for upstream, but it looks a bit unsafe.
I guess some extensions could do the same.
--
regards,
Andrei Lepikhov
Postgres Professional