Re: Properly pathify the union planner - Mailing list pgsql-hackers

From David Rowley
Subject Re: Properly pathify the union planner
Date
Msg-id CAApHDvq05PiyYzqeOwH36_PqDEkz21d6zqPSTdzgx+HGzv5pkA@mail.gmail.com
In response to Re: Properly pathify the union planner  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: Properly pathify the union planner
List pgsql-hackers
On Tue, 6 Feb 2024 at 22:05, Richard Guo <guofenglinux@gmail.com> wrote:
> I'm thinking that maybe it'd be better to move the work of sorting the
> subquery's paths to the outer query level, specifically within the
> build_setop_child_paths() function, just before we stick SubqueryScanPath
> on top of the subquery's paths.  I think this is better because:
>
> 1. This minimizes the impact on subquery planning and reduces the
> footprint within the grouping_planner() function as much as possible.
>
> 2. This can help avoid the aforementioned add_path() issue because the
> two involved paths will be structured as:

Yes, this is a good idea. I agree with both of your points.

I've taken your suggested changes with minor fixups and expanded on
them to handle the partial paths too.  I've also removed almost all of
the changes to planner.c.

I fixed a bug where I was overwriting the union child's
TargetEntry.ressortgroupref without considering that it might already
be set for some other purpose in the subquery.  To handle this, I wrote
generate_setop_child_grouplist(), which is almost like
generate_setop_grouplist() except that it calls assignSortGroupRef() to
figure out the next free tleSortGroupRef (or to reuse the existing one
if the TargetEntry already has one set).
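
To illustrate the reuse-or-assign behaviour described above, here is a
minimal standalone sketch.  It is not code from the patch:
DemoTargetEntry, demo_assign_sort_group_ref() and
demo_child_grouplist() are hypothetical stand-ins that use a plain
array instead of PostgreSQL's List, and they only model the idea that
an existing ressortgroupref is kept while an unset one (0) gets the
next unused number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for TargetEntry: one field only. */
typedef struct DemoTargetEntry
{
    unsigned int ressortgroupref;   /* 0 means "no ref assigned yet" */
} DemoTargetEntry;

/*
 * Return tle's sort/group ref.  If it has none, assign the smallest
 * number greater than every ref already used in tlist.  This mirrors
 * the behaviour the email relies on, but it is an illustration only.
 */
static unsigned int
demo_assign_sort_group_ref(DemoTargetEntry *tle,
                           const DemoTargetEntry *tlist, size_t ntlist)
{
    unsigned int maxRef = 0;

    assert(tle != NULL);

    /* Already set for some other purpose: reuse, never overwrite. */
    if (tle->ressortgroupref != 0)
        return tle->ressortgroupref;

    /* Otherwise find the highest ref in use and take the next one. */
    for (size_t i = 0; i < ntlist; i++)
    {
        if (tlist[i].ressortgroupref > maxRef)
            maxRef = tlist[i].ressortgroupref;
    }

    tle->ressortgroupref = maxRef + 1;
    return tle->ressortgroupref;
}

/*
 * Hypothetical analogue of building the child's group list: collect
 * one ref per target list column, preserving any refs already set.
 */
static void
demo_child_grouplist(DemoTargetEntry *tlist, size_t ntlist,
                     unsigned int *refs_out)
{
    for (size_t i = 0; i < ntlist; i++)
        refs_out[i] = demo_assign_sort_group_ref(&tlist[i], tlist, ntlist);
}
```

With a three-column target list whose second column already carries
ref 2, the sketch leaves that column alone and hands out the next
unused numbers to the other two columns.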

Earlier, I pushed a small comment change to pathnode.c in order to
shrink this patch down a little. It was also a change that could be
made in isolation from this work.

v2 attached.

David

