Re: Early Sort/Group resjunk column elimination. - Mailing list pgsql-hackers

From Ronan Dunklau
Subject Re: Early Sort/Group resjunk column elimination.
Date
Msg-id 2961622.EcX9pJ86yZ@aivenronan
In response to Re: Early Sort/Group resjunk column elimination.  (James Coleman <jtc331@gmail.com>)
List pgsql-hackers
On Friday, 16 July 2021, 17:37:15 CEST, James Coleman wrote:
> Thanks for hacking on this; as you're not surprised given I made the
> original suggestion, I'm particularly interested in this for
> incremental sort benefits, but I find the other examples you gave
> compelling also.
>
> Of course I haven't seen code yet, but my first intuition is to try to
> avoid adding extra nodes and teach the (hopefully few) relevant nodes
> to remove the resjunk entries themselves. Presumably in this case that
> would mostly be the sort nodes (including gather merge).
>
> One thing to pay attention to here is that we can't necessarily remove
> resjunk entries every time in a sort node since, for example, in
> parallel mode the gather merge node above it will need those entries
> to complete the sort.

Yes, that is actually a concern, especially as the gather merge node is already
handled specially when applying a projection.

>
> I'm interested to see what you're working on with a patch.

I am posting this proof of concept for the record, but I don't think its
numerous problems can be solved easily. I tried to teach Sort to use a limited
form of projection, but that brings its own slate of problems...

Quick list of problems with the current implementation, leaving aside the fact
that it's quite hacky in a few places:

* Result nodes are added for numerous types of non-projection-capable paths,
since the (final) target above them includes resjunk columns which should be
eliminated.
* handling of appendrel seems difficult, as both ordered and unordered appends
are generated at the same time against the same target
* I'm having trouble understanding the usefulness of building physical
tlists for SubqueryScans

The second patch is a very hacky way to try to eliminate some of the generated
Result nodes. The idea is to bypass the whole expression interpreter when the
projection is "simple", meaning it only reduces the number of columns, and to
teach Sort and Result to perform it themselves. To do this, I added a parameter
to is_projection_capable_path so that the test depends on the target actually
requested: a sort node is considered projection-capable only for a "simple"
projection.
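
To make the idea of a target-dependent capability test concrete, here is a
minimal standalone sketch. It is not taken from the patch and does not use
PostgreSQL's real data structures: ToyEntryKind, ToyPathKind,
toy_target_is_simple and toy_path_can_project are all hypothetical names, and
which toy path kinds can project is an assumption made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not PostgreSQL's real node or path types. */
typedef enum { TOY_VAR, TOY_CONST, TOY_EXPR } ToyEntryKind;
typedef enum { TOY_SCAN, TOY_SORT, TOY_MATERIAL } ToyPathKind;

/*
 * A "simple" target contains only plain column references and constants,
 * so it can be produced without running an expression interpreter.
 */
bool
toy_target_is_simple(const ToyEntryKind *target, size_t ntargets)
{
    assert(target != NULL || ntargets == 0);

    for (size_t i = 0; i < ntargets; i++)
    {
        if (target[i] != TOY_VAR && target[i] != TOY_CONST)
            return false;
    }
    return true;
}

/*
 * Capability test that depends on the requested target as well as on the
 * kind of path (assumed behaviour for this toy model only).
 */
bool
toy_path_can_project(ToyPathKind kind, const ToyEntryKind *target,
                     size_t ntargets)
{
    switch (kind)
    {
        case TOY_SCAN:
            return true;        /* can evaluate any target */
        case TOY_SORT:
            return toy_target_is_simple(target, ntargets);
        case TOY_MATERIAL:
            return false;       /* never projects in this toy model */
    }
    return false;
}
```

In this model a sort path asked for a target made only of TOY_VAR and TOY_CONST
entries would need no Result node on top, while any TOY_EXPR entry would still
force one.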

The implementation uses a junkfilter which assumes that nothing other than
Const and outer Var nodes will be present.
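
As a rough illustration of why that assumption makes the filter cheap, the
following standalone sketch builds an output row purely by copying input
columns or emitting constants. It is a toy model under stated assumptions, not
the patch's code and not PostgreSQL's junkfilter API: ToyOutCol and
toy_filter_row are hypothetical, and rows are modelled as plain int arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output-column description: copy an input column or emit a constant. */
typedef struct ToyOutCol
{
    int     is_const;       /* nonzero: emit const_value; zero: copy input */
    size_t  in_attno;       /* 0-based input column, used when !is_const */
    int     const_value;    /* value emitted when is_const is nonzero */
} ToyOutCol;

/*
 * Build the output row without evaluating any expression: each output
 * column is either a constant or a straight copy of one input column.
 * Input columns that no output column refers to (the junk ones) are
 * simply dropped.
 */
void
toy_filter_row(const int *in, size_t n_in,
               const ToyOutCol *cols, size_t n_out, int *out)
{
    for (size_t i = 0; i < n_out; i++)
    {
        if (cols[i].is_const)
            out[i] = cols[i].const_value;
        else
        {
            assert(cols[i].in_attno < n_in);
            out[i] = in[cols[i].in_attno];
        }
    }
}
```

Anything more complex than a column copy or a constant would not fit this
mapping and would still need a regular projection.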

I don't feel like this is going anywhere, but at least it's here for
discussion and posterity, if someone is interested.


--
Ronan Dunklau
