On Wed, 20 Nov 2024 at 15:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Once I'd wrapped my head around how things are done now (which the
> comments in prepunion.c were remarkably unhelpful about), I saw that
> most of the problem for #2 just requires re-ordering things that
> generate_nonunion_paths was already doing. As for #1, I have a modest
> proposal: we should get rid of set_operation_ordered_results_useful
> entirely. It's not the code that actually does useful work, and
> keeping it in sync with the code that does do useful work is hard and
> unnecessary.
>
> 0001-0003 below are the same as before (so the slot-munging TODO is
> still there). 0004 fixes a rather basic bug for nested set-operations
> and gets rid of set_operation_ordered_results_useful along the way.
> Then 0005 does your step 2.
Here's a quick review of all 5 patches together.
1. In setop_load_group(), the primary concern with the result of
setop_compare_slots() seems to be if the tuples match or not. If
you're not too concerned with keeping the Assert checking for
mis-sorts, then it could be much more efficient for many cases to
check the final column first. ExecBuildGroupingEqual() makes use of
this knowledge already, so it's nothing new. Maybe you could just use
an ExprState made by ExecBuildGroupingEqual() instead of
setop_compare_slots() for this case. That would allow JIT to work.
2. (related to #1) I'm unsure if it's also worth deforming all the
needed attributes in one go rather than calling slot_getattr() each
time, which will result in incrementally deforming the tuple.
ExecBuildGroupingEqual() also does that.
3.
/* XXX hack: force the tuple into minimal form */
/* XXX should not have to do this */
ExecForceStoreHeapTuple(ExecCopySlotHeapTuple(innerslot),
setopstate->ps.ps_ResultTupleSlot, true);
innerslot = setopstate->ps.ps_ResultTupleSlot;
nodeAgg.c seems to do this by using prepare_hash_slot() which deforms
the heap tuple and sets the Datums verbatim rather than making copies
of any byref ones.
4. Since the EXPLAIN output is going to change for SetOps, what are
your thoughts on being more explicit about the setop strategy being
used? Showing "SetOp Except" isn't that informative. You've got to
look at the non-text format to know it's using the merge method. How
about "MergeSetOp Except"?
5. You might not be too worried, but in SetOpState if you move
need_init up to below setop_done, you'll close a 7-byte hole in that
struct
I'm also happy if you just push the 0004 patch. Are you planning to
backpatch that? If I'd realised you were proposing to remove that
function, I'd not have fixed the typo Richard mentioned.
David