Re: Assert failure on running a completed portal again - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Assert failure on running a completed portal again
Date
Msg-id 77955.1733850890@sss.pgh.pa.us
Whole thread Raw
In response to Re: Assert failure on running a completed portal again  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Assert failure on running a completed portal again
Re: Assert failure on running a completed portal again
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Dec 5, 2024 at 8:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> After looking at this further, I think this whole "run_once"
>> business is badly designed, worse documented, and redundant
>> (as well as buggy).  It can be replaced with three self-contained
>> lines in ExecutePlan, as per the attached.

> Did you look at the commit message for
> 691b8d59281b5177f16fe80858df921f77a8e955? IMHO, that's pretty clear
> about what the goals of this were.

Well, it's not very clear about what implementation limitations we
are trying to protect.

> I haven't tested, but I'm guessing that what happens in Nagata-san's
> case with the committed fix is that we execute the plan once to
> completion in parallel mode; and then the second execute message
> executes the same plan tree again in non-parallel mode but this time
> it ends up returning zero tuples because it's already been executed to
> completion. If that is correct, then I think this is subtly changing
> the rules for parallel Plan trees: before, the rule as I understood
> it, was "only complete execution must supported" but now it's
> "complete executed must be supported and, in addition, a no-op
> re-execution after complete execution must be supported". Off the top
> of my head that seems like it should be fine.

Hmm.  As committed, the re-execution will happen with
use_parallel_mode disabled, which might make that safer, or maybe it's
less safe?  Do you think we need something closer to the upthread
proposal to not run the executor at all during the "re-execution"?
(I'd still be inclined to put that short-circuit into ExecutePlan
not anything higher-level.)

My recollection is that even before parallelism we had some plan node
types that didn't cope well with being called again after they'd once
returned EOF (ie a null tuple).  So maybe a defense against trying to
do that would be wise.  It could probably be as simple as checking
estate->es_finished.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Assert failure on running a completed portal again
Next
From: Robert Haas
Date:
Subject: Re: FileFallocate misbehaving on XFS