On 12/5/24 12:28, Amit Langote wrote:
> On Thu, Dec 5, 2024 at 3:53 PM Amit Langote <amitlangote09@gmail.com> wrote:
>> On Thu, Dec 5, 2024 at 2:20 AM Tomas Vondra <tomas@vondra.me> wrote:
>>> Sure, changing the APIs is allowed, I'm just wondering if maybe there
>>> might be a way to not have this issue, or at least notice the missing
>>> call early.
>>>
>>> I haven't tried, wouldn't it be better to modify ExecutorStart() to do
>>> the retries internally? I mean, the extensions wouldn't need to check if
>>> the plan is still valid, ExecutorStart() would take care of that. Yeah,
>>> it might need some new arguments, but that's more obvious.
>>
>> One approach could be to move some code from standard_ExecutorStart()
>> into ExecutorStart(). Specifically, the code responsible for setting
>> up enough state in the EState to perform ExecDoInitialPruning(), which
>> takes locks that might invalidate the plan. If the plan does become
>> invalid, the hook and standard_ExecutorStart() are not called.
>> Instead, the caller, ExecutorStartExt() in this case, creates a new
>> plan.
>>
>> This avoids the need to add ExecPlanStillValid() checks anywhere,
>> whether in core or extension code. However, it does mean accessing the
>> PlannedStmt earlier than InitPlan(), but the current placement of the
>> code is not exactly set in stone.
>
> I tried this approach and found that it essentially disables testing
> of this patch using the delay_execution module, which relies on the
> ExecutorStart_hook(). The way the testing works is that the hook in
> delay_execution.c pauses the execution of a cached plan to allow a
> concurrent session to drop an index referenced in the plan. When
> unpaused, execution initialization resumes by calling
> standard_ExecutorStart(). At this point, obtaining the lock on the
> partition whose index has been dropped invalidates the plan, which the
> hook detects and reports. It then also reports the successful
> re-execution of an updated plan that no longer references the dropped
> index. Hmm.
>
It's not clear to me why the change disables this testing, and I can't
try without a patch. Could you explain?
thanks
--
Tomas Vondra