Re: TRAP: failed Assert("outerPlan != NULL") in postgres_fdw.c - Mailing list pgsql-bugs

From Masahiko Sawada
Subject Re: TRAP: failed Assert("outerPlan != NULL") in postgres_fdw.c
Date
Msg-id CAD21AoDv6ffWfVTBQgHGNR1rhywpKH246hJWiJvwBtPsGfxmJQ@mail.gmail.com
Whole thread Raw
In response to TRAP: failed Assert("outerPlan != NULL") in postgres_fdw.c  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-bugs
On Fri, Aug 22, 2025 at 3:59 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>
> On Wed, Aug 6, 2025 at 8:30 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> > On Wed, Aug 6, 2025 at 3:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > The attached patch includes the draft fix and regression tests (using
> > > injection points).
> > >
> > > I don't have enough experience with the planner and FDW code area to
> > > evaluate whether the patch fixes the issue in the right approach.
> > > Feedback is very welcome. I've confirmed this assertion could happen
> > > with the same scenario on all supported branches.
> >
> > Will review.  Thank you for the report and patch!
>
> First, my apologies for the delay.
>
> I reviewed the postgres_fdw.c part of the fix:
>
> @@ -6313,9 +6314,18 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
>      * calling foreign_join_ok(), since that function updates fpinfo and marks
>      * it as pushable if the join is found to be pushable.
>      */
> -   if (root->parse->commandType == CMD_DELETE ||
> -       root->parse->commandType == CMD_UPDATE ||
> -       root->rowMarks)
> +   for (PlannerInfo *proot = root; proot != NULL; proot = proot->parent_root)
> +   {
> +       if (proot->parse->commandType == CMD_DELETE ||
> +           proot->parse->commandType == CMD_UPDATE ||
> +           proot->rowMarks)
> +       {
> +           need_epq = true;
> +           break;
> +       }
> +   }
> +
> +   if (need_epq)
>     {
>         epq_path = GetExistingLocalJoinPath(joinrel);
>         if (!epq_path)
>
> I think this successfully avoids the assertion failure and produces
> the correct result, but sorry, I don't think it's the right way to go.
> I think the root cause of this issue is in the EPQ handling of
> foreign/custom joins in ExecScanFetch() in execScan.h: it runs the
> recheck method function for a given foreign/custom join whenever it is
> called for EPQ rechecking, but that is not 100% correct.  I think the
> correct handling is: run the recheck method function for the join if
> it is called for EPQ rechecking and the join is a *descendant* join in
> the EPQ plan tree; otherwise run the access method function for the
> join even if it is called for EPQ rechecking, like the attached (where
> I used the epqParam of the given EPQState to determine whether the
> join is a descendant join or not, which localizes the fix pretty
> well).  For the SELECT FOR UPDATE query shown upthread, when doing an
> EPQ recheck, the fix evaluates the sub-select expression in the target
> list by doing a remote join, not a local join, so it would work more
> efficiently than the fix you proposed.

Thank you for reviewing the patch! I've confirmed that your patch
fixes the issue too.

If I understand your proposed fix correctly, the reported problem is
fixed by not rechecking the test tuple by ForeignRecheck() (performing
a local join), but instead we simply call ForeignNext() and get the
next tuple. I think while this fix would have to have a regression
test like I've proposed, it's probably a good time to add some
regression tests to cover postgresRecheckForeignScan() too possibly as
a separate patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-bugs by date:

Previous
From: Xuneng Zhou
Date:
Subject: Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer
Next
From: PG Bug reporting form
Date:
Subject: BUG #19029: Replication Slot size keeps increasing while logical subscription works fine