On Fri, Aug 22, 2025 at 3:59 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>
> On Wed, Aug 6, 2025 at 8:30 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> > On Wed, Aug 6, 2025 at 3:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > The attached patch includes the draft fix and regression tests (using
> > > injection points).
> > >
> > > I don't have enough experience with the planner and FDW code area to
> > > evaluate whether the patch fixes the issue in the right approach.
> > > Feedback is very welcome. I've confirmed this assertion could happen
> > > with the same scenario on all supported branches.
> >
> > Will review. Thank you for the report and patch!
>
> First, my apologies for the delay.
>
> I reviewed the postgres_fdw.c part of the fix:
>
> @@ -6313,9 +6314,18 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
> * calling foreign_join_ok(), since that function updates fpinfo and marks
> * it as pushable if the join is found to be pushable.
> */
> - if (root->parse->commandType == CMD_DELETE ||
> - root->parse->commandType == CMD_UPDATE ||
> - root->rowMarks)
> + for (PlannerInfo *proot = root; proot != NULL; proot = proot->parent_root)
> + {
> + if (proot->parse->commandType == CMD_DELETE ||
> + proot->parse->commandType == CMD_UPDATE ||
> + proot->rowMarks)
> + {
> + need_epq = true;
> + break;
> + }
> + }
> +
> + if (need_epq)
> {
> epq_path = GetExistingLocalJoinPath(joinrel);
> if (!epq_path)
>
> I think this successfully avoids the assertion failure and produces
> the correct result, but sorry, I don't think it's the right way to go.
> I think the root cause of this issue is in the EPQ handling of
> foreign/custom joins in ExecScanFetch() in execScan.h: it runs the
> recheck method function for a given foreign/custom join whenever it is
> called for EPQ rechecking, but that is not 100% correct. I think the
> correct handling is: run the recheck method function for the join if
> it is called for EPQ rechecking and the join is a *descendant* join in
> the EPQ plan tree; otherwise run the access method function for the
> join even if it is called for EPQ rechecking, like the attached (where
> I used the epqParam of the given EPQState to determine whether the
> join is a descendant join or not, which localizes the fix pretty
> well). For the SELECT FOR UPDATE query shown upthread, when doing an
> EPQ recheck, the fix evaluates the sub-select expression in the target
> list by doing a remote join, not a local join, so it would work more
> efficiently than the fix you proposed.
Thank you for reviewing the patch! I've confirmed that your patch
fixes the issue too.
If I understand your proposed fix correctly, the reported problem is
fixed by not rechecking the test tuple by ForeignRecheck() (performing
a local join), but instead we simply call ForeignNext() and get the
next tuple. I think while this fix would have to have a regression
test like I've proposed, it's probably a good time to add some
regression tests to cover postgresRecheckForeignScan() too possibly as
a separate patch.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com