Re: postgres_fdw: wrong results with self join + enable_nestloop off - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: postgres_fdw: wrong results with self join + enable_nestloop off
Date
Msg-id CAPmGK1727uiNCmUNH2YheMtdkHT-W9jqF8K7v2M0B=wEc4PceA@mail.gmail.com
Whole thread Raw
In response to Re: postgres_fdw: wrong results with self join + enable_nestloop off  (Andres Freund <andres@anarazel.de>)
Responses Re: postgres_fdw: wrong results with self join + enable_nestloop off
List pgsql-hackers
Sorry, I hit the send button by mistake.

On Sun, Aug 20, 2023 at 4:34 AM Andres Freund <andres@anarazel.de> wrote:
> On 2023-08-19 20:09:25 +0900, Etsuro Fujita wrote:
> > * The problem we had with the set_join_pathlist_hook hook is that in
> > such a typical use case, previously, if the replaced joins had any
> > pseudoconstant clauses, the planner would produce incorrect query
> > plans, due to the lack of support for handling such quals in
> > createplan.c.  We could fix the extensions side, as you proposed, but
> > the cause of the issue is 100% the planner's deficiency, so it would
> > be unreasonable to force the authors to do so, which would also go
> > against our policy of ABI compatibility.  So I fixed the core side, as
> > in the FDW case, so that extensions created for such a typical use
> > case, which I guess are the majority of the hook extensions, need not
> > be modified/recompiled.  I think it is unfortunate that that breaks
> > the use case of the Citus extension, though.
>
> I'm not neutral - I don't work on citus, but work in the same Unit as
> Onder. With that said: I don't think that's really a justification for
> breaking a pre-existing, not absurd, use case in a minor release.
>
> Except that this was only noticed after it was released in a set of minor
> versions, I would say that 6f80a8d9c should just straight up be reverted.
> Skimming the thread there wasn't really any analysis done about breaking
> extensions etc - and that ought to be done before a substantial semantics
> change in a somewhat commonly used hook.  I'm inclined to think that that
> might still be the right path.

I think you misread the thread; actually, we did an analysis and
applied a fix that would avoid ABI breakage (see the commit message
for 6f80a8d9c).  It turned out that that breaks the Citus extension,
though.

Also, this is not such a change; it is just an optimization
disablement.  Let me explain.  This is the commit message for
e7cb7ee14, which added the hook we are discussing:

    Allow FDWs and custom scan providers to replace joins with scans.

    Foreign data wrappers can use this capability for so-called "join
    pushdown"; that is, instead of executing two separate foreign scans
    and then joining the results locally, they can generate a path which
    performs the join on the remote server and then is scanned locally.
    This commit does not extend postgres_fdw to take advantage of this
    capability; it just provides the infrastructure.

    Custom scan providers can use this in a similar way.  Previously,
    it was only possible for a custom scan provider to scan a single
    relation.  Now, it can scan an entire join tree, provided of course
    that it knows how to produce the same results that the join would
    have produced if executed normally.

As described in the commit message, we assume that extensions use the
hook in a similar way to FDWs; if they do so, the restriction added by
6f80a8d9c just diables them to add paths for join pushdown, making the
planner use paths involving local joins, so any breakage (other than
plan changes from custom joins to local joins) would never happen.

So my question is: does the Citus extension use the hook like this?
(Sorry, I do not fully understand Onder's explanation.)

> > BTW: commit 9e9931d2b removed the restriction on the call to the hook
> > extensions, so you might want to back-patch it.
>
> Citus is an extension, not a fork, there's not really a way to just backpatch
> a random commit.

Yeah, I was thinking that that would be your concern.

Best regards,
Etsuro Fujita



pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: postgres_fdw: wrong results with self join + enable_nestloop off
Next
From: torikoshia
Date:
Subject: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)