Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Date
Msg-id CA+TgmoajmDVfmaJDbciC5bgKCkpoPt0yQe3txx7iWwpvRzf4pA@mail.gmail.com
Whole thread Raw
In response to Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
On Mon, Jan 18, 2016 at 6:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Thanks Thom for bringing it to my notice quickly. Sorry for the same.
>
> Here are the patches.
>
> 1. pg_fdw_core_v2.patch: changes in core related to user mapping handling,
> GUC
>                       enable_foreignjoin

I tried to whittle this patch down to something that I'd be
comfortable committing and ended up with nothing left.

First, I removed the enable_foreignjoin GUC.  I still think an
FDW-specific GUC is better, and I haven't heard anybody make a strong
argument the other way. Your argument that this might be inconvenient
if somebody is using a bunch of join-pushdown-enabled FDWs sounds like
a strictly theoretical problem, considering how much difficulty we're
having getting even one FDW to support join pushdown.  And if it does
happen, the user can script it.  I'm willing to reconsider this point
if there is a massive chorus of support for having this be a core
option rather than an FDW option, but to me it seems that we've gone
to a lot of trouble to make the system extensible and might as well
get some benefit from it.

Second, I removed the documentation for GetForeignTable().  That
function is already documented and doesn't need re-documenting.

Third, I removed GetUserMappingById().  As mentioned in the email to
which I replied earlier, that doesn't actually produce the same result
as the way we're doing it now, and might leave the user ID invalid.
Even if that were no issue, it doesn't seem to add anything.  The only
caller of the new function is  postgresBeginForeignScan(), and that
function already has a way of getting the user mapping.  The new way
doesn't seem to be better or faster, so why bother changing it?

At this point, I was down to just the changes to store the user
mapping ID (umid) in the RelOptInfo, and to consider join pushdown
only if the user mapping IDs match.  One observation I made is that if
the code to initialize the FDW-related fields were lifted from
get_relation_info() up to build_simple_rel(), we would not need to use
planner_rt_fetch(), because the caller already has that information.
That seems like it might be worth considering.  But then I realized a
more fundamental problem: making the plan depend on the user ID is a
problem, because the user ID can be changed, and the plan might be
cached.  The same issue arises for RLS, but there is provision for
that in RevalidateCachedQuery.  This patch makes no similar provision.

I think there are two ways forward here.  One is to figure out a way
for the plancache to invalidate queries using FDW join pushdown when
the user ID changes.  The other is to recheck at execution time
whether the user mapping IDs still match, and if not, fall back to
using the "backup" plan that we need anyway for EvalPlanQual rechecks.
This would of course mean that the backup plan would need to be
something decently efficient, not just whatever we had nearest to
hand.  But that might not be too hard to manage.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: make error - libpqdll.def No such file or directory
Next
From: Robert Haas
Date:
Subject: Re: PATCH: postpone building buckets to the end of Hash (in HashJoin)