Thread: Oddity in handling of cached plans for FDW queries

Oddity in handling of cached plans for FDW queries

From

Etsuro Fujita

Date:

12 July 2016, 12:44:36

Hi,

I noticed odd behavior in invalidating a cached plan with a foreign join
due to user mapping changes.  Consider:

postgres=# create extension postgres_fdw;
postgres=# create server loopback foreign data wrapper postgres_fdw
options (dbname 'postgres');
postgres=# create user mapping for public server loopback;
postgres=# create table t1 (a int, b int);
postgres=# insert into t1 values (1, 1);
postgres=# create foreign table ft1 (a int, b int) server loopback
options (table_name 't1');
postgres=# analyze ft1;
postgres=# create view v1 as select * from ft1;
postgres=# create user v1_owner;
postgres=# alter view v1 owner to v1_owner;
postgres=# grant select on ft1 to v1_owner;
postgres=# prepare join_stmt as select * from ft1, v1 where ft1.a = v1.a;
postgres=# explain verbose execute join_stmt;
                                                   QUERY PLAN
--------------------------------------------------------------------------------------------------------------
  Foreign Scan  (cost=100.00..102.04 rows=1 width=16)
    Output: ft1.a, ft1.b, ft1_1.a, ft1_1.b
    Relations: (public.ft1) INNER JOIN (public.ft1)
    Remote SQL: SELECT r1.a, r1.b, r5.a, r5.b FROM (public.t1 r1 INNER
JOIN public.t1 r5 ON (((r1.a = r5.a))))
(4 rows)

That's great.

postgres=# create user mapping for v1_owner server loopback;
postgres=# explain verbose execute join_stmt;
                                   QUERY PLAN
------------------------------------------------------------------------------
  Nested Loop  (cost=200.00..202.07 rows=1 width=16)
    Output: ft1.a, ft1.b, ft1_1.a, ft1_1.b
    Join Filter: (ft1.a = ft1_1.a)
    ->  Foreign Scan on public.ft1  (cost=100.00..101.03 rows=1 width=8)
          Output: ft1.a, ft1.b
          Remote SQL: SELECT a, b FROM public.t1
    ->  Foreign Scan on public.ft1 ft1_1  (cost=100.00..101.03 rows=1
width=8)
          Output: ft1_1.a, ft1_1.b
          Remote SQL: SELECT a, b FROM public.t1
(9 rows)

That's also great.  Since ft1 and v1 use different user mappings, the
join should not be pushed down.

postgres=# drop user mapping for v1_owner server loopback;
postgres=# explain verbose execute join_stmt;
                                   QUERY PLAN
------------------------------------------------------------------------------
  Nested Loop  (cost=200.00..202.07 rows=1 width=16)
    Output: ft1.a, ft1.b, ft1_1.a, ft1_1.b
    Join Filter: (ft1.a = ft1_1.a)
    ->  Foreign Scan on public.ft1  (cost=100.00..101.03 rows=1 width=8)
          Output: ft1.a, ft1.b
          Remote SQL: SELECT a, b FROM public.t1
    ->  Foreign Scan on public.ft1 ft1_1  (cost=100.00..101.03 rows=1
width=8)
          Output: ft1_1.a, ft1_1.b
          Remote SQL: SELECT a, b FROM public.t1
(9 rows)

Seems odd to me.  Both relations use the same user mapping as before, so
the join should be pushed down.

Another thing I noticed is that the code in plancache.c would invalidate
a cached plan with a foreign join due to user mapping changes even in
the case where user mappings are meaningless to the FDW.

To fix the first, I'd like to propose (1) replacing the existing
has_foreign_join flag in the CachedPlan data structure with a new flag,
say uses_user_mapping, that indicates whether a cached plan uses any
user mapping regardless of whether the cached plan has foreign joins or
not (likewise, replace the hasForeignJoin flag in PlannedStmt), and (2)
invalidating the cached plan if the uses_user_mapping flag is true.  I
thought that we invalidate the cached plan if the flag is true and there
is a possibility of performing foreign joins, but it seems to me that
updates on the corresponding catalog are infrequent enough that such a
detailed tracking is not worth the effort.  One benefit of using the
proposed approach is that we could make the FDW's handling of user
mappings in BeginForeignScan more efficient; currently, there is
additional overhead caused by catalog re-lookups to obtain the user
mapping information for the simple-foreign-table-scan case where user
mappings mean something to the FDW as in postgres_fdw (probably, updates
on the catalogs are infrequent, though), but we could improve the
efficiency by using the validated user mapping information created at
planning time for that case as well as for the foreign-join case.  For
that, I'd like to propose storing the validated user mapping information
into ForeignScan, not fdw_private.  There is a discussion about using an
ExtensibleNode [1] for that, but the proposed approach would make the
FDW's job much simpler.

I don't think the above change is sufficient to fix the second.  The
root reason for that is that since currently, we allow the user mapping
OID (rel->umid) to be InvalidOid in two cases: (1) user mappings mean
something to the FDW but it can't get any user mapping at planning time
and (2) user mappings are meaningless to the FDW, we cannot distinguish
these two cases.  So, I'd like to introduce a new callback routine to
specify that user mappings mean something to the FDW as proposed by Tom
[2], and use that to reject the former case, which allows us to set the
above uses_user_mapping flag appropriately, ie, set the flag to true
only if user mapping changes require forcing a replan.  This would
change the postgres_fdw's behavior that it allows to prepare statements
involving foreign tables without any user mappings as long as those
aren't required at planning time, but I'm not sure that it's a good idea
to keep that behavior because ISTM that such a behavior would make
people sloppy about creating user mappings, which could lead to latent
security problems [2].

Attached is a proposed patch for that.

Best regards,
Etsuro Fujita

[1]
https://www.postgresql.org/message-id/CA%2BTgmoZK6BB4RTb5paz45Vme%3Dq6Z3D7FF2-VKdVyQCS1ps-cGw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/18299.1457923919%40sss.pgh.pa.us

Attachment

user-mapping-handling.patch

Re: Oddity in handling of cached plans for FDW queries

From

Ashutosh Bapat

Date:

13 July 2016, 09:00:39

Seems odd to me. Both relations use the same user mapping as before, so the join should be pushed down.

Another thing I noticed is that the code in plancache.c would invalidate a cached plan with a foreign join due to user mapping changes even in the case where user mappings are meaningless to the FDW.

To fix the first, I'd like to propose (1) replacing the existing has_foreign_join flag in the CachedPlan data structure with a new flag, say uses_user_mapping, that indicates whether a cached plan uses any user mapping regardless of whether the cached plan has foreign joins or not (likewise, replace the hasForeignJoin flag in PlannedStmt), and (2) invalidating the cached plan if the uses_user_mapping flag is true. I thought that we invalidate the cached plan if the flag is true and there is a possibility of performing foreign joins, but it seems to me that updates on the corresponding catalog are infrequent enough that such a detailed tracking is not worth the effort.

That way we will have plan cache invalidations even when simple foreign tables scans (not join) are involved, which means all the plans involving any reference to a foreign table with valid user mapping associated with it. That can be a huge cost as compared to the current solution where sub-optimal plan will be used only when a user mapping is changed while a statement has been prepared. That's a rare scenario and somebody can work around that by preparing the statement again. IIRC, we had discussed this situation when implementing the cache invalidation logic. But there's no workaround for your solution.

One benefit of using the proposed approach is that we could make the FDW's handling of user mappings in BeginForeignScan more efficient; currently, there is additional overhead caused by catalog re-lookups to obtain the user mapping information for the simple-foreign-table-scan case where user mappings mean something to the FDW as in postgres_fdw (probably, updates on the catalogs are infrequent, though), but we could improve the efficiency by using the validated user mapping information created at planning time for that case as well as for the foreign-join case. For that, I'd like to propose storing the validated user mapping information into ForeignScan, not fdw_private.

postgres_fdw to fetches user mapping in some cases but never remembers it. If what you are describing is a better way, it should have been implemented before join pushdown was implemented. Refetching a user mapping is not that expensive given that there is a high chance that it will be found in the syscache, because it was accessed at the time of planning. Effect of plan cache invalidation is probably worse than fetching the value from a sys cache again.

There is a discussion about using an ExtensibleNode [1] for that, but the proposed approach would make the FDW's job much simpler.

I don't think the above change is sufficient to fix the second. The root reason for that is that since currently, we allow the user mapping OID (rel->umid) to be InvalidOid in two cases: (1) user mappings mean something to the FDW but it can't get any user mapping at planning time and (2) user mappings are meaningless to the FDW, we cannot distinguish these two cases.

The way to differentiate between these two is to look at the serverid. If server id is invalid it's the case 1, otherwise 2. For a simple table, there will always be a serverid associated with it. A user mapping will always be associated with a simple table if there is one i.e. FDW requires it. Albeit, there will be a case when FDW requires a user mapping but it's not created, in which case the table will be useless for fetching remote data anyway. I don't think we should be programming for that case.

So, I'd like to introduce a new callback routine to specify that user mappings mean something to the FDW as proposed by Tom [2], and use that to reject the former case, which allows us to set the above uses_user_mapping flag appropriately, ie, set the flag to true only if user mapping changes require forcing a replan. This would change the postgres_fdw's behavior that it allows to prepare statements involving foreign tables without any user mappings as long as those aren't required at planning time, but I'm not sure that it's a good idea to keep that behavior because ISTM that such a behavior would make people sloppy about creating user mappings, which could lead to latent security problems [2].

Attached is a proposed patch for that.

This routine is meaningless unless the core (or FDW) does not allow a user mapping to be created for such FDWs. Without that, core code would get confused as to what it should do when it sees a user mapping for an FDW which says user mappings are meaningless. If we do that, we might not set hasForeignJoin flag in create_foreignscan_plan() when the user mapping for pushed down join is invalid. That will keep FDWs which do not use user mappings out of plan cache invalidation.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Oddity in handling of cached plans for FDW queries

From

Etsuro Fujita

Date:

14 July 2016, 11:45:51

On 2016/07/13 18:00, Ashutosh Bapat wrote:
>     To fix the first, I'd like to propose (1) replacing the existing
>     has_foreign_join flag in the CachedPlan data structure with a new
>     flag, say uses_user_mapping, that indicates whether a cached plan
>     uses any user mapping regardless of whether the cached plan has
>     foreign joins or not (likewise, replace the hasForeignJoin flag in
>     PlannedStmt), and (2) invalidating the cached plan if the
>     uses_user_mapping flag is true.

> That way we will have plan cache invalidations even when simple foreign
> tables scans (not join) are involved, which means all the plans
> involving any reference to a foreign table with valid user mapping
> associated with it. That can be a huge cost as compared to the current
> solution where sub-optimal plan will be used only when a user mapping is
> changed while a statement has been prepared. That's a rare scenario and
> somebody can work around that by preparing the statement again.

I'm not sure that's a good workaround.  ISTM that people often don't pay 
much attention to plan changes, so they would execute the inefficient 
plan without realizing the plan change, it would take long, they would 
start thinking what's happening there, and finally, they would find that 
the reason for that is due to the plan change.  I think we should 
prevent such a trouble.

> IIRC, we
> had discussed this situation when implementing the cache invalidation
> logic.

I didn't know that.  Sorry for speaking up late.

> But there's no workaround for your solution.

As you said, this is a rare scenario; in many cases, people define user 
mappings properly beforehand.  So, just invalidating all relevant plans 
on the syscache invalidation events would be fine.  (I thought one 
possible improvement might be to track exactly the dependencies of plans 
on user mappings and invalidate just those plans that depend on the user 
mapping being modified the same way for user-defined functions, but I'm 
not sure it's worth complicating the code.)

>     One benefit of using the proposed approach is that we could make the
>     FDW's handling of user mappings in BeginForeignScan more efficient;
>     currently, there is additional overhead caused by catalog re-lookups
>     to obtain the user mapping information for the
>     simple-foreign-table-scan case where user mappings mean something to
>     the FDW as in postgres_fdw (probably, updates on the catalogs are
>     infrequent, though), but we could improve the efficiency by using
>     the validated user mapping information created at planning time for
>     that case as well as for the foreign-join case.

> postgres_fdw to fetches user mapping in some cases but never remembers
> it. If what you are describing is a better way, it should have been
> implemented before join pushdown was implemented. Refetching a user
> mapping is not that expensive given that there is a high chance that it
> will be found in the syscache, because it was accessed at the time of
> planning.

That assumption is reasonably valid if execution is done immediately 
after planning, but that doesn't necessarily follow.

> Effect of plan cache invalidation is probably worse than
> fetching the value from a sys cache again.

As I said above, we could expect updates on pg_user_mapping to be 
infrequent, so the effect of the plan cache invalidation would be more 
limited than that of re-fetching user mappings during BeginForeignScan.

>     I don't think the above change is sufficient to fix the second.  The
>     root reason for that is that since currently, we allow the user
>     mapping OID (rel->umid) to be InvalidOid in two cases: (1) user
>     mappings mean something to the FDW but it can't get any user mapping
>     at planning time and (2) user mappings are meaningless to the FDW,
>     we cannot distinguish these two cases.

> The way to differentiate between these two is to look at the serverid.
> If server id is invalid it's the case 1,

Really?  Maybe my explanation was not good, but consider a foreign join 
plan created through GetForeignJoinPaths, by an FDW to which user 
mappings are meaningless, like file_fdw.  In that plan, the 
corresponding server id would be valid, not invalid.  No?

>     So, I'd like to introduce a new callback routine to specify that
>     user mappings mean something to the FDW as proposed by Tom [2], and
>     use that to reject the former case, which allows us to set the above
>     uses_user_mapping flag appropriately, ie, set the flag to true only
>     if user mapping changes require forcing a replan.

> This routine is meaningless unless the core (or FDW) does not allow a
> user mapping to be created for such FDWs. Without that, core code would
> get confused as to what it should do when it sees a user mapping for an
> FDW which says user mappings are meaningless.

The core wouldn't care about such a user mapping for the FDW; the core 
would just ignore the user mapping.  No?

Best regards,
Etsuro Fujita

Re: Oddity in handling of cached plans for FDW queries

From

Ashutosh Bapat

Date:

14 July 2016, 12:26:42

On Thu, Jul 14, 2016 at 5:10 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

On 2016/07/13 18:00, Ashutosh Bapat wrote:
To fix the first, I'd like to propose (1) replacing the existing
has_foreign_join flag in the CachedPlan data structure with a new
flag, say uses_user_mapping, that indicates whether a cached plan
uses any user mapping regardless of whether the cached plan has
foreign joins or not (likewise, replace the hasForeignJoin flag in
PlannedStmt), and (2) invalidating the cached plan if the
uses_user_mapping flag is true.

That way we will have plan cache invalidations even when simple foreign
tables scans (not join) are involved, which means all the plans
involving any reference to a foreign table with valid user mapping
associated with it. That can be a huge cost as compared to the current
solution where sub-optimal plan will be used only when a user mapping is
changed while a statement has been prepared. That's a rare scenario and
somebody can work around that by preparing the statement again.

I'm not sure that's a good workaround. ISTM that people often don't pay much attention to plan changes, so they would execute the inefficient plan without realizing the plan change, it would take long, they would start thinking what's happening there, and finally, they would find that the reason for that is due to the plan change. I think we should prevent such a trouble.

The case you described is other way round. When the statement was prepared the join was not pushed down. A change in user mapping afterwards may allow the join to be pushed down. But right now it won't be, so a user wouldn't see any difference, right?

IIRC, we
had discussed this situation when implementing the cache invalidation
logic.

I didn't know that. Sorry for speaking up late.

But there's no workaround for your solution.

As you said, this is a rare scenario; in many cases, people define user mappings properly beforehand. So, just invalidating all relevant plans on the syscache invalidation events would be fine. (I thought one possible improvement might be to track exactly the dependencies of plans on user mappings and invalidate just those plans that depend on the user mapping being modified the same way for user-defined functions, but I'm not sure it's worth complicating the code.)

Exactly, for a rare scenario, should we be penalizing large number of plans or just continue to use a previously prepared plan when an optimal plan has become available because of changed condition. I would choose second over the first as it doesn't make things worse than they are.

I don't think the above change is sufficient to fix the second. The
root reason for that is that since currently, we allow the user
mapping OID (rel->umid) to be InvalidOid in two cases: (1) user
mappings mean something to the FDW but it can't get any user mapping
at planning time and (2) user mappings are meaningless to the FDW,
we cannot distinguish these two cases.

The way to differentiate between these two is to look at the serverid.
If server id is invalid it's the case 1,

Really? Maybe my explanation was not good, but consider a foreign join plan created through GetForeignJoinPaths, by an FDW to which user mappings are meaningless, like file_fdw. In that plan, the corresponding server id would be valid, not invalid. No?

While planning join, we invalidate user mapping id if the user mappings do not match (see build_join_rel()). In such case, the joinrel will have invalid user mapping (and invalid server id) even though user mapping is associated with the joining tables. The way to differentiate between this case and the case when an FDW doesn't need user mappings and the join is shippable is through valid serverid (see build_join_rel()). Non-availability of a user mapping for a table whose FDW requires user mappings should end up in an error (in FDW code), so we shouldn't add complexity for that case.

So, I'd like to introduce a new callback routine to specify that
user mappings mean something to the FDW as proposed by Tom [2], and
use that to reject the former case, which allows us to set the above
uses_user_mapping flag appropriately, ie, set the flag to true only
if user mapping changes require forcing a replan.

This routine is meaningless unless the core (or FDW) does not allow a
user mapping to be created for such FDWs. Without that, core code would
get confused as to what it should do when it sees a user mapping for an
FDW which says user mappings are meaningless.

The core wouldn't care about such a user mapping for the FDW; the core would just ignore the user mapping. No?

See build_join_rel(). I would like to see, how do you change the conditions below in that function with your proposal.
468     /*
469      * Set up foreign-join fields if outer and inner relation are foreign
470      * tables (or joins) belonging to the same server and using the same user
471      * mapping.
472      *
473      * Otherwise those fields are left invalid, so FDW API will not be called
474      * for the join relation.
475      *
476      * For FDWs like file_fdw, which ignore user mapping, the user mapping id
477      * associated with the joining relation may be invalid. A valid serverid
478      * distinguishes between a pushed down join with no user mapping and a
479      * join which can not be pushed down because of user mapping mismatch.
480      */
481     if (OidIsValid(outer_rel->serverid) &&
482         inner_rel->serverid == outer_rel->serverid &&
483         inner_rel->umid == outer_rel->umid)
484     {
485         joinrel->serverid = outer_rel->serverid;
486         joinrel->umid = outer_rel->umid;
487         joinrel->fdwroutine = outer_rel->fdwroutine;
488     }

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Oddity in handling of cached plans for FDW queries

From

Tom Lane

Date:

14 July 2016, 16:32:17

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:
> Exactly, for a rare scenario, should we be penalizing large number of plans
> or just continue to use a previously prepared plan when an optimal plan has
> become available because of changed condition. I would choose second over
> the first as it doesn't make things worse than they are.

You seem to be arguing from a premise that the worst consequence is use of
an inefficient plan.  This is false; the worst consequence is use of a
*wrong* plan, specifically one where a join has been pushed down but doing
so is no longer legal because of a user mapping change.  That is, it was
previously correct to access the two remote tables under the same remote
userID but now they should be accessed under different userIDs.  The
result will be that one or the other remote table is accessed under a
userID that is not what the current user mappings say should be used.
That could lead to either unexpected permissions failures (if the
actually-used userID lacks needed permissions) or security breakdowns
(if the actually-used userID has permissions to do something but the
query should not have been allowed to do it).

I do not think fixing this is optional.

I concur with Etsuro-san's dislike for hasForeignJoin; that flag is
underspecified and doesn't convey nearly enough information.  I do not
think a uses_user_mapping flag is much better.  ISTM what should happen is
that any time we decide to push down a foreign join, we should record the
identity of the common user mapping that made that pushdown possible in
the plan's invalItems list.  That would make it possible to invalidate
only the relevant plans when a user mapping is changed.

I believe what Ashutosh is focusing on is whether, when some user mapping
changes, we should invalidate all plans that could potentially now use a
pushed-down join but did not before.  I tend to agree that that's probably
not something we want to do, as it would be very hard to identify just the
plans likely to benefit.  So we would cause replanning of a lot of queries
that won't actually benefit, and in this direction it is indeed only an
optimization not a correctness matter.

Another way we could attack it would be to record the foreign server OID
as an invalItem for any query that has more than one foreign table
belonging to the same foreign server.  Then, invalidate whenever any user
mapping for that server changes.  This would take care of both the case
where a join pushdown becomes possible and where it stops becoming
possible.  But I'm not sure that the extra invalidations would be
worthwhile.

I'm not excited about Etsuro-san's proposal to record user mapping info
in the plan and skip execution-time mapping lookups altogether.  I think
(1) that's solving a problem that hasn't been proven to be a problem,
(2) it doesn't help unless the FDW code is changed to take advantage of
it, which is unlikely to happen for third-party FDWs, and (3) it opens
the door to a class of new bugs, as any failure to invalidate after a
mapping change would become a security fail even for non-join situations.
        regards, tom lane

Re: Oddity in handling of cached plans for FDW queries

From

Tom Lane

Date:

14 July 2016, 18:49:40

I wrote:
> I concur with Etsuro-san's dislike for hasForeignJoin; that flag is
> underspecified and doesn't convey nearly enough information.  I do not
> think a uses_user_mapping flag is much better.  ISTM what should happen is
> that any time we decide to push down a foreign join, we should record the
> identity of the common user mapping that made that pushdown possible in
> the plan's invalItems list.  That would make it possible to invalidate
> only the relevant plans when a user mapping is changed.

I thought a bit more about this and realized that the above doesn't work
too well.  Initially, a join might have been pushed down on the strength
of both userids mapping to a PUBLIC user mapping for the server.  If now
someone does CREATE USER MAPPING to install a new mapping for one of
those userids, we should invalidate the plan --- but there is certainly
not going to be anything in the plan matching the new user mapping.

> Another way we could attack it would be to record the foreign server OID
> as an invalItem for any query that has more than one foreign table
> belonging to the same foreign server.  Then, invalidate whenever any user
> mapping for that server changes.

And that doesn't work so well either, because the most that the plan inval
code is going to have its hands on is (a hash of) the OID of the user
mapping that changed.  We can't tell which server that's for.

On reflection, it seems to me that we've gone wrong by tying planning to
equality of user mappings at all, and the best way to get out of this is
to not do that.  Instead, let's insist that a join can be pushed down only
if the checkAsUser fields of the relevant RTEs are equal.  If they are,
then the same user mapping must apply to both at runtime, whatever it is
--- and we don't need to predetermine that.  With this approach, the need
for plan invalidation due to user mapping changes goes away entirely.

This doesn't cost us anything at all in simple cases such as direct
execution of a query, because all the checkAsUser fields will be equal
(i.e., zero).  And it also doesn't hurt if the potential foreign join is
encapsulated in a view, where all the checkAsUser fields would contain
the OID of the view owner.

The situation where we potentially lose something is a case like
Etsuro-san's original example, where the query contains one foreign table
reference coming from a view and one coming from the outer query, or maybe
from a different view.  In the two-views case we would have to not push
down the join if the views have different owners, even though perhaps both
owners will use the PUBLIC mapping at runtime.  I think that's a narrow
enough case that we can just live with not optimizing it.  In the
view-and-outer-query case, the simplest answer is that we can't push down
because zero is not equal to the view owner's OID.  We could make that a
little better if we know that the query will be executed as the view
owner, so that the relevant user IDs will be the same at runtime.  There
is already some mechanism in the plan cache to track whether a plan
depends on the identity of the user running it (for RLS), so we could use
that to enforce that a plan containing such a pushed-down join is only run
by the same user that owns the view.

Comments?
        regards, tom lane

Re: Oddity in handling of cached plans for FDW queries

From

Etsuro Fujita

Date:

15 July 2016, 02:42:26

On 2016/07/15 3:49, Tom Lane wrote:
> On reflection, it seems to me that we've gone wrong by tying planning to
> equality of user mappings at all, and the best way to get out of this is
> to not do that.  Instead, let's insist that a join can be pushed down only
> if the checkAsUser fields of the relevant RTEs are equal.  If they are,
> then the same user mapping must apply to both at runtime, whatever it is
> --- and we don't need to predetermine that.  With this approach, the need
> for plan invalidation due to user mapping changes goes away entirely.

+1

> The situation where we potentially lose something is a case like
> Etsuro-san's original example, where the query contains one foreign table
> reference coming from a view and one coming from the outer query, or maybe
> from a different view.  In the two-views case we would have to not push
> down the join if the views have different owners, even though perhaps both
> owners will use the PUBLIC mapping at runtime.  I think that's a narrow
> enough case that we can just live with not optimizing it.  In the
> view-and-outer-query case, the simplest answer is that we can't push down
> because zero is not equal to the view owner's OID.  We could make that a
> little better if we know that the query will be executed as the view
> owner, so that the relevant user IDs will be the same at runtime.  There
> is already some mechanism in the plan cache to track whether a plan
> depends on the identity of the user running it (for RLS), so we could use
> that to enforce that a plan containing such a pushed-down join is only run
> by the same user that owns the view.

Seems reasonable to me.

One thing I'm not sure about is: should we insist that a join can be  
pushed down only if the checkAsUser fields of the relevant RTEs are  
equal in the case where user mappings are meaningless to the FDW, like  
file_fdw?

Best regards,
Etsuro Fujita

Re: Oddity in handling of cached plans for FDW queries

From

Tom Lane

Date:

15 July 2016, 02:49:04

Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> writes:
> One thing I'm not sure about is: should we insist that a join can be  
> pushed down only if the checkAsUser fields of the relevant RTEs are  
> equal in the case where user mappings are meaningless to the FDW, like  
> file_fdw?

If we add a mechanism to let us know that the FDW doesn't care, we could
relax the requirement for such cases.  I don't have a strong opinion on
whether that's worthwhile.  It'd depend in part on how many FDWs there
are that don't care, versus those that do; and I have no idea about that.
        regards, tom lane

Re: Oddity in handling of cached plans for FDW queries

From

Ashutosh Bapat

Date:

15 July 2016, 05:14:03

On Thu, Jul 14, 2016 at 10:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:
> Exactly, for a rare scenario, should we be penalizing large number of plans
> or just continue to use a previously prepared plan when an optimal plan has
> become available because of changed condition. I would choose second over
> the first as it doesn't make things worse than they are.

You seem to be arguing from a premise that the worst consequence is use of
an inefficient plan. This is false; the worst consequence is use of a
*wrong* plan, specifically one where a join has been pushed down but doing
so is no longer legal because of a user mapping change. That is, it was
previously correct to access the two remote tables under the same remote
userID but now they should be accessed under different userIDs. The
result will be that one or the other remote table is accessed under a
userID that is not what the current user mappings say should be used.
That could lead to either unexpected permissions failures (if the
actually-used userID lacks needed permissions) or security breakdowns
(if the actually-used userID has permissions to do something but the
query should not have been allowed to do it).

I do not think fixing this is optional.

There is confusion here. The current situation is this:

If at the time of preparing a statement a join can be pushed down to foreign server, we mark that plan as hasForeignJoin. Before execution, if user mapping changes (add/modify/drop), all plans with hasForeignJoin are invalidated. This means the situation you are describing above does not exist in the current code. So, nothing needs to be fixed.

I concur with Etsuro-san's dislike for hasForeignJoin; that flag is
underspecified and doesn't convey nearly enough information. I do not
think a uses_user_mapping flag is much better. ISTM what should happen is
that any time we decide to push down a foreign join, we should record the
identity of the common user mapping that made that pushdown possible in
the plan's invalItems list. That would make it possible to invalidate
only the relevant plans when a user mapping is changed.

This, as you have proposed, does not solve the problem either. Consider a case when a public user mapping was associated with the foreign tables being joined and the join was pushed down while preparing statement and plan. Before execution, a user mapping was added which changed one of the table's associated user mapping from public to the new one. Now the join is not pushable, but the user mapping that was added/changed was not recorded in invalItems, which contains the public user mapping. An improvement to your proposal would be to invalidate plans related to a public user mapping, when any user mapping is added/changed/dropped. But I guess, there are also some problems there too. Adding/dropping/changing a user mapping affects other user mappings as well, unlike say doing that to tables or views. When implementing this, we debated whether it's worth to add that much complexity. We favoured not adding the complexity that time. The code as is not optimistic and might lead to using already created suboptimal plans, but it doesn't have any known bugs there (assuming, I have explained why the above situation doesn't lead to a bug).

I believe what Ashutosh is focusing on is whether, when some user mapping
changes, we should invalidate all plans that could potentially now use a
pushed-down join but did not before. I tend to agree that that's probably
not something we want to do, as it would be very hard to identify just the
plans likely to benefit. So we would cause replanning of a lot of queries
that won't actually benefit, and in this direction it is indeed only an
optimization not a correctness matter.

Right, that's my argument.

Another way we could attack it would be to record the foreign server OID
as an invalItem for any query that has more than one foreign table
belonging to the same foreign server. Then, invalidate whenever any user
mapping for that server changes. This would take care of both the case
where a join pushdown becomes possible and where it stops becoming
possible. But I'm not sure that the extra invalidations would be
worthwhile.

Yes. That wouldn't have the problem I described above. Again, I am not sure whether it's worth making code complex for that case. User mappings are not something added/dropped/modified very frequently.

I'm not excited about Etsuro-san's proposal to record user mapping info
in the plan and skip execution-time mapping lookups altogether. I think
(1) that's solving a problem that hasn't been proven to be a problem,
(2) it doesn't help unless the FDW code is changed to take advantage of
it, which is unlikely to happen for third-party FDWs, and (3) it opens
the door to a class of new bugs, as any failure to invalidate after a
mapping change would become a security fail even for non-join situations.

Yes, I agree.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Oddity in handling of cached plans for FDW queries

From

Ashutosh Bapat

Date:

15 July 2016, 05:27:30

On Fri, Jul 15, 2016 at 12:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:
> I concur with Etsuro-san's dislike for hasForeignJoin; that flag is
> underspecified and doesn't convey nearly enough information. I do not
> think a uses_user_mapping flag is much better. ISTM what should happen is
> that any time we decide to push down a foreign join, we should record the
> identity of the common user mapping that made that pushdown possible in
> the plan's invalItems list. That would make it possible to invalidate
> only the relevant plans when a user mapping is changed.

I thought a bit more about this and realized that the above doesn't work
too well. Initially, a join might have been pushed down on the strength
of both userids mapping to a PUBLIC user mapping for the server. If now
someone does CREATE USER MAPPING to install a new mapping for one of
those userids, we should invalidate the plan --- but there is certainly
not going to be anything in the plan matching the new user mapping.

I replied to your earlier mail before reading this. Ok, so we agree there.

> Another way we could attack it would be to record the foreign server OID
> as an invalItem for any query that has more than one foreign table
> belonging to the same foreign server. Then, invalidate whenever any user
> mapping for that server changes.

And that doesn't work so well either, because the most that the plan inval
code is going to have its hands on is (a hash of) the OID of the user
mapping that changed. We can't tell which server that's for.

I assumed that there is a way to get server's oid from user mapping or we record it to be passed to the invalidation logic. Looks like there's no easy way to do that.

On reflection, it seems to me that we've gone wrong by tying planning to
equality of user mappings at all, and the best way to get out of this is
to not do that. Instead, let's insist that a join can be pushed down only
if the checkAsUser fields of the relevant RTEs are equal. If they are,
then the same user mapping must apply to both at runtime, whatever it is
--- and we don't need to predetermine that. With this approach, the need
for plan invalidation due to user mapping changes goes away entirely.

I have already explained in my earlier mail, that the problem you described doesn't exist. With the invalidation logic we are able to also support pushing down joins between table with different effective user.

This doesn't cost us anything at all in simple cases such as direct
execution of a query, because all the checkAsUser fields will be equal
(i.e., zero). And it also doesn't hurt if the potential foreign join is
encapsulated in a view, where all the checkAsUser fields would contain
the OID of the view owner.

The situation where we potentially lose something is a case like
Etsuro-san's original example, where the query contains one foreign table
reference coming from a view and one coming from the outer query, or maybe
from a different view. In the two-views case we would have to not push
down the join if the views have different owners, even though perhaps both
owners will use the PUBLIC mapping at runtime. I think that's a narrow
enough case that we can just live with not optimizing it. In the
view-and-outer-query case, the simplest answer is that we can't push down
because zero is not equal to the view owner's OID. We could make that a
little better if we know that the query will be executed as the view
owner, so that the relevant user IDs will be the same at runtime. There
is already some mechanism in the plan cache to track whether a plan
depends on the identity of the user running it (for RLS), so we could use
that to enforce that a plan containing such a pushed-down join is only run
by the same user that owns the view.

Join between views on foreign tables or between foreign tables and views containing foreign tables won't be rare. This feature is yet to be released, so we don't know if PostgreSQL users would find it useful. But I do see Oracle users joining views on dblink tables. I would guess same would be the case in PostgreSQL. But I would like to hear from other PostgreSQL FDW users. In such cases, being able to push down a join between foreign tables across view boundaries will be useful.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Oddity in handling of cached plans for FDW queries

From

Etsuro Fujita

Date:

15 July 2016, 13:02:13

On 2016/07/15 11:48, Tom Lane wrote:
> Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> writes:
>> One thing I'm not sure about is: should we insist that a join can be
>> pushed down only if the checkAsUser fields of the relevant RTEs are
>> equal in the case where user mappings are meaningless to the FDW, like
>> file_fdw?

> If we add a mechanism to let us know that the FDW doesn't care, we could
> relax the requirement for such cases.  I don't have a strong opinion on
> whether that's worthwhile.  It'd depend in part on how many FDWs there
> are that don't care, versus those that do; and I have no idea about that.

So, I'd vote for leaving that for future work if necessary.

Here is a patch for that redesign proposed by you; reverts commits
fbe5a3fb73102c2cfec11aaaa4a67943f4474383 and
5d4171d1c70edfe3e9be1de9e66603af28e3afe1, adds changes for that redesign
to the core, and adjusts the postgres_fdw code to that changes.  Also, I
rearranged the postgres_fdw regression tests to match that changes.

Best regards,
Etsuro Fujita

Attachment

fdw-join-pushdown-checkAsUser.patch

Re: Oddity in handling of cached plans for FDW queries

From

Tom Lane

Date:

15 July 2016, 13:36:03

Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> writes:
> On 2016/07/15 11:48, Tom Lane wrote:
>> If we add a mechanism to let us know that the FDW doesn't care, we could
>> relax the requirement for such cases.  I don't have a strong opinion on
>> whether that's worthwhile.  It'd depend in part on how many FDWs there
>> are that don't care, versus those that do; and I have no idea about that.

> So, I'd vote for leaving that for future work if necessary.

Makes sense to me.

> Here is a patch for that redesign proposed by you; reverts commits  
> fbe5a3fb73102c2cfec11aaaa4a67943f4474383 and  
> 5d4171d1c70edfe3e9be1de9e66603af28e3afe1, adds changes for that redesign  
> to the core, and adjusts the postgres_fdw code to that changes.  Also, I  
> rearranged the postgres_fdw regression tests to match that changes.

OK, I'll review this later today.
        regards, tom lane

Re: Oddity in handling of cached plans for FDW queries

From

Tom Lane

Date:

15 July 2016, 21:25:17

I wrote:
> Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> writes:
>> Here is a patch for that redesign proposed by you; reverts commits  
>> fbe5a3fb73102c2cfec11aaaa4a67943f4474383 and  
>> 5d4171d1c70edfe3e9be1de9e66603af28e3afe1, adds changes for that redesign  
>> to the core, and adjusts the postgres_fdw code to that changes.  Also, I  
>> rearranged the postgres_fdw regression tests to match that changes.

> OK, I'll review this later today.

Pushed, after fooling around with it some more so that it would cover the
case we discussed where the join could be allowed if we restrict the plan
to be executed by the owner of a view used in the query.
        regards, tom lane

Re: Oddity in handling of cached plans for FDW queries

From

Robert Haas

Date:

20 July 2016, 12:00:12

On Fri, Jul 15, 2016 at 5:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> writes:
>>> Here is a patch for that redesign proposed by you; reverts commits
>>> fbe5a3fb73102c2cfec11aaaa4a67943f4474383 and
>>> 5d4171d1c70edfe3e9be1de9e66603af28e3afe1, adds changes for that redesign
>>> to the core, and adjusts the postgres_fdw code to that changes.  Also, I
>>> rearranged the postgres_fdw regression tests to match that changes.
>
>> OK, I'll review this later today.
>
> Pushed, after fooling around with it some more so that it would cover the
> case we discussed where the join could be allowed if we restrict the plan
> to be executed by the owner of a view used in the query.

Mumble.  Why, exactly, was this a good idea?  The upside of commit
45639a0525a58a2700cf46d4c934d6de78349dac is only that you do fewer
plan invalidations, but surely that's not a significant benefit for
most people: user mappings don't change that often.  On the downside,
there are now cases where joins would have gotten pushed down
previously and now they won't.  In other words, you've saved some
replanning activity at the cost of delivering worse plans.  That seems
pretty suspect to me, although I grant that the scenarios where either
the old or the new behavior is actually a problem are all somewhat off
the beaten path.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Oddity in handling of cached plans for FDW queries

From

Tom Lane

Date:

20 July 2016, 14:15:33

Robert Haas <robertmhaas@gmail.com> writes:
> Mumble.  Why, exactly, was this a good idea?  The upside of commit
> 45639a0525a58a2700cf46d4c934d6de78349dac is only that you do fewer
> plan invalidations, but surely that's not a significant benefit for
> most people: user mappings don't change that often.  On the downside,
> there are now cases where joins would have gotten pushed down
> previously and now they won't.  In other words, you've saved some
> replanning activity at the cost of delivering worse plans.  That seems
> pretty suspect to me, although I grant that the scenarios where either
> the old or the new behavior is actually a problem are all somewhat off
> the beaten path.

I think that you are undervaluing the removal of user-mapping-based plan
invalidation.  That was never more than a kluge, and here is the reason:
we have no way to lock user mappings.  The system whereby we invalidate
plans as a consequence of table DDL changes is bulletproof, because we
(re) acquire locks on the tables used in the plan, then check for
invalidation signals, before deciding whether the plan is stale.  The
corresponding scenario where a user mapping changes between that check
and execution time is unprotected, so that we could end up using a plan
that is insecure for the mappings selected at execution.

Another way we could have removed the race condition is the suggestion
made upthread of embedding the user mapping details right into the plan
instead of looking them up afresh at execution.  But I didn't much like
that approach, per upthread discussion.
        regards, tom lane

Re: Oddity in handling of cached plans for FDW queries

From

Robert Haas

Date:

20 July 2016, 14:57:24

On Wed, Jul 20, 2016 at 10:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Mumble.  Why, exactly, was this a good idea?  The upside of commit
>> 45639a0525a58a2700cf46d4c934d6de78349dac is only that you do fewer
>> plan invalidations, but surely that's not a significant benefit for
>> most people: user mappings don't change that often.  On the downside,
>> there are now cases where joins would have gotten pushed down
>> previously and now they won't.  In other words, you've saved some
>> replanning activity at the cost of delivering worse plans.  That seems
>> pretty suspect to me, although I grant that the scenarios where either
>> the old or the new behavior is actually a problem are all somewhat off
>> the beaten path.
>
> I think that you are undervaluing the removal of user-mapping-based plan
> invalidation.  That was never more than a kluge, and here is the reason:
> we have no way to lock user mappings.  The system whereby we invalidate
> plans as a consequence of table DDL changes is bulletproof, because we
> (re) acquire locks on the tables used in the plan, then check for
> invalidation signals, before deciding whether the plan is stale.  The
> corresponding scenario where a user mapping changes between that check
> and execution time is unprotected, so that we could end up using a plan
> that is insecure for the mappings selected at execution.

OK, that's a fair point.  Thanks for explaining.

> Another way we could have removed the race condition is the suggestion
> made upthread of embedding the user mapping details right into the plan
> instead of looking them up afresh at execution.  But I didn't much like
> that approach, per upthread discussion.

OK.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Oddity in handling of cached plans for FDW queries

From

Etsuro Fujita

Date:

21 July 2016, 07:40:09

On 2016/07/16 6:25, Tom Lane wrote:
> Pushed, after fooling around with it some more so that it would cover the
> case we discussed where the join could be allowed if we restrict the plan
> to be executed by the owner of a view used in the query.

I left that for future work, but I'm happy to have that in 9.6.  Thanks  
for improving and committing the patch!

One thing I'd like to discuss is GetUserMappingById/GetUserMappingId.  
Though those functions aren't used in any places, I didn't take them  
out, because I thought somebody else would need them someday.  But  
considering that user mapping OIDs now aren't available in RelOptInfos,  
at least the former might be rather useless.  Should we keep them in core?

Best regards,
Etsuro Fujita

Re: Oddity in handling of cached plans for FDW queries

From

Etsuro Fujita

Date:

22 July 2016, 03:47:05

On 2016/07/21 16:30, Etsuro Fujita wrote:
> One thing I'd like to discuss is GetUserMappingById/GetUserMappingId.
> Though those functions aren't used in any places, I didn't take them
> out, because I thought somebody else would need them someday.  But
> considering that user mapping OIDs now aren't available in RelOptInfos,
> at least the former might be rather useless.  Should we keep them in core?

I thought a bit more about this.  ISTM that there isn't much value in
the latter either, because we can use GetUserMapping and get that OID
from the result, instead of using the latter, so I'd vote for removing
those functions.  Attached is a patch for that.

Best regards,
Etsuro Fujita

Attachment

remove-usermapping-helper-funcs.patch

Re: Oddity in handling of cached plans for FDW queries

From

Tom Lane

Date:

22 July 2016, 15:33:11

Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> writes:
> On 2016/07/21 16:30, Etsuro Fujita wrote:
>> One thing I'd like to discuss is GetUserMappingById/GetUserMappingId.
>> Though those functions aren't used in any places, I didn't take them
>> out, because I thought somebody else would need them someday.  But
>> considering that user mapping OIDs now aren't available in RelOptInfos,
>> at least the former might be rather useless.  Should we keep them in core?

> I thought a bit more about this.  ISTM that there isn't much value in  
> the latter either, because we can use GetUserMapping and get that OID  
> from the result, instead of using the latter, so I'd vote for removing  
> those functions.  Attached is a patch for that.

Agreed - pushed.
        regards, tom lane