Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash(segfault) - Mailing list pgsql-bugs

From Etsuro Fujita
Subject Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash(segfault)
Date
Msg-id 5CC2FC12.6050409@lab.ntt.co.jp
Whole thread Raw
In response to Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
(2019/04/26 3:24), Tom Lane wrote:
> PG Bug reporting form<noreply@postgresql.org>  writes:
>> [ this crashes if ft4 is a postgres_fdw foreign table: ]
>> select exists(select c1 from ft4), avg(c1) from ft4 where c1 = (select
>> max(c1) from ft4);
>
> Hm, the max() subquery isn't necessary, this is sufficient:
>
> select exists(select c1 from ft4), avg(c1) from ft4 where c1 = 42;
>
> That produces a plan like
>
>                                      QUERY PLAN
> -----------------------------------------------------------------------------------
>   Foreign Scan  (cost=200.07..246.67 rows=1 width=33)
>     Output: ($0), (avg(ft4.c1))
>     Relations: Aggregate on (public.ft4)
>     Remote SQL: SELECT $1::boolean, avg(c1) FROM "S 1"."T 3" WHERE ((c1 = 432))
>     InitPlan 1 (returns $0)
>       ->   Foreign Scan on public.ft4 ft4_1  (cost=100.00..212.39 rows=3413 width=0)
>             Remote SQL: SELECT NULL FROM "S 1"."T 3"
> (7 rows)
>
> Now one's first observation about that is that it's kind of dumb to send
> the result of the locally-executed InitPlan over to the far end only to
> read it back.  So maybe we should be thinking about how to avoid that.
> We do avoid it for plain foreign scans:
>
> regression=# explain verbose
>   select exists(select c1 from ft4), * from ft4 where c1 = 42;
>                                      QUERY PLAN
> -----------------------------------------------------------------------------------
>   Foreign Scan on public.ft4  (cost=200.03..226.15 rows=6 width=41)
>     Output: $0, ft4.c1, ft4.c2, ft4.c3
>     Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 3" WHERE ((c1 = 42))
>     InitPlan 1 (returns $0)
>       ->   Foreign Scan on public.ft4 ft4_1  (cost=100.00..212.39 rows=3413 width=0)
>             Remote SQL: SELECT NULL FROM "S 1"."T 3"
> (6 rows)
>
> and also for foreign joins:
>
> regression=# explain verbose
>   select exists(select c1 from ft4), * from ft4, ft4 ft4b where ft4.c1 = 42 and ft4b.c1 = 43;
>                                                                        QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------------------
>   Foreign Scan  (cost=200.03..252.93 rows=36 width=81)
>     Output: $0, ft4.c1, ft4.c2, ft4.c3, ft4b.c1, ft4b.c2, ft4b.c3
>     Relations: (public.ft4) INNER JOIN (public.ft4 ft4b)
>     Remote SQL: SELECT r1.c1, r1.c2, r1.c3, r2.c1, r2.c2, r2.c3 FROM ("S 1"."T 3" r1 INNER JOIN "S 1"."T 3" r2 ON
(((r2.c1= 43)) AND ((r1.c1 = 42))))
 
>     InitPlan 1 (returns $0)
>       ->   Foreign Scan on public.ft4 ft4_1  (cost=100.00..212.39 rows=3413 width=0)
>             Remote SQL: SELECT NULL FROM "S 1"."T 3"
> (7 rows)
>
>
> but the code for upper-relation scans is apparently stupider than either
> of those cases.
>
> The proximate cause of the crash is that we have {PARAM 1}
> (representing the output of the InitPlan) in the path's fdw_exprs, and
> also the identical expression in fdw_scan_tlist, and that means that when
> setrefs.c processes the ForeignScan node it thinks it should replace the
> {PARAM 1} in fdw_exprs with a Var representing a reference to the
> fdw_scan_tlist entry.  That would be fine if the fdw_exprs represented
> expressions to be evaluated over the output of the foreign scan, but of
> course they don't --- postgres_fdw uses fdw_exprs to compute values to be
> sent to the remote end, instead.  So we crash at runtime because there's
> no slot to supply such output to the fdw_exprs.
>
> I was able to make the crash go away by removing this statement from
> set_foreignscan_references:
>
>          fscan->fdw_exprs = (List *)
>              fix_upper_expr(root,
>                             (Node *) fscan->fdw_exprs,
>                             itlist,
>                             INDEX_VAR,
>                             rtoffset);
>
> and we still pass check-world without that (which means we lack test
> coverage, because the minimum that should happen to fdw_exprs is
> fix_scan_list :-().  But I do not think that's an acceptable route to
> a patch, because it amounts to having the core code know what the FDW
> is using fdw_exprs for, and we explicitly disclaim any assumptions about
> that.  fdw_exprs is specified to be processed the same as other
> expressions in the same plan node, so I think this fix_upper_expr call
> probably ought to stay like it is, even though it's not really the right
> thing for postgres_fdw.  It might be the right thing for other FDWs.
>
> (One could imagine, perhaps, having some flag in the ForeignPlan
> node that tells setrefs.c what to do.  But that would be an API break
> for FDWs, so it wouldn't be a back-patchable solution.)
>
> (Actually, it seems to me that set_foreignscan_references is *already*
> assuming too much about the semantics of these expressions in upper
> plan nodes, so maybe we need to have a chat about that anyway.)
>
> If we do leave it like this, then the only way for postgres_fdw to
> avoid trouble is to not have any entries in fdw_exprs that exactly
> match entries in fdw_scan_tlist.  So that pretty much devolves back
> to what I said before: don't ship values to the far end that are
> just going to be fed back as-is.  But now it's a correctness
> requirement not just an optimization.

Thanks for taking care of this, as usual!

> I haven't had anything to do with postgres_fdw's upper-relation-pushdown
> code, so I am not sure why it's stupider than the other cases.
> Thoughts anybody?

I worked on the ORDERED/FINAL-upperrel pushdown for PG12, but I don't 
think that that's directly related to this issue, because this arises in 
PG11 already.  Maybe I'm missing something, but the 
UPPERREL_GROUP_AGG-upperrel pushdown added in PG10 is likely to be 
related to this.  I'll work on this issue unless somebody wants to.  But 
I'll take a 10-day vocation from tomorrow, so I don't think I'll be able 
to fix this in the next minor release...

Best regards,
Etsuro Fujita




pgsql-bugs by date:

Previous
From: Flo Rance
Date:
Subject: Re: Reg: Postgresql8.3 Using on Ubuntu
Next
From: neelaveni
Date:
Subject: Re: Reg: Postgresql8.3 Using on Ubuntu