Thread: BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)
BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)
From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      15781
Logged by:          Sean Johnston
Email address:      sean.johnston@edgeintelligence.com
PostgreSQL version: 11.2
Operating system:   docker image postgres:11.2
Description:

Example Query:

select exists(select c1 from ft4), avg(c1) from ft4
where c1 = (select max(c1) from ft4);

Full Steps (modified from the postgres_fdw regression tests):

CREATE EXTENSION postgres_fdw;
CREATE SERVER testserver1 FOREIGN DATA WRAPPER postgres_fdw;
DO $d$
    BEGIN
        EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
            OPTIONS (dbname '$$||current_database()||$$',
                     port '$$||current_setting('port')||$$'
            )$$;
    END;
$d$;
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
CREATE SCHEMA "S 1";
CREATE TABLE "S 1"."T 3" (
    c1 int NOT NULL,
    c2 int NOT NULL,
    c3 text,
    CONSTRAINT t3_pkey PRIMARY KEY (c1)
);
CREATE FOREIGN TABLE ft4 (
    c1 int NOT NULL,
    c2 int NOT NULL,
    c3 text
) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');

select exists(select c1 from ft4), avg(c1) from ft4
where c1 = (select max(c1) from ft4);
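Note that the steps above never load any rows, and the crash reproduces with "S 1"."T 3" left empty. For readers who also want the query to return a meaningful average, a few sample rows can be inserted first; the INSERT and ANALYZE below are an optional convenience added here, not part of the original report:

-- optional: populate the remote table so avg(c1) has something to compute
INSERT INTO "S 1"."T 3"
    SELECT i, i % 10, 'row ' || i
    FROM generate_series(1, 100) AS i;
ANALYZE "S 1"."T 3";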
Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)
From
Sergei Kornilov
Date:
Hi

I can reproduce this on REL_11_STABLE and HEAD. Here is backtrace from REL_11_STABLE:

#0  CheckVarSlotCompatibility (slot=slot@entry=0x0, attnum=1, vartype=16) at execExprInterp.c:1867
#1  0x00005611db3cb342 in CheckExprStillValid (state=state@entry=0x5611dd0fa368, econtext=econtext@entry=0x5611dd0f9730) at execExprInterp.c:1831
#2  0x00005611db3cb36e in ExecInterpExprStillValid (state=0x5611dd0fa368, econtext=0x5611dd0f9730, isNull=0x7ffc524ca89f) at execExprInterp.c:1780
#3  0x00007fc3648bac8d in ExecEvalExpr (isNull=0x7ffc524ca89f, econtext=0x5611dd0f9730, state=<optimized out>) at ../../src/include/executor/executor.h:294
#4  process_query_params (econtext=0x5611dd0f9730, param_flinfo=0x5611dd0fa2d0, param_exprs=<optimized out>, param_values=param_values@entry=0x5611dd0fad50) at postgres_fdw.c:4124
#5  0x00007fc3648baf82 in create_cursor (node=<optimized out>) at postgres_fdw.c:3148
#6  0x00007fc3648bb041 in postgresIterateForeignScan (node=0x5611dd0f9618) at postgres_fdw.c:1451
#7  0x00005611db4026c4 in ForeignNext (node=node@entry=0x5611dd0f9618) at nodeForeignscan.c:54
#8  0x00005611db3db4ff in ExecScanFetch (recheckMtd=0x5611db40256e <ForeignRecheck>, accessMtd=0x5611db402650 <ForeignNext>, node=0x5611dd0f9618) at execScan.c:95
#9  ExecScan (node=0x5611dd0f9618, accessMtd=accessMtd@entry=0x5611db402650 <ForeignNext>, recheckMtd=recheckMtd@entry=0x5611db40256e <ForeignRecheck>) at execScan.c:145
#10 0x00005611db40254d in ExecForeignScan (pstate=<optimized out>) at nodeForeignscan.c:121
#11 0x00005611db3d9aa2 in ExecProcNodeFirst (node=0x5611dd0f9618) at execProcnode.c:445
#12 0x00005611db3d2039 in ExecProcNode (node=0x5611dd0f9618) at ../../../src/include/executor/executor.h:247
#13 ExecutePlan (estate=estate@entry=0x5611dd0b2718, planstate=0x5611dd0f9618, use_parallel_mode=<optimized out>, operation=operation@entry=CMD_SELECT, sendTuples=sendTuples@entry=true, numberTuples=numberTuples@entry=0, direction=ForwardScanDirection, dest=0x5611dd0df520, execute_once=true) at execMain.c:1723
#14 0x00005611db3d2c94 in standard_ExecutorRun (queryDesc=0x5611dd0c7be8, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at execMain.c:364
#15 0x00005611db3d2d4f in ExecutorRun (queryDesc=queryDesc@entry=0x5611dd0c7be8, direction=direction@entry=ForwardScanDirection, count=count@entry=0, execute_once=<optimized out>) at execMain.c:307
#16 0x00005611db53f0ed in PortalRunSelect (portal=portal@entry=0x5611dd054278, forward=forward@entry=true, count=0, count@entry=9223372036854775807, dest=dest@entry=0x5611dd0df520) at pquery.c:932
#17 0x00005611db5407de in PortalRun (portal=portal@entry=0x5611dd054278, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x5611dd0df520, altdest=altdest@entry=0x5611dd0df520, completionTag=0x7ffc524cad10 "") at pquery.c:773
#18 0x00005611db53caa9 in exec_simple_query (query_string=query_string@entry=0x5611dcfedac8 "select exists(select c1 from ft4), avg(c1) from ft4 where c1 = (select\nmax(c1) from ft4);") at postgres.c:1145
#19 0x00005611db53e9ce in PostgresMain (argc=<optimized out>, argv=argv@entry=0x5611dd018910, dbname=<optimized out>, username=<optimized out>) at postgres.c:4182
#20 0x00005611db4b8d8b in BackendRun (port=port@entry=0x5611dd0115a0) at postmaster.c:4358
#21 0x00005611db4bbd2f in BackendStartup (port=port@entry=0x5611dd0115a0) at postmaster.c:4030
#22 0x00005611db4bbf52 in ServerLoop () at postmaster.c:1707
#23 0x00005611db4bd459 in PostmasterMain (argc=3, argv=<optimized out>) at postmaster.c:1380
#24 0x00005611db4210c9 in main (argc=3, argv=0x5611dcfe81f0) at main.c:228

Similar from HEAD:

#0  CheckVarSlotCompatibility (slot=slot@entry=0x0, attnum=1, vartype=16) at execExprInterp.c:1850
#1  0x00005581fa6011b7 in CheckExprStillValid (state=state@entry=0x5581fba700c0, econtext=econtext@entry=0x5581fba6f4f0) at execExprInterp.c:1814
#2  0x00005581fa6011e3 in ExecInterpExprStillValid (state=0x5581fba700c0, econtext=0x5581fba6f4f0, isNull=0x7ffcad499ebf) at execExprInterp.c:1763
#3  0x00007f276130d67c in ExecEvalExpr (isNull=0x7ffcad499ebf, econtext=0x5581fba6f4f0, state=<optimized out>) at ../../src/include/executor/executor.h:288
#4  process_query_params (econtext=0x5581fba6f4f0, param_flinfo=0x5581fba70028, param_exprs=<optimized out>, param_values=param_values@entry=0x5581fba70aa8) at postgres_fdw.c:4307
#5  0x00007f276130d982 in create_cursor (node=<optimized out>) at postgres_fdw.c:3247
#6  0x00007f276130da3c in postgresIterateForeignScan (node=0x5581fba6f3d8) at postgres_fdw.c:1517
#7  0x00005581fa63adad in ForeignNext (node=node@entry=0x5581fba6f3d8) at nodeForeignscan.c:54
#8  0x00005581fa61104b in ExecScanFetch (recheckMtd=0x5581fa63adf1 <ForeignRecheck>, accessMtd=0x5581fa63ad2c <ForeignNext>, node=0x5581fba6f3d8) at execScan.c:93
#9  ExecScan (node=0x5581fba6f3d8, accessMtd=accessMtd@entry=0x5581fa63ad2c <ForeignNext>, recheckMtd=recheckMtd@entry=0x5581fa63adf1 <ForeignRecheck>) at execScan.c:143
#10 0x00005581fa63add0 in ExecForeignScan (pstate=<optimized out>) at nodeForeignscan.c:115
#11 0x00005581fa60f3e8 in ExecProcNodeFirst (node=0x5581fba6f3d8) at execProcnode.c:445
#12 0x00005581fa607fdd in ExecProcNode (node=0x5581fba6f3d8) at ../../../src/include/executor/executor.h:239
#13 ExecutePlan (estate=estate@entry=0x5581fba2abb8, planstate=0x5581fba6f3d8, use_parallel_mode=<optimized out>, operation=operation@entry=CMD_SELECT, sendTuples=sendTuples@entry=true, numberTuples=numberTuples@entry=0, direction=ForwardScanDirection, dest=0x5581fba5cac0, execute_once=true) at execMain.c:1648
#14 0x00005581fa608c2a in standard_ExecutorRun (queryDesc=0x5581fba207f8, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at execMain.c:365
#15 0x00005581fa608ce5 in ExecutorRun (queryDesc=queryDesc@entry=0x5581fba207f8, direction=direction@entry=ForwardScanDirection, count=count@entry=0, execute_once=<optimized out>) at execMain.c:309
#16 0x00005581fa782d65 in PortalRunSelect (portal=portal@entry=0x5581fb9bb168, forward=forward@entry=true, count=0, count@entry=9223372036854775807, dest=dest@entry=0x5581fba5cac0) at pquery.c:929
#17 0x00005581fa78442c in PortalRun (portal=portal@entry=0x5581fb9bb168, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x5581fba5cac0, altdest=altdest@entry=0x5581fba5cac0, completionTag=0x7ffcad49a330 "") at pquery.c:770
#18 0x00005581fa780755 in exec_simple_query (query_string=query_string@entry=0x5581fb955ac8 "select exists(select c1 from ft4), avg(c1) from ft4 where c1 = (select\nmax(c1) from ft4);") at postgres.c:1215
#19 0x00005581fa78263d in PostgresMain (argc=<optimized out>, argv=argv@entry=0x5581fb981310, dbname=<optimized out>, username=<optimized out>) at postgres.c:4249
#20 0x00005581fa6f7979 in BackendRun (port=port@entry=0x5581fb978d20) at postmaster.c:4426
#21 0x00005581fa6faa98 in BackendStartup (port=port@entry=0x5581fb978d20) at postmaster.c:4117
#22 0x00005581fa6facbb in ServerLoop () at postmaster.c:1704
#23 0x00005581fa6fc1fc in PostmasterMain (argc=3, argv=<optimized out>) at postmaster.c:1377
#24 0x00005581fa65acf1 in main (argc=3, argv=0x5581fb9501f0) at main.c:228

regards, Sergei
Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> [ this crashes if ft4 is a postgres_fdw foreign table: ]
> select exists(select c1 from ft4), avg(c1) from ft4 where c1 = (select
> max(c1) from ft4);

Hm, the max() subquery isn't necessary, this is sufficient:

select exists(select c1 from ft4), avg(c1) from ft4 where c1 = 42;

That produces a plan like

                                     QUERY PLAN
-----------------------------------------------------------------------------------
 Foreign Scan  (cost=200.07..246.67 rows=1 width=33)
   Output: ($0), (avg(ft4.c1))
   Relations: Aggregate on (public.ft4)
   Remote SQL: SELECT $1::boolean, avg(c1) FROM "S 1"."T 3" WHERE ((c1 = 42))
   InitPlan 1 (returns $0)
     ->  Foreign Scan on public.ft4 ft4_1  (cost=100.00..212.39 rows=3413 width=0)
           Remote SQL: SELECT NULL FROM "S 1"."T 3"
(7 rows)

Now one's first observation about that is that it's kind of dumb to send
the result of the locally-executed InitPlan over to the far end only to
read it back.  So maybe we should be thinking about how to avoid that.
We do avoid it for plain foreign scans:

regression=# explain verbose
select exists(select c1 from ft4), * from ft4 where c1 = 42;
                                     QUERY PLAN
-----------------------------------------------------------------------------------
 Foreign Scan on public.ft4  (cost=200.03..226.15 rows=6 width=41)
   Output: $0, ft4.c1, ft4.c2, ft4.c3
   Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 3" WHERE ((c1 = 42))
   InitPlan 1 (returns $0)
     ->  Foreign Scan on public.ft4 ft4_1  (cost=100.00..212.39 rows=3413 width=0)
           Remote SQL: SELECT NULL FROM "S 1"."T 3"
(6 rows)

and also for foreign joins:

regression=# explain verbose
select exists(select c1 from ft4), * from ft4, ft4 ft4b where ft4.c1 = 42 and ft4b.c1 = 43;
                                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
 Foreign Scan  (cost=200.03..252.93 rows=36 width=81)
   Output: $0, ft4.c1, ft4.c2, ft4.c3, ft4b.c1, ft4b.c2, ft4b.c3
   Relations: (public.ft4) INNER JOIN (public.ft4 ft4b)
   Remote SQL: SELECT r1.c1, r1.c2, r1.c3, r2.c1, r2.c2, r2.c3 FROM ("S 1"."T 3" r1 INNER JOIN "S 1"."T 3" r2 ON (((r2.c1 = 43)) AND ((r1.c1 = 42))))
   InitPlan 1 (returns $0)
     ->  Foreign Scan on public.ft4 ft4_1  (cost=100.00..212.39 rows=3413 width=0)
           Remote SQL: SELECT NULL FROM "S 1"."T 3"
(7 rows)

but the code for upper-relation scans is apparently stupider than either
of those cases.

The proximate cause of the crash is that we have {PARAM 1}
(representing the output of the InitPlan) in the path's fdw_exprs, and
also the identical expression in fdw_scan_tlist, and that means that when
setrefs.c processes the ForeignScan node it thinks it should replace the
{PARAM 1} in fdw_exprs with a Var representing a reference to the
fdw_scan_tlist entry.  That would be fine if the fdw_exprs represented
expressions to be evaluated over the output of the foreign scan, but of
course they don't --- postgres_fdw uses fdw_exprs to compute values to be
sent to the remote end, instead.  So we crash at runtime because there's
no slot to supply such output to the fdw_exprs.

I was able to make the crash go away by removing this statement from
set_foreignscan_references:

    fscan->fdw_exprs = (List *)
        fix_upper_expr(root,
                       (Node *) fscan->fdw_exprs,
                       itlist,
                       INDEX_VAR,
                       rtoffset);

and we still pass check-world without that (which means we lack test
coverage, because the minimum that should happen to fdw_exprs is
fix_scan_list :-().  But I do not think that's an acceptable route to
a patch, because it amounts to having the core code know what the FDW
is using fdw_exprs for, and we explicitly disclaim any assumptions about
that.  fdw_exprs is specified to be processed the same as other
expressions in the same plan node, so I think this fix_upper_expr call
probably ought to stay like it is, even though it's not really the right
thing for postgres_fdw.  It might be the right thing for other FDWs.

(One could imagine, perhaps, having some flag in the ForeignPlan
node that tells setrefs.c what to do.  But that would be an API break
for FDWs, so it wouldn't be a back-patchable solution.)

(Actually, it seems to me that set_foreignscan_references is *already*
assuming too much about the semantics of these expressions in upper
plan nodes, so maybe we need to have a chat about that anyway.)

If we do leave it like this, then the only way for postgres_fdw to
avoid trouble is to not have any entries in fdw_exprs that exactly
match entries in fdw_scan_tlist.  So that pretty much devolves back
to what I said before: don't ship values to the far end that are
just going to be fed back as-is.  But now it's a correctness
requirement not just an optimization.

I haven't had anything to do with postgres_fdw's upper-relation-pushdown
code, so I am not sure why it's stupider than the other cases.
Thoughts anybody?

			regards, tom lane
Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)
From
Dmitry Dolgov
Date:
> On Thu, Apr 25, 2019 at 8:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> The proximate cause of the crash is that we have {PARAM 1}
> (representing the output of the InitPlan) in the path's fdw_exprs, and
> also the identical expression in fdw_scan_tlist, and that means that when
> setrefs.c processes the ForeignScan node it thinks it should replace the
> {PARAM 1} in fdw_exprs with a Var representing a reference to the
> fdw_scan_tlist entry.

I've noticed that it behaves like that since f9f63ed1f2e5 (originally I found
it pretty strange, but after this explanation it does make sense). As an
experiment, I've changed the position of the condition

    if (context->subplan_itlist->has_non_vars)

back - that also made the problem disappear, and what was interesting is that
the test case for update (exactly what this commit was fixing) does not crash
either. I've checked the commit right before f9f63ed1f2e5: without the
mentioned reordering there is a crash, but I couldn't reproduce it on master.
Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)
From
Tom Lane
Date:
Dmitry Dolgov <9erthalion6@gmail.com> writes:
>> On Thu, Apr 25, 2019 at 8:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The proximate cause of the crash is that we have {PARAM 1}
>> (representing the output of the InitPlan) in the path's fdw_exprs, and
>> also the identical expression in fdw_scan_tlist, and that means that when
>> setrefs.c processes the ForeignScan node it thinks it should replace the
>> {PARAM 1} in fdw_exprs with a Var representing a reference to the
>> fdw_scan_tlist entry.

> I've noticed that it behaves like that since f9f63ed1f2e5 (originally I found
> it pretty strange, but after this explanation it does make sense). As an
> experiment, I've changed the position of the condition
>     if (context->subplan_itlist->has_non_vars)
> back - that also made the problem disappear,

Well, that's just coincidental for the case where the problem fdw_expr is
a Param.  I haven't tried to figure out exactly what upper-path generation
thinks it should put into fdw_exprs, but is it really only Params?

Anyway, ideally we'd not have any entries in fdw_scan_tlist that don't
include at least one foreign Var, and then there can't be a false match.

			regards, tom lane
Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)
From
Tom Lane
Date:
I wrote:
> Well, that's just coincidental for the case where the problem fdw_expr is
> a Param.  I haven't tried to figure out exactly what upper-path generation
> thinks it should put into fdw_exprs, but is it really only Params?

Oh, this is interesting:

regression=# explain verbose
select exists(select c1 from ft4) as c, avg(c1) from ft4 where ft4.c1 = 42;
                                     QUERY PLAN
-----------------------------------------------------------------------------------
 Foreign Scan  (cost=200.07..246.67 rows=1 width=33)
   Output: ($0), (avg(ft4.c1))
   Relations: Aggregate on (public.ft4)
   Remote SQL: SELECT $1::boolean, avg(c1) FROM "S 1"."T 3" WHERE ((c1 = 42))
   InitPlan 1 (returns $0)
     ->  Foreign Scan on public.ft4 ft4_1  (cost=100.00..212.39 rows=3413 width=0)
           Remote SQL: SELECT NULL FROM "S 1"."T 3"
(7 rows)

That would crash if I tried to execute it, but:

regression=# explain verbose
select case when exists(select c1 from ft4) then 1 else 2 end as c, avg(c1) from ft4 where ft4.c1 = 42;
                                     QUERY PLAN
-----------------------------------------------------------------------------------
 Foreign Scan  (cost=200.07..246.67 rows=1 width=36)
   Output: CASE WHEN $0 THEN 1 ELSE 2 END, (avg(ft4.c1))
   Relations: Aggregate on (public.ft4)
   Remote SQL: SELECT avg(c1) FROM "S 1"."T 3" WHERE ((c1 = 42))
   InitPlan 1 (returns $0)
     ->  Foreign Scan on public.ft4 ft4_1  (cost=100.00..212.39 rows=3413 width=0)
           Remote SQL: SELECT NULL FROM "S 1"."T 3"
(7 rows)

That's just fine.  So there is something stupid happening in creation of
the fdw_scan_tlist when a relation tlist item is a bare Param, which
doesn't happen if the same Param is buried in a larger expression.

			regards, tom lane
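A note added in editing, not from the thread itself: Tom's second plan above suggests a stop-gap for servers that do not yet carry a fix. Wrapping the bare EXISTS in a larger expression such as CASE keeps the initplan's Param from becoming a top-level fdw_scan_tlist entry, so a rewrite of the reporter's query along the following lines should avoid the crash (a sketch only, using ft4 from the bug report; untested here):

select case when exists(select c1 from ft4) then true else false end as has_rows,
       avg(c1)
from ft4
where c1 = (select max(c1) from ft4);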
Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)
From
Etsuro Fujita
Date:
(2019/04/26 3:24), Tom Lane wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
>> [ this crashes if ft4 is a postgres_fdw foreign table: ]
>> select exists(select c1 from ft4), avg(c1) from ft4 where c1 = (select
>> max(c1) from ft4);
>
> Hm, the max() subquery isn't necessary, this is sufficient:
>
> select exists(select c1 from ft4), avg(c1) from ft4 where c1 = 42;
>
> That produces a plan like
>
>                                      QUERY PLAN
> -----------------------------------------------------------------------------------
>  Foreign Scan  (cost=200.07..246.67 rows=1 width=33)
>    Output: ($0), (avg(ft4.c1))
>    Relations: Aggregate on (public.ft4)
>    Remote SQL: SELECT $1::boolean, avg(c1) FROM "S 1"."T 3" WHERE ((c1 = 42))
>    InitPlan 1 (returns $0)
>      ->  Foreign Scan on public.ft4 ft4_1  (cost=100.00..212.39 rows=3413 width=0)
>            Remote SQL: SELECT NULL FROM "S 1"."T 3"
> (7 rows)
>
> Now one's first observation about that is that it's kind of dumb to send
> the result of the locally-executed InitPlan over to the far end only to
> read it back.  So maybe we should be thinking about how to avoid that.
> We do avoid it for plain foreign scans:
>
> regression=# explain verbose
> select exists(select c1 from ft4), * from ft4 where c1 = 42;
>                                      QUERY PLAN
> -----------------------------------------------------------------------------------
>  Foreign Scan on public.ft4  (cost=200.03..226.15 rows=6 width=41)
>    Output: $0, ft4.c1, ft4.c2, ft4.c3
>    Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 3" WHERE ((c1 = 42))
>    InitPlan 1 (returns $0)
>      ->  Foreign Scan on public.ft4 ft4_1  (cost=100.00..212.39 rows=3413 width=0)
>            Remote SQL: SELECT NULL FROM "S 1"."T 3"
> (6 rows)
>
> and also for foreign joins:
>
> regression=# explain verbose
> select exists(select c1 from ft4), * from ft4, ft4 ft4b where ft4.c1 = 42 and ft4b.c1 = 43;
>                                                                        QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------
>  Foreign Scan  (cost=200.03..252.93 rows=36 width=81)
>    Output: $0, ft4.c1, ft4.c2, ft4.c3, ft4b.c1, ft4b.c2, ft4b.c3
>    Relations: (public.ft4) INNER JOIN (public.ft4 ft4b)
>    Remote SQL: SELECT r1.c1, r1.c2, r1.c3, r2.c1, r2.c2, r2.c3 FROM ("S 1"."T 3" r1 INNER JOIN "S 1"."T 3" r2 ON (((r2.c1 = 43)) AND ((r1.c1 = 42))))
>    InitPlan 1 (returns $0)
>      ->  Foreign Scan on public.ft4 ft4_1  (cost=100.00..212.39 rows=3413 width=0)
>            Remote SQL: SELECT NULL FROM "S 1"."T 3"
> (7 rows)
>
> but the code for upper-relation scans is apparently stupider than either
> of those cases.
>
> The proximate cause of the crash is that we have {PARAM 1}
> (representing the output of the InitPlan) in the path's fdw_exprs, and
> also the identical expression in fdw_scan_tlist, and that means that when
> setrefs.c processes the ForeignScan node it thinks it should replace the
> {PARAM 1} in fdw_exprs with a Var representing a reference to the
> fdw_scan_tlist entry.  That would be fine if the fdw_exprs represented
> expressions to be evaluated over the output of the foreign scan, but of
> course they don't --- postgres_fdw uses fdw_exprs to compute values to be
> sent to the remote end, instead.  So we crash at runtime because there's
> no slot to supply such output to the fdw_exprs.
>
> I was able to make the crash go away by removing this statement from
> set_foreignscan_references:
>
>     fscan->fdw_exprs = (List *)
>         fix_upper_expr(root,
>                        (Node *) fscan->fdw_exprs,
>                        itlist,
>                        INDEX_VAR,
>                        rtoffset);
>
> and we still pass check-world without that (which means we lack test
> coverage, because the minimum that should happen to fdw_exprs is
> fix_scan_list :-().  But I do not think that's an acceptable route to
> a patch, because it amounts to having the core code know what the FDW
> is using fdw_exprs for, and we explicitly disclaim any assumptions about
> that.  fdw_exprs is specified to be processed the same as other
> expressions in the same plan node, so I think this fix_upper_expr call
> probably ought to stay like it is, even though it's not really the right
> thing for postgres_fdw.  It might be the right thing for other FDWs.
>
> (One could imagine, perhaps, having some flag in the ForeignPlan
> node that tells setrefs.c what to do.  But that would be an API break
> for FDWs, so it wouldn't be a back-patchable solution.)
>
> (Actually, it seems to me that set_foreignscan_references is *already*
> assuming too much about the semantics of these expressions in upper
> plan nodes, so maybe we need to have a chat about that anyway.)
>
> If we do leave it like this, then the only way for postgres_fdw to
> avoid trouble is to not have any entries in fdw_exprs that exactly
> match entries in fdw_scan_tlist.  So that pretty much devolves back
> to what I said before: don't ship values to the far end that are
> just going to be fed back as-is.  But now it's a correctness
> requirement not just an optimization.

Thanks for taking care of this, as usual!

> I haven't had anything to do with postgres_fdw's upper-relation-pushdown
> code, so I am not sure why it's stupider than the other cases.
> Thoughts anybody?

I worked on the ORDERED/FINAL-upperrel pushdown for PG12, but I don't
think that that's directly related to this issue, because this arises in
PG11 already.  Maybe I'm missing something, but the
UPPERREL_GROUP_AGG-upperrel pushdown added in PG10 is likely to be
related to this.  I'll work on this issue unless somebody wants to.  But
I'll take a 10-day vacation from tomorrow, so I don't think I'll be able
to fix this in the next minor release...

Best regards,
Etsuro Fujita
Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)
From
Tom Lane
Date:
Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> writes:
> (2019/04/26 3:24), Tom Lane wrote:
>> If we do leave it like this, then the only way for postgres_fdw to
>> avoid trouble is to not have any entries in fdw_exprs that exactly
>> match entries in fdw_scan_tlist.  So that pretty much devolves back
>> to what I said before: don't ship values to the far end that are
>> just going to be fed back as-is.  But now it's a correctness
>> requirement not just an optimization.

> I worked on the ORDERED/FINAL-upperrel pushdown for PG12, but I don't
> think that that's directly related to this issue, because this arises in
> PG11 already.  Maybe I'm missing something, but the
> UPPERREL_GROUP_AGG-upperrel pushdown added in PG10 is likely to be
> related to this.  I'll work on this issue unless somebody wants to.  But
> I'll take a 10-day vacation from tomorrow, so I don't think I'll be able
> to fix this in the next minor release...

Well, the releases are coming up fast, so I spent some time on this.
If we don't want to change what the core code does with fdw_exprs,
I think the only way to fix it is to hack postgres_fdw so that it
won't generate plans involving the problematic case.  See attached.

We end up with slightly weird-looking plans if the troublesome Param
is actually a GROUP BY expression, but if it's not, I think things
are fine.  Maybe we could do something smarter about the GROUP BY case,
but it seems weird enough to maybe not be worth additional trouble.

			regards, tom lane

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 079406f..c4e3311 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -842,6 +842,55 @@ foreign_expr_walker(Node *node,
 }
 
 /*
+ * Returns true if given expr is something we'd have to send the value of
+ * to the foreign server.
+ *
+ * This should return true when the expression is a shippable node that
+ * deparseExpr would add to context->params_list.  Note that we don't care
+ * if the expression *contains* such a node, only whether one appears at top
+ * level.  We need this to detect cases where setrefs.c would recognize a
+ * false match between an fdw_exprs item (which came from the params_list)
+ * and an entry in fdw_scan_tlist (which we're considering putting the given
+ * expression into).
+ */
+bool
+is_foreign_param(PlannerInfo *root,
+				 RelOptInfo *baserel,
+				 Expr *expr)
+{
+	if (expr == NULL)
+		return false;
+
+	switch (nodeTag(expr))
+	{
+		case T_Var:
+			{
+				/* It would have to be sent unless it's a foreign Var */
+				Var		   *var = (Var *) expr;
+				PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) (baserel->fdw_private);
+				Relids		relids;
+
+				if (IS_UPPER_REL(baserel))
+					relids = fpinfo->outerrel->relids;
+				else
+					relids = baserel->relids;
+
+				if (bms_is_member(var->varno, relids) && var->varlevelsup == 0)
+					return false;	/* foreign Var, so not a param */
+				else
+					return true;	/* it'd have to be a param */
+				break;
+			}
+		case T_Param:
+			/* Params always have to be sent to the foreign server */
+			return true;
+		default:
+			break;
+	}
+	return false;
+}
+
+/*
  * Convert type OID + typmod info into a type name we can ship to the remote
  * server.  Someplace else had better have verified that this type name is
  * expected to be known on the remote end.
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 48ea67a..e034b03 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2707,6 +2707,46 @@ select sum(c1) from ft1 group by c2 having avg(c1 * (random() <= 1)::int) > 100
    Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1"
 (10 rows)
 
+-- Remote aggregate in combination with a local Param (for the output
+-- of an initplan) can be trouble, per bug #15781
+explain (verbose, costs off)
+select exists(select 1 from pg_enum), sum(c1) from ft1;
+                    QUERY PLAN
+--------------------------------------------------
+ Foreign Scan
+   Output: $0, (sum(ft1.c1))
+   Relations: Aggregate on (public.ft1)
+   Remote SQL: SELECT sum("C 1") FROM "S 1"."T 1"
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on pg_catalog.pg_enum
+(6 rows)
+
+select exists(select 1 from pg_enum), sum(c1) from ft1;
+ exists |  sum
+--------+--------
+ t      | 500500
+(1 row)
+
+explain (verbose, costs off)
+select exists(select 1 from pg_enum), sum(c1) from ft1 group by 1;
+                     QUERY PLAN
+---------------------------------------------------
+ GroupAggregate
+   Output: ($0), sum(ft1.c1)
+   Group Key: $0
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on pg_catalog.pg_enum
+   ->  Foreign Scan on public.ft1
+         Output: $0, ft1.c1
+         Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
+(8 rows)
+
+select exists(select 1 from pg_enum), sum(c1) from ft1 group by 1;
+ exists |  sum
+--------+--------
+ t      | 500500
+(1 row)
+
 -- Testing ORDER BY, DISTINCT, FILTER, Ordered-sets and VARIADIC within aggregates
 -- ORDER BY within aggregate, same column used to order
 explain (verbose, costs off)
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 794363c..6c9e320 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -5486,7 +5486,6 @@ foreign_grouping_ok(PlannerInfo *root, RelOptInfo *grouped_rel,
 	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) grouped_rel->fdw_private;
 	PathTarget *grouping_target = grouped_rel->reltarget;
 	PgFdwRelationInfo *ofpinfo;
-	List	   *aggvars;
 	ListCell   *lc;
 	int			i;
 	List	   *tlist = NIL;
@@ -5512,6 +5511,15 @@ foreign_grouping_ok(PlannerInfo *root, RelOptInfo *grouped_rel,
 	 * server.  All GROUP BY expressions will be part of the grouping target
 	 * and thus there is no need to search for them separately.  Add grouping
 	 * expressions into target list which will be passed to foreign server.
+	 *
+	 * A tricky fine point is that we must not put any expression into the
+	 * target list that is just a foreign param (that is, something that
+	 * deparse.c would conclude has to be sent to the foreign server).  If we
+	 * do, the expression will also appear in the fdw_exprs list of the plan
+	 * node, and setrefs.c will get confused and decide that the fdw_exprs
+	 * entry is actually a reference to the fdw_scan_tlist entry, resulting in
+	 * a broken plan.  Somewhat oddly, it's OK if the expression contains such
+	 * a node, as long as it's not at top level; then no match is possible.
 	 */
 	i = 0;
 	foreach(lc, grouping_target->exprs)
@@ -5533,6 +5541,13 @@ foreign_grouping_ok(PlannerInfo *root, RelOptInfo *grouped_rel,
 				return false;
 
 			/*
+			 * If it would be a foreign param, we can't put it into the tlist,
+			 * so we have to fail.
+			 */
+			if (is_foreign_param(root, grouped_rel, expr))
+				return false;
+
+			/*
 			 * Pushable, so add to tlist.  We need to create a TLE for this
 			 * expression and apply the sortgroupref to it.  We cannot use
 			 * add_to_flat_tlist() here because that avoids making duplicate
@@ -5547,9 +5562,11 @@ foreign_grouping_ok(PlannerInfo *root, RelOptInfo *grouped_rel,
 		else
 		{
 			/*
-			 * Non-grouping expression we need to compute.  Is it shippable?
+			 * Non-grouping expression we need to compute.  Can we ship it
+			 * as-is to the foreign server?
 			 */
-			if (is_foreign_expr(root, grouped_rel, expr))
+			if (is_foreign_expr(root, grouped_rel, expr) &&
+				!is_foreign_param(root, grouped_rel, expr))
 			{
 				/* Yes, so add to tlist as-is; OK to suppress duplicates */
 				tlist = add_to_flat_tlist(tlist, list_make1(expr));
@@ -5557,12 +5574,16 @@ foreign_grouping_ok(PlannerInfo *root, RelOptInfo *grouped_rel,
 			else
 			{
 				/* Not pushable as a whole; extract its Vars and aggregates */
+				List	   *aggvars;
+
 				aggvars = pull_var_clause((Node *) expr,
 										  PVC_INCLUDE_AGGREGATES);
 
 				/*
 				 * If any aggregate expression is not shippable, then we
-				 * cannot push down aggregation to the foreign server.
+				 * cannot push down aggregation to the foreign server.  (We
+				 * don't have to check is_foreign_param, since that certainly
+				 * won't return true for any such expression.)
 				 */
 				if (!is_foreign_expr(root, grouped_rel, (Expr *) aggvars))
 					return false;
@@ -5649,7 +5670,8 @@ foreign_grouping_ok(PlannerInfo *root, RelOptInfo *grouped_rel,
 			 * If aggregates within local conditions are not safe to push
 			 * down, then we cannot push down the query.  Vars are already
 			 * part of GROUP BY clause which are checked above, so no need to
-			 * access them again here.
+			 * access them again here.  Again, we need not check
+			 * is_foreign_param for a foreign aggregate.
 			 */
 			if (IsA(expr, Aggref))
 			{
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index b382945..3e4603d 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -146,6 +146,9 @@ extern void classifyConditions(PlannerInfo *root,
 extern bool is_foreign_expr(PlannerInfo *root,
 							RelOptInfo *baserel,
 							Expr *expr);
+extern bool is_foreign_param(PlannerInfo *root,
+							 RelOptInfo *baserel,
+							 Expr *expr);
 extern void deparseInsertSql(StringInfo buf, RangeTblEntry *rte,
 							 Index rtindex, Relation rel,
 							 List *targetAttrs, bool doNothing,
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 127cd30..73852f1 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -685,6 +685,16 @@ select count(*) from (select c5, count(c1) from ft1 group by c5, sqrt(c2) having
 explain (verbose, costs off)
 select sum(c1) from ft1 group by c2 having avg(c1 * (random() <= 1)::int) > 100 order by 1;
 
+-- Remote aggregate in combination with a local Param (for the output
+-- of an initplan) can be trouble, per bug #15781
+explain (verbose, costs off)
+select exists(select 1 from pg_enum), sum(c1) from ft1;
+select exists(select 1 from pg_enum), sum(c1) from ft1;
+
+explain (verbose, costs off)
+select exists(select 1 from pg_enum), sum(c1) from ft1 group by 1;
+select exists(select 1 from pg_enum), sum(c1) from ft1 group by 1;
+
 -- Testing ORDER BY, DISTINCT, FILTER, Ordered-sets and VARIADIC within aggregates
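A quick way to check the fix locally, assuming the ft4 setup from the original bug report: after applying the patch, the initplan's $0 should no longer show up in the generated Remote SQL, and both of the following should complete without a crash.

explain (verbose, costs off)
select exists(select c1 from ft4), avg(c1) from ft4 where c1 = 42;

select exists(select c1 from ft4), avg(c1) from ft4
where c1 = (select max(c1) from ft4);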
Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)
From
Etsuro Fujita
Date:
On Sat, Apr 27, 2019 at 2:10 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > (2019/04/26 3:24), Tom Lane wrote:
> >> If we do leave it like this, then the only way for postgres_fdw to
> >> avoid trouble is to not have any entries in fdw_exprs that exactly
> >> match entries in fdw_scan_tlist.  So that pretty much devolves back
> >> to what I said before: don't ship values to the far end that are
> >> just going to be fed back as-is.  But now it's a correctness
> >> requirement not just an optimization.

> Well, the releases are coming up fast, so I spent some time on this.
> If we don't want to change what the core code does with fdw_exprs,
> I think the only way to fix it is to hack postgres_fdw so that it
> won't generate plans involving the problematic case.

Seems reasonable.

> See attached.

I read the patch.  It looks good to me.  I didn't test it, though.

> We end up with slightly weird-looking plans if the troublesome Param
> is actually a GROUP BY expression, but if it's not, I think things
> are fine.  Maybe we could do something smarter about the GROUP BY case,
> but it seems weird enough to maybe not be worth additional trouble.

Agreed.

Thanks for working on this!

Best regards,
Etsuro Fujita
Re: BUG #15781: subselect on foreign table (postgres_fdw) can crash (segfault)
From
Tom Lane
Date:
Etsuro Fujita <etsuro.fujita@gmail.com> writes:
> On Sat, Apr 27, 2019 at 2:10 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If we don't want to change what the core code does with fdw_exprs,
>> I think the only way to fix it is to hack postgres_fdw so that it
>> won't generate plans involving the problematic case.

> Seems reasonable.

>> See attached.

> I read the patch.  It looks good to me.  I didn't test it, though.

Thanks for looking!  Have a good vacation ...

			regards, tom lane