Thread: Case expression pushdown
Hi. This patch allows pushing case expressions to foreign servers, so that more types of updates could be executed directly. For example, without patch: EXPLAIN (VERBOSE, COSTS OFF) UPDATE ft2 d SET c2 = CASE WHEN c2 > 0 THEN c2 ELSE 0 END WHERE c1 > 1000; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Update on public.ft2 d Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2 WHERE ctid = $1 -> Foreign Scan on public.ft2 d Output: CASE WHEN (c2 > 0) THEN c2 ELSE 0 END, ctid, d.* Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1" WHERE (("C 1" > 1000)) FOR UPDATE EXPLAIN (VERBOSE, COSTS OFF) UPDATE ft2 d SET c2 = CASE WHEN c2 > 0 THEN c2 ELSE 0 END WHERE c1 > 1000; QUERY PLAN ---------------------------------------------------------------------------------------------------------------- Update on public.ft2 d -> Foreign Update on public.ft2 d Remote SQL: UPDATE "S 1"."T 1" SET c2 = (CASE WHEN (c2 > 0) THEN c2 ELSE 0 END) WHERE (("C 1" > 1000)) -- Best regards, Alexander Pyhalov, Postgres Professional
Attachment
Looks quite useful to me. Can you please add this to the next commitfest? On Wed, Jun 9, 2021 at 5:25 PM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote: > > Hi. > > This patch allows pushing case expressions to foreign servers, so that > more types of updates could be executed directly. > > For example, without patch: > > EXPLAIN (VERBOSE, COSTS OFF) > UPDATE ft2 d SET c2 = CASE WHEN c2 > 0 THEN c2 ELSE 0 END > WHERE c1 > 1000; > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------- > Update on public.ft2 d > Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2 WHERE ctid = $1 > -> Foreign Scan on public.ft2 d > Output: CASE WHEN (c2 > 0) THEN c2 ELSE 0 END, ctid, d.* > Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8, ctid FROM > "S 1"."T 1" WHERE (("C 1" > 1000)) FOR UPDATE > > > EXPLAIN (VERBOSE, COSTS OFF) > UPDATE ft2 d SET c2 = CASE WHEN c2 > 0 THEN c2 ELSE 0 END > WHERE c1 > 1000; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------- > Update on public.ft2 d > -> Foreign Update on public.ft2 d > Remote SQL: UPDATE "S 1"."T 1" SET c2 = (CASE WHEN (c2 > 0) > THEN c2 ELSE 0 END) WHERE (("C 1" > 1000)) > > > -- > Best regards, > Alexander Pyhalov, > Postgres Professional -- Best Wishes, Ashutosh Bapat
Hi. Ashutosh Bapat писал 2021-06-15 16:24: > Looks quite useful to me. Can you please add this to the next > commitfest? > Addded to commitfest. Here is an updated patch version. -- Best regards, Alexander Pyhalov, Postgres Professional
Attachment
On 2021-06-16 01:29, Alexander Pyhalov wrote: > Hi. > > Ashutosh Bapat писал 2021-06-15 16:24: >> Looks quite useful to me. Can you please add this to the next >> commitfest? >> > > Addded to commitfest. Here is an updated patch version. Thanks for posting the patch. I agree with this content. > + Foreign Scan on public.ft2 (cost=156.58..165.45 rows=394 width=14) It's not a big issue, but is there any intention behind the pattern of outputting costs in regression tests? Regards, -- Yuki Seino Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Seino Yuki писал 2021-06-22 16:03: > On 2021-06-16 01:29, Alexander Pyhalov wrote: >> Hi. >> >> Ashutosh Bapat писал 2021-06-15 16:24: >>> Looks quite useful to me. Can you please add this to the next >>> commitfest? >>> >> >> Addded to commitfest. Here is an updated patch version. > > Thanks for posting the patch. > I agree with this content. > >> + Foreign Scan on public.ft2 (cost=156.58..165.45 rows=394 width=14) > It's not a big issue, but is there any intention behind the pattern of > outputting costs in regression tests? Hi. No, I don't think it makes much sense. Updated tests (also added case with empty else). -- Best regards, Alexander Pyhalov, Postgres Professional
Attachment
Seino Yuki писал 2021-06-22 16:03:On 2021-06-16 01:29, Alexander Pyhalov wrote:Hi.
Ashutosh Bapat писал 2021-06-15 16:24:Looks quite useful to me. Can you please add this to the next commitfest?
Addded to commitfest. Here is an updated patch version.
Thanks for posting the patch.
I agree with this content.+ Foreign Scan on public.ft2 (cost=156.58..165.45 rows=394 width=14)It's not a big issue, but is there any intention behind the pattern of
outputting costs in regression tests?
Hi.
No, I don't think it makes much sense. Updated tests (also added case with empty else).
The patch doesn't apply anymore to master, I join an update of your patch update in attachment. This is your patch rebased and untouched minus a comment in the test and renamed to v4.
I could have miss something but I don't think that additional struct elements case_args in structs foreign_loc_cxt and deparse_expr_cxt are necessary. They look to be useless.
The patch will also be more clear if the CaseWhen node was handled separately in foreign_expr_walker() instead of being handled in the T_CaseExpr case. By this way the T_CaseExpr case just need to call recursively foreign_expr_walker(). I also think that code in T_CaseTestExpr should just check the collation, there is nothing more to do here like you have commented the function deparseCaseTestExpr(). This function can be removed as it does nothing if the case_args elements are removed.
There is a problem the regression test with nested CASE clauses:
EXPLAIN (VERBOSE, COSTS OFF)
SELECT c1,c2,c3 FROM ft2 WHERE CASE CASE WHEN c2 > 0 THEN c2 END WHEN 100 THEN 601 WHEN c2 THEN c2 ELSE 0 END > 600 ORDER BY c1;
the original query use "WHERE CASE CASE WHEN" but the remote query is not the same in the plan:
Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1" WHERE (((CASE WHEN ((CASE WHEN (c2 > 0) THEN c2 ELSE NULL::integer END) = 100) THEN 601 WHEN ((CASE WHEN (c2 > 0) THEN c2 ELSE NULL::integer END) = c2) THEN c2 ELSE 0 END) > 600)) ORDER BY "C 1" ASC NULLS LAST
Here this is "WHERE (((CASE WHEN ((CASE WHEN" I expected it to be unchanged to "WHERE (((CASE (CASE WHEN".
Also I would like the following regression tests to be added. It test that the CASE clause in aggregate and function is pushed down as well as the aggregate function. This was the original use case that I wanted to fix with this feature.
-- CASE in aggregate function, both must be pushed down
EXPLAIN (VERBOSE, COSTS OFF)
SELECT sum(CASE WHEN mod(c1, 4) = 0 THEN 1 ELSE 2 END) FROM ft1;
-- Same but without the ELSE clause
EXPLAIN (VERBOSE, COSTS OFF)
SELECT sum(CASE WHEN mod(c1, 4) = 0 THEN 1 END) FROM ft1;
For convenience I'm attaching a new patch v5 that change the code following my comments above, fix the nested CASE issue and adds more regression tests.
Best regards,
-- Gilles Darold
Attachment
Hi. Gilles Darold писал 2021-07-07 15:02: > Le 22/06/2021 à 15:39, Alexander Pyhalov a écrit : > >> Seino Yuki писал 2021-06-22 16:03: >> On 2021-06-16 01:29, Alexander Pyhalov wrote: >> Hi. >> >> Ashutosh Bapat писал 2021-06-15 16:24: >> Looks quite useful to me. Can you please add this to the next >> commitfest? >> >> Addded to commitfest. Here is an updated patch version. > > Thanks for posting the patch. > I agree with this content. > >> + Foreign Scan on public.ft2 (cost=156.58..165.45 rows=394 >> width=14) > It's not a big issue, but is there any intention behind the pattern > of > outputting costs in regression tests? > > Hi. > > No, I don't think it makes much sense. Updated tests (also added case > with empty else). > > The patch doesn't apply anymore to master, I join an update of your > patch update in attachment. This is your patch rebased and untouched > minus a comment in the test and renamed to v4. > > I could have miss something but I don't think that additional struct > elements case_args in structs foreign_loc_cxt and deparse_expr_cxt are > necessary. They look to be useless. I thought we should compare arg collation and expression collation and didn't suggest, that we can take CaseTestExpr's collation directly, without deriving it from CaseExpr's arg. Your version of course looks saner. > > The patch will also be more clear if the CaseWhen node was handled > separately in foreign_expr_walker() instead of being handled in the > T_CaseExpr case. By this way the T_CaseExpr case just need to call > recursively foreign_expr_walker(). I also think that code in > T_CaseTestExpr should just check the collation, there is nothing more > to do here like you have commented the function deparseCaseTestExpr(). > This function can be removed as it does nothing if the case_args > elements are removed. > > There is a problem the regression test with nested CASE clauses: > >> EXPLAIN (VERBOSE, COSTS OFF) >> SELECT c1,c2,c3 FROM ft2 WHERE CASE CASE WHEN c2 > 0 THEN c2 END >> WHEN 100 THEN 601 WHEN c2 THEN c2 ELSE 0 END > 600 ORDER BY c1; > > the original query use "WHERE CASE CASE WHEN" but the remote query is > not the same in the plan: > >> Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1" WHERE (((CASE WHEN >> ((CASE WHEN (c2 > 0) THEN c2 ELSE NULL::integer END) = 100) THEN 601 >> WHEN ((CASE WHEN (c2 > 0) THEN c2 ELSE NULL::integer END) = c2) THEN >> c2 ELSE 0 END) > 600)) ORDER BY "C 1" ASC NULLS LAST > > Here this is "WHERE (((CASE WHEN ((CASE WHEN" I expected it to be > unchanged to "WHERE (((CASE (CASE WHEN". I'm not sure this is an issue (as we change CASE A WHEN B ... to CASE WHEN (A=B)), and expressions should be free from side effects, but again your version looks better. Thanks for improving the patch, it looks saner now. -- Best regards, Alexander Pyhalov, Postgres Professional
Le 07/07/2021 à 17:39, Alexander Pyhalov a écrit : > Hi. > > Gilles Darold писал 2021-07-07 15:02: > >> Le 22/06/2021 à 15:39, Alexander Pyhalov a écrit : >> >>> Seino Yuki писал 2021-06-22 16:03: >>> On 2021-06-16 01:29, Alexander Pyhalov wrote: >>> Hi. >>> >>> Ashutosh Bapat писал 2021-06-15 16:24: >>> Looks quite useful to me. Can you please add this to the next >>> commitfest? >>> >>> Addded to commitfest. Here is an updated patch version. >> >> Thanks for posting the patch. >> I agree with this content. >> >>> + Foreign Scan on public.ft2 (cost=156.58..165.45 rows=394 >>> width=14) >> It's not a big issue, but is there any intention behind the pattern >> of >> outputting costs in regression tests? >> >> Hi. >> >> No, I don't think it makes much sense. Updated tests (also added case >> with empty else). >> >> The patch doesn't apply anymore to master, I join an update of your >> patch update in attachment. This is your patch rebased and untouched >> minus a comment in the test and renamed to v4. >> >> I could have miss something but I don't think that additional struct >> elements case_args in structs foreign_loc_cxt and deparse_expr_cxt are >> necessary. They look to be useless. > > I thought we should compare arg collation and expression collation and > didn't suggest, that we can take CaseTestExpr's collation directly, > without deriving it from CaseExpr's arg. Your version of course looks > saner. > >> >> The patch will also be more clear if the CaseWhen node was handled >> separately in foreign_expr_walker() instead of being handled in the >> T_CaseExpr case. By this way the T_CaseExpr case just need to call >> recursively foreign_expr_walker(). I also think that code in >> T_CaseTestExpr should just check the collation, there is nothing more >> to do here like you have commented the function deparseCaseTestExpr(). >> This function can be removed as it does nothing if the case_args >> elements are removed. >> >> There is a problem the regression test with nested CASE clauses: >> >>> EXPLAIN (VERBOSE, COSTS OFF) >>> SELECT c1,c2,c3 FROM ft2 WHERE CASE CASE WHEN c2 > 0 THEN c2 END >>> WHEN 100 THEN 601 WHEN c2 THEN c2 ELSE 0 END > 600 ORDER BY c1; >> >> the original query use "WHERE CASE CASE WHEN" but the remote query is >> not the same in the plan: >> >>> Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1" WHERE (((CASE WHEN >>> ((CASE WHEN (c2 > 0) THEN c2 ELSE NULL::integer END) = 100) THEN 601 >>> WHEN ((CASE WHEN (c2 > 0) THEN c2 ELSE NULL::integer END) = c2) THEN >>> c2 ELSE 0 END) > 600)) ORDER BY "C 1" ASC NULLS LAST >> >> Here this is "WHERE (((CASE WHEN ((CASE WHEN" I expected it to be >> unchanged to "WHERE (((CASE (CASE WHEN". > > I'm not sure this is an issue (as we change CASE A WHEN B ... to CASE > WHEN (A=B)), > and expressions should be free from side effects, but again your version > looks better. Right it returns the same result but I think this is confusing to not see the same case form in the remote query. > > Thanks for improving the patch, it looks saner now. Great, I changing the state in the commitfest to "Ready for committers". -- Gilles Darold MigOps Inc
Le 07/07/2021 à 18:50, Gilles Darold a écrit : > > Great, I changing the state in the commitfest to "Ready for committers". > > I'm attaching the v5 patch again as it doesn't appears in the Latest attachment list in the commitfest. -- Gilles Darold MigOps Inc
Attachment
Le 07/07/2021 à 18:55, Gilles Darold a écrit : > Le 07/07/2021 à 18:50, Gilles Darold a écrit : >> >> Great, I changing the state in the commitfest to "Ready for committers". >> >> > I'm attaching the v5 patch again as it doesn't appears in the Latest > attachment list in the commitfest. > > And the review summary: This patch allows pushing CASE expressions to foreign servers, so that: - more types of updates could be executed directly - full foreign table scan can be avoid - more push down of aggregates function The patch compile and regressions tests with assert enabled passed successfully. There is a compiler warning but it is not related to this patch: deparse.c: In function ‘foreign_expr_walker.isra.0’: deparse.c:891:28: warning: ‘collation’ may be used uninitialized in this function [-Wmaybe-uninitialized] 891 | outer_cxt->collation = collation; | ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ deparse.c:874:10: warning: ‘state’ may be used uninitialized in this function [-Wmaybe-uninitialized] 874 | else if (state == outer_cxt->state) | ^ The regression test for this feature contains the use cases where push down of CASE clause are useful. Nested CASE are also part of the regression tests. The patch adds insignificant overhead by processing further than before a case expression but overall it adds a major performance improvement for queries on foreign tables that use a CASE WHEN clause in a predicate or in an aggregate function. This patch does what it claims to do without detect problem, as expected the CASE clause is not pushed when a local table is involved in the CASE expression of if a non default collation is used. Ready for committers.
Gilles Darold <gilles@migops.com> writes: > I'm attaching the v5 patch again as it doesn't appears in the Latest > attachment list in the commitfest. So this has a few issues: 1. In foreign_expr_walker, you're failing to recurse to either the "arg" or "defresult" subtrees of a CaseExpr, so that it would fail to notice unshippable constructs within those. 2. You're also failing to guard against the hazard that a WHEN expression within a CASE-with-arg has been expanded into something that doesn't look like "CaseTestExpr = something". As written, this patch would likely dump core in that situation, and if it didn't it would send nonsense to the remote server. Take a look at the check for that situation in ruleutils.c (starting at line 8764 as of HEAD) and adapt it to this. Probably what you want is to just deem the CASE un-pushable if it's been modified away from that structure. This is enough of a corner case that optimizing it isn't worth a great deal of trouble ... but crashing is not ok. 3. A potentially uncomfortable issue for the CASE-with-arg syntax is that the specific equality operator being used appears nowhere in the decompiled expression, thus raising the question of whether the remote server will interpret it the same way we did. Given that we restrict the values-to-be-compared to be of shippable types, maybe this is safe in practice, but I have a bad feeling about it. I wonder if we'd be better off just refusing to ship CASE-with-arg at all, which would a-fortiori avoid point 2. 4. I'm not sure that I believe any part of the collation handling. There is the question of what collations will be used for the individual WHEN comparisons, which can probably be left for the recursive checks of the CaseWhen.expr subtrees to handle; and then there is the separate issue of whether the CASE's result collation (which arises from the CaseWhen.result exprs plus the CaseExpr.defresult expr) can be deemed to be safely derived from remote Vars. I haven't totally thought through how that should work, but I'm pretty certain that handling the CaseWhen's within separate recursive invocations of foreign_expr_walker cannot possibly get it right. However, you'll likely have to flatten those anyway (i.e., handle them within the loop in the CaseExpr case) while fixing point 2. 5. This is a cosmetic point, but: the locations of the various additions in deparse.c seem to have been chosen with the aid of a dartboard. We do have a convention for this sort of thing, which is to lay out code concerned with different node types in the same order that the node types are declared in *nodes.h. I'm not sufficiently anal to want to fix the existing violations of that rule that I see in deparse.c; but the fact that somebody got this wrong before isn't license to make things worse. regards, tom lane
Tom Lane писал 2021-07-21 19:49: > Gilles Darold <gilles@migops.com> writes: >> I'm attaching the v5 patch again as it doesn't appears in the Latest >> attachment list in the commitfest. > > So this has a few issues: Hi. > > 1. In foreign_expr_walker, you're failing to recurse to either the > "arg" or "defresult" subtrees of a CaseExpr, so that it would fail > to notice unshippable constructs within those. Fixed this. > > 2. You're also failing to guard against the hazard that a WHEN > expression within a CASE-with-arg has been expanded into something > that doesn't look like "CaseTestExpr = something". As written, > this patch would likely dump core in that situation, and if it didn't > it would send nonsense to the remote server. Take a look at the > check for that situation in ruleutils.c (starting at line 8764 > as of HEAD) and adapt it to this. Probably what you want is to > just deem the CASE un-pushable if it's been modified away from that > structure. This is enough of a corner case that optimizing it > isn't worth a great deal of trouble ... but crashing is not ok. > I think I fixed this by copying check from ruleutils.c. > 3. A potentially uncomfortable issue for the CASE-with-arg syntax > is that the specific equality operator being used appears nowhere > in the decompiled expression, thus raising the question of whether > the remote server will interpret it the same way we did. Given > that we restrict the values-to-be-compared to be of shippable > types, maybe this is safe in practice, but I have a bad feeling > about it. I wonder if we'd be better off just refusing to ship > CASE-with-arg at all, which would a-fortiori avoid point 2. I'm not shure how 'case a when b ...' is different from 'case when a=b ...' in this case. If type of a or b is not shippable, we will not push down this expression in any way. And if they are of builtin types, why do these expressions differ? > > 4. I'm not sure that I believe any part of the collation handling. > There is the question of what collations will be used for the > individual WHEN comparisons, which can probably be left for > the recursive checks of the CaseWhen.expr subtrees to handle; > and then there is the separate issue of whether the CASE's result > collation (which arises from the CaseWhen.result exprs plus the > CaseExpr.defresult expr) can be deemed to be safely derived from > remote Vars. I haven't totally thought through how that should > work, but I'm pretty certain that handling the CaseWhen's within > separate recursive invocations of foreign_expr_walker cannot > possibly get it right. However, you'll likely have to flatten > those anyway (i.e., handle them within the loop in the CaseExpr > case) while fixing point 2. I've tried to account for a fact that we are interested only in caseWhen->result collations, but still not sure that I'm right here. > > 5. This is a cosmetic point, but: the locations of the various > additions in deparse.c seem to have been chosen with the aid > of a dartboard. We do have a convention for this sort of thing, > which is to lay out code concerned with different node types > in the same order that the node types are declared in *nodes.h. > I'm not sufficiently anal to want to fix the existing violations > of that rule that I see in deparse.c; but the fact that somebody > got this wrong before isn't license to make things worse. > > regards, tom lane Fixed this. Thanks for review. -- Best regards, Alexander Pyhalov, Postgres Professional
Attachment
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: > [ 0001-Allow-pushing-CASE-expression-to-foreign-server-v6.patch ] This doesn't compile cleanly: deparse.c: In function 'foreign_expr_walker.isra.4': deparse.c:920:8: warning: 'collation' may be used uninitialized in this function [-Wmaybe-uninitialized] if (collation != outer_cxt->collation) ^ deparse.c:914:3: warning: 'state' may be used uninitialized in this function [-Wmaybe-uninitialized] switch (state) ^~~~~~ These uninitialized variables very likely explain the fact that it fails regression tests, both for me and for the cfbot. Even if this weren't an outright bug, we don't tolerate code that produces warnings on common compilers. regards, tom lane
Tom Lane писал 2021-07-26 18:18: > Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: >> [ 0001-Allow-pushing-CASE-expression-to-foreign-server-v6.patch ] > > This doesn't compile cleanly: > > deparse.c: In function 'foreign_expr_walker.isra.4': > deparse.c:920:8: warning: 'collation' may be used uninitialized in > this function [-Wmaybe-uninitialized] > if (collation != outer_cxt->collation) > ^ > deparse.c:914:3: warning: 'state' may be used uninitialized in this > function [-Wmaybe-uninitialized] > switch (state) > ^~~~~~ > > These uninitialized variables very likely explain the fact that it > fails > regression tests, both for me and for the cfbot. Even if this weren't > an > outright bug, we don't tolerate code that produces warnings on common > compilers. > > regards, tom lane Hi. Of course, this is a patch issue. Don't understand how I overlooked this. Rebased on master and fixed it. Tests are passing here (but they also passed for previous patch version). What exact tests are failing? -- Best regards, Alexander Pyhalov, Postgres Professional
Attachment
Le 26/07/2021 à 18:03, Alexander Pyhalov a écrit : > Tom Lane писал 2021-07-26 18:18: >> Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: >>> [ 0001-Allow-pushing-CASE-expression-to-foreign-server-v6.patch ] >> >> This doesn't compile cleanly: >> >> deparse.c: In function 'foreign_expr_walker.isra.4': >> deparse.c:920:8: warning: 'collation' may be used uninitialized in >> this function [-Wmaybe-uninitialized] >> if (collation != outer_cxt->collation) >> ^ >> deparse.c:914:3: warning: 'state' may be used uninitialized in this >> function [-Wmaybe-uninitialized] >> switch (state) >> ^~~~~~ >> >> These uninitialized variables very likely explain the fact that it fails >> regression tests, both for me and for the cfbot. Even if this >> weren't an >> outright bug, we don't tolerate code that produces warnings on common >> compilers. >> >> regards, tom lane > > Hi. > > Of course, this is a patch issue. Don't understand how I overlooked this. > Rebased on master and fixed it. Tests are passing here (but they also > passed for previous patch version). > > What exact tests are failing? > I confirm that there is no compilation warning and all regression tests pass successfully for the v7 patch, I have not checked previous patch but this one doesn't fail on cfbot too. Best regards, -- Gilles Darold
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: > [ 0001-Allow-pushing-CASE-expression-to-foreign-server-v7.patch ] I looked this over. It's better than before, but the collation handling is still not at all correct. We have to consider that a CASE's arg expression supplies the collation for a contained CaseTestExpr, otherwise we'll come to the wrong conclusions about whether "CASE foreignvar WHEN ..." is shippable, if the foreignvar is what's determining collation of the comparisons. This means that the CaseExpr level of recursion has to pass data down to the CaseTestExpr level. In the attached, I did that by adding an additional argument to foreign_expr_walker(). That's a bit invasive, but it's not awful. I thought about instead adding fields to the foreign_loc_cxt struct. But that seemed considerably messier in the end, because we'd then have some fields that are information sourced at one recursion level and some that are info sourced at another level. I also whacked the regression test cases around a lot. They seemed to spend a lot of time on irrelevant combinations, while failing to check the things that matter, namely whether collation-based pushdown decisions are made correctly. regards, tom lane diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index 31919fda8c..d0cbfa8ad9 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -116,7 +116,8 @@ typedef struct deparse_expr_cxt */ static bool foreign_expr_walker(Node *node, foreign_glob_cxt *glob_cxt, - foreign_loc_cxt *outer_cxt); + foreign_loc_cxt *outer_cxt, + foreign_loc_cxt *case_arg_cxt); static char *deparse_type_name(Oid type_oid, int32 typemod); /* @@ -158,6 +159,7 @@ static void deparseScalarArrayOpExpr(ScalarArrayOpExpr *node, static void deparseRelabelType(RelabelType *node, deparse_expr_cxt *context); static void deparseBoolExpr(BoolExpr *node, deparse_expr_cxt *context); static void deparseNullTest(NullTest *node, deparse_expr_cxt *context); +static void deparseCaseExpr(CaseExpr *node, deparse_expr_cxt *context); static void deparseArrayExpr(ArrayExpr *node, deparse_expr_cxt *context); static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); @@ -254,7 +256,7 @@ is_foreign_expr(PlannerInfo *root, glob_cxt.relids = baserel->relids; loc_cxt.collation = InvalidOid; loc_cxt.state = FDW_COLLATE_NONE; - if (!foreign_expr_walker((Node *) expr, &glob_cxt, &loc_cxt)) + if (!foreign_expr_walker((Node *) expr, &glob_cxt, &loc_cxt, NULL)) return false; /* @@ -283,6 +285,10 @@ is_foreign_expr(PlannerInfo *root, * * In addition, *outer_cxt is updated with collation information. * + * case_arg_cxt is NULL if this subexpression is not inside a CASE-with-arg. + * Otherwise, it points to the collation info derived from the arg expression, + * which must be consulted by any CaseTestExpr. + * * We must check that the expression contains only node types we can deparse, * that all types/functions/operators are safe to send (they are "shippable"), * and that all collations used in the expression derive from Vars of the @@ -294,7 +300,8 @@ is_foreign_expr(PlannerInfo *root, static bool foreign_expr_walker(Node *node, foreign_glob_cxt *glob_cxt, - foreign_loc_cxt *outer_cxt) + foreign_loc_cxt *outer_cxt, + foreign_loc_cxt *case_arg_cxt) { bool check_type = true; PgFdwRelationInfo *fpinfo; @@ -432,17 +439,17 @@ foreign_expr_walker(Node *node, * result, so do those first and reset inner_cxt afterwards. */ if (!foreign_expr_walker((Node *) sr->refupperindexpr, - glob_cxt, &inner_cxt)) + glob_cxt, &inner_cxt, case_arg_cxt)) return false; inner_cxt.collation = InvalidOid; inner_cxt.state = FDW_COLLATE_NONE; if (!foreign_expr_walker((Node *) sr->reflowerindexpr, - glob_cxt, &inner_cxt)) + glob_cxt, &inner_cxt, case_arg_cxt)) return false; inner_cxt.collation = InvalidOid; inner_cxt.state = FDW_COLLATE_NONE; if (!foreign_expr_walker((Node *) sr->refexpr, - glob_cxt, &inner_cxt)) + glob_cxt, &inner_cxt, case_arg_cxt)) return false; /* @@ -478,7 +485,7 @@ foreign_expr_walker(Node *node, * Recurse to input subexpressions. */ if (!foreign_expr_walker((Node *) fe->args, - glob_cxt, &inner_cxt)) + glob_cxt, &inner_cxt, case_arg_cxt)) return false; /* @@ -526,7 +533,7 @@ foreign_expr_walker(Node *node, * Recurse to input subexpressions. */ if (!foreign_expr_walker((Node *) oe->args, - glob_cxt, &inner_cxt)) + glob_cxt, &inner_cxt, case_arg_cxt)) return false; /* @@ -566,7 +573,7 @@ foreign_expr_walker(Node *node, * Recurse to input subexpressions. */ if (!foreign_expr_walker((Node *) oe->args, - glob_cxt, &inner_cxt)) + glob_cxt, &inner_cxt, case_arg_cxt)) return false; /* @@ -592,7 +599,7 @@ foreign_expr_walker(Node *node, * Recurse to input subexpression. */ if (!foreign_expr_walker((Node *) r->arg, - glob_cxt, &inner_cxt)) + glob_cxt, &inner_cxt, case_arg_cxt)) return false; /* @@ -619,7 +626,7 @@ foreign_expr_walker(Node *node, * Recurse to input subexpressions. */ if (!foreign_expr_walker((Node *) b->args, - glob_cxt, &inner_cxt)) + glob_cxt, &inner_cxt, case_arg_cxt)) return false; /* Output is always boolean and so noncollatable. */ @@ -635,7 +642,7 @@ foreign_expr_walker(Node *node, * Recurse to input subexpressions. */ if (!foreign_expr_walker((Node *) nt->arg, - glob_cxt, &inner_cxt)) + glob_cxt, &inner_cxt, case_arg_cxt)) return false; /* Output is always boolean and so noncollatable. */ @@ -643,6 +650,124 @@ foreign_expr_walker(Node *node, state = FDW_COLLATE_NONE; } break; + case T_CaseExpr: + { + CaseExpr *ce = (CaseExpr *) node; + foreign_loc_cxt arg_cxt; + foreign_loc_cxt tmp_cxt; + ListCell *lc; + + /* + * Recurse to CASE's arg expression, if any. Its collation + * has to be saved aside for use while examining CaseTestExprs + * within the WHEN expressions. + */ + arg_cxt.collation = InvalidOid; + arg_cxt.state = FDW_COLLATE_NONE; + if (ce->arg) + { + if (!foreign_expr_walker((Node *) ce->arg, + glob_cxt, &arg_cxt, case_arg_cxt)) + return false; + } + + /* Examine the CaseWhen subexpressions. */ + foreach(lc, ce->args) + { + CaseWhen *cw = lfirst_node(CaseWhen, lc); + Node *whenExpr = (Node *) cw->expr; + + if (ce->arg) + { + /* + * In a CASE-with-arg, the parser should have produced + * WHEN clauses of the form "CaseTestExpr = RHS", + * possibly with an implicit coercion inserted above + * the CaseTestExpr. However in an expression that's + * been through the optimizer, the WHEN clause could + * be almost anything (since the equality operator + * could have been expanded into an inline function). + * In such cases forbid pushdown, because + * deparseCaseExpr can't handle it. + */ + List *whenExprArgs; + + if (!IsA(whenExpr, OpExpr)) + return false; + + whenExprArgs = ((OpExpr *) whenExpr)->args; + if ((list_length(whenExprArgs) != 2) || + !IsA(strip_implicit_coercions(linitial(whenExprArgs)), CaseTestExpr)) + return false; + } + + /* + * Recurse to WHEN expression, passing down the arg info. + * Its collation doesn't affect the result (really, it + * should be boolean and thus not have a collation). + */ + tmp_cxt.collation = InvalidOid; + tmp_cxt.state = FDW_COLLATE_NONE; + if (!foreign_expr_walker((Node *) cw->expr, + glob_cxt, &tmp_cxt, &arg_cxt)) + return false; + + /* Recurse to THEN expression. */ + if (!foreign_expr_walker((Node *) cw->result, + glob_cxt, &inner_cxt, case_arg_cxt)) + return false; + } + + /* Recurse to ELSE expression. */ + if (!foreign_expr_walker((Node *) ce->defresult, + glob_cxt, &inner_cxt, case_arg_cxt)) + return false; + + /* + * Detect whether node is introducing a collation not derived + * from a foreign Var. (If so, we just mark it unsafe for now + * rather than immediately returning false, since the parent + * node might not care.) This is the same as for function + * nodes, except that the input collation is derived from only + * the THEN and ELSE subexpressions. + */ + collation = ce->casecollid; + if (collation == InvalidOid) + state = FDW_COLLATE_NONE; + else if (inner_cxt.state == FDW_COLLATE_SAFE && + collation == inner_cxt.collation) + state = FDW_COLLATE_SAFE; + else if (collation == DEFAULT_COLLATION_OID) + state = FDW_COLLATE_NONE; + else + state = FDW_COLLATE_UNSAFE; + } + break; + case T_CaseTestExpr: + { + CaseTestExpr *c = (CaseTestExpr *) node; + + /* Punt if we seem not to be inside a CASE arg WHEN. */ + if (!case_arg_cxt) + return false; + + /* + * Otherwise, any nondefault collation attached to the + * CaseTestExpr node must be derived from foreign Var(s) in + * the CASE arg. + */ + collation = c->collation; + if (collation == InvalidOid) + state = FDW_COLLATE_NONE; + else if (case_arg_cxt->state == FDW_COLLATE_SAFE && + collation == case_arg_cxt->collation) + state = FDW_COLLATE_SAFE; + else if (collation == DEFAULT_COLLATION_OID) + state = FDW_COLLATE_NONE; + else + state = FDW_COLLATE_UNSAFE; + } + break; case T_ArrayExpr: { ArrayExpr *a = (ArrayExpr *) node; @@ -651,7 +776,7 @@ foreign_expr_walker(Node *node, * Recurse to input subexpressions. */ if (!foreign_expr_walker((Node *) a->elements, - glob_cxt, &inner_cxt)) + glob_cxt, &inner_cxt, case_arg_cxt)) return false; /* @@ -681,7 +806,7 @@ foreign_expr_walker(Node *node, foreach(lc, l) { if (!foreign_expr_walker((Node *) lfirst(lc), - glob_cxt, &inner_cxt)) + glob_cxt, &inner_cxt, case_arg_cxt)) return false; } @@ -730,7 +855,8 @@ foreign_expr_walker(Node *node, n = (Node *) tle->expr; } - if (!foreign_expr_walker(n, glob_cxt, &inner_cxt)) + if (!foreign_expr_walker(n, + glob_cxt, &inner_cxt, case_arg_cxt)) return false; } @@ -765,7 +891,7 @@ foreign_expr_walker(Node *node, /* Check aggregate filter */ if (!foreign_expr_walker((Node *) agg->aggfilter, - glob_cxt, &inner_cxt)) + glob_cxt, &inner_cxt, case_arg_cxt)) return false; /* @@ -2456,6 +2582,9 @@ deparseExpr(Expr *node, deparse_expr_cxt *context) case T_NullTest: deparseNullTest((NullTest *) node, context); break; + case T_CaseExpr: + deparseCaseExpr((CaseExpr *) node, context); + break; case T_ArrayExpr: deparseArrayExpr((ArrayExpr *) node, context); break; @@ -3007,6 +3136,56 @@ deparseNullTest(NullTest *node, deparse_expr_cxt *context) } } +/* + * Deparse CASE expression + */ +static void +deparseCaseExpr(CaseExpr *node, deparse_expr_cxt *context) +{ + StringInfo buf = context->buf; + ListCell *lc; + + appendStringInfoString(buf, "(CASE"); + + /* If this is a CASE arg WHEN then emit the arg expression */ + if (node->arg != NULL) + { + appendStringInfoChar(buf, ' '); + deparseExpr(node->arg, context); + } + + /* Add each condition/result of the CASE clause */ + foreach(lc, node->args) + { + CaseWhen *whenclause = (CaseWhen *) lfirst(lc); + + /* WHEN */ + appendStringInfoString(buf, " WHEN "); + if (node->arg == NULL) /* CASE WHEN */ + deparseExpr(whenclause->expr, context); + else /* CASE arg WHEN */ + { + /* Ignore the CaseTestExpr and equality operator. */ + deparseExpr(lsecond(castNode(OpExpr, whenclause->expr)->args), + context); + } + + /* THEN */ + appendStringInfoString(buf, " THEN "); + deparseExpr(whenclause->result, context); + } + + /* add ELSE if present */ + if (node->defresult != NULL) + { + appendStringInfoString(buf, " ELSE "); + deparseExpr(node->defresult, context); + } + + /* append END */ + appendStringInfoString(buf, " END)"); +} + /* * Deparse ARRAY[...] construct. */ diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index ed25e7a743..d01b91efab 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -1067,6 +1067,96 @@ SELECT * FROM ft1 t1 WHERE t1.c1 === t1.c2 order by t1.c2 limit 1; 1 | 1 | 00001 | Fri Jan 02 00:00:00 1970 PST | Fri Jan 02 00:00:00 1970 | 1 | 1 | foo (1 row) +-- Test CASE pushdown +EXPLAIN (VERBOSE, COSTS OFF) +SELECT c1,c2,c3 FROM ft2 WHERE CASE WHEN c1 > 990 THEN c1 END < 1000 ORDER BY c1; + QUERY PLAN +---------------------------------------------------------------------------------------------------------------------------------------------------------------- + Foreign Scan on public.ft2 + Output: c1, c2, c3 + Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1" WHERE (((CASE WHEN ("C 1" > 990) THEN "C 1" ELSE NULL::integer END)< 1000)) ORDER BY "C 1" ASC NULLS LAST +(3 rows) + +SELECT c1,c2,c3 FROM ft2 WHERE CASE WHEN c1 > 990 THEN c1 END < 1000 ORDER BY c1; + c1 | c2 | c3 +-----+----+------- + 991 | 1 | 00991 + 992 | 2 | 00992 + 993 | 3 | 00993 + 994 | 4 | 00994 + 995 | 5 | 00995 + 996 | 6 | 00996 + 997 | 7 | 00997 + 998 | 8 | 00998 + 999 | 9 | 00999 +(9 rows) + +-- Nested CASE +EXPLAIN (VERBOSE, COSTS OFF) +SELECT c1,c2,c3 FROM ft2 WHERE CASE CASE WHEN c2 > 0 THEN c2 END WHEN 100 THEN 601 WHEN c2 THEN c2 ELSE 0 END > 600 ORDERBY c1; + QUERY PLAN +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + Foreign Scan on public.ft2 + Output: c1, c2, c3 + Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1" WHERE (((CASE (CASE WHEN (c2 > 0) THEN c2 ELSE NULL::integer END) WHEN100 THEN 601 WHEN c2 THEN c2 ELSE 0 END) > 600)) ORDER BY "C 1" ASC NULLS LAST +(3 rows) + +SELECT c1,c2,c3 FROM ft2 WHERE CASE CASE WHEN c2 > 0 THEN c2 END WHEN 100 THEN 601 WHEN c2 THEN c2 ELSE 0 END > 600 ORDERBY c1; + c1 | c2 | c3 +----+----+---- +(0 rows) + +-- CASE arg WHEN +EXPLAIN (VERBOSE, COSTS OFF) +SELECT * FROM ft1 WHERE c1 > (CASE mod(c1, 4) WHEN 0 THEN 1 WHEN 2 THEN 50 ELSE 100 END); + QUERY PLAN +---------------------------------------------------------------------------------------------------------------------------------------------------------- + Foreign Scan on public.ft1 + Output: c1, c2, c3, c4, c5, c6, c7, c8 + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" > (CASE mod("C 1", 4) WHEN 0 THEN1 WHEN 2 THEN 50 ELSE 100 END))) +(3 rows) + +-- CASE cannot be pushed down because of unshippable arg clause +EXPLAIN (VERBOSE, COSTS OFF) +SELECT * FROM ft1 WHERE c1 > (CASE random()::integer WHEN 0 THEN 1 WHEN 2 THEN 50 ELSE 100 END); + QUERY PLAN +----------------------------------------------------------------------------------------- + Foreign Scan on public.ft1 + Output: c1, c2, c3, c4, c5, c6, c7, c8 + Filter: (ft1.c1 > CASE (random())::integer WHEN 0 THEN 1 WHEN 2 THEN 50 ELSE 100 END) + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" +(4 rows) + +-- these are shippable +EXPLAIN (VERBOSE, COSTS OFF) +SELECT * FROM ft1 WHERE CASE c6 WHEN 'foo' THEN true ELSE c3 < 'bar' END; + QUERY PLAN +-------------------------------------------------------------------------------------------------------------------------------------------------- + Foreign Scan on public.ft1 + Output: c1, c2, c3, c4, c5, c6, c7, c8 + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((CASE c6 WHEN 'foo'::text THEN true ELSE(c3 < 'bar'::text) END)) +(3 rows) + +EXPLAIN (VERBOSE, COSTS OFF) +SELECT * FROM ft1 WHERE CASE c3 WHEN c6 THEN true ELSE c3 < 'bar' END; + QUERY PLAN +----------------------------------------------------------------------------------------------------------------------------------------- + Foreign Scan on public.ft1 + Output: c1, c2, c3, c4, c5, c6, c7, c8 + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((CASE c3 WHEN c6 THEN true ELSE (c3 < 'bar'::text)END)) +(3 rows) + +-- but this is not because of collation +EXPLAIN (VERBOSE, COSTS OFF) +SELECT * FROM ft1 WHERE CASE c3 COLLATE "C" WHEN c6 THEN true ELSE c3 < 'bar' END; + QUERY PLAN +------------------------------------------------------------------------------------- + Foreign Scan on public.ft1 + Output: c1, c2, c3, c4, c5, c6, c7, c8 + Filter: CASE (ft1.c3)::text WHEN ft1.c6 THEN true ELSE (ft1.c3 < 'bar'::text) END + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" +(4 rows) + -- =================================================================== -- JOIN queries -- =================================================================== diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 02a6b15a13..af7ad15c8f 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -408,6 +408,35 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 === t1.c2 order by t1.c2 limit 1; SELECT * FROM ft1 t1 WHERE t1.c1 === t1.c2 order by t1.c2 limit 1; +-- Test CASE pushdown +EXPLAIN (VERBOSE, COSTS OFF) +SELECT c1,c2,c3 FROM ft2 WHERE CASE WHEN c1 > 990 THEN c1 END < 1000 ORDER BY c1; +SELECT c1,c2,c3 FROM ft2 WHERE CASE WHEN c1 > 990 THEN c1 END < 1000 ORDER BY c1; + +-- Nested CASE +EXPLAIN (VERBOSE, COSTS OFF) +SELECT c1,c2,c3 FROM ft2 WHERE CASE CASE WHEN c2 > 0 THEN c2 END WHEN 100 THEN 601 WHEN c2 THEN c2 ELSE 0 END > 600 ORDERBY c1; + +SELECT c1,c2,c3 FROM ft2 WHERE CASE CASE WHEN c2 > 0 THEN c2 END WHEN 100 THEN 601 WHEN c2 THEN c2 ELSE 0 END > 600 ORDERBY c1; + +-- CASE arg WHEN +EXPLAIN (VERBOSE, COSTS OFF) +SELECT * FROM ft1 WHERE c1 > (CASE mod(c1, 4) WHEN 0 THEN 1 WHEN 2 THEN 50 ELSE 100 END); + +-- CASE cannot be pushed down because of unshippable arg clause +EXPLAIN (VERBOSE, COSTS OFF) +SELECT * FROM ft1 WHERE c1 > (CASE random()::integer WHEN 0 THEN 1 WHEN 2 THEN 50 ELSE 100 END); + +-- these are shippable +EXPLAIN (VERBOSE, COSTS OFF) +SELECT * FROM ft1 WHERE CASE c6 WHEN 'foo' THEN true ELSE c3 < 'bar' END; +EXPLAIN (VERBOSE, COSTS OFF) +SELECT * FROM ft1 WHERE CASE c3 WHEN c6 THEN true ELSE c3 < 'bar' END; + +-- but this is not because of collation +EXPLAIN (VERBOSE, COSTS OFF) +SELECT * FROM ft1 WHERE CASE c3 COLLATE "C" WHEN c6 THEN true ELSE c3 < 'bar' END; + -- =================================================================== -- JOIN queries -- ===================================================================
Tom Lane писал 2021-07-29 23:54: > Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: >> [ 0001-Allow-pushing-CASE-expression-to-foreign-server-v7.patch ] > > I looked this over. It's better than before, but the collation > handling is still not at all correct. We have to consider that a > CASE's arg expression supplies the collation for a contained > CaseTestExpr, otherwise we'll come to the wrong conclusions about > whether "CASE foreignvar WHEN ..." is shippable, if the foreignvar > is what's determining collation of the comparisons. > > This means that the CaseExpr level of recursion has to pass data down > to the CaseTestExpr level. In the attached, I did that by adding an > additional argument to foreign_expr_walker(). That's a bit invasive, > but it's not awful. I thought about instead adding fields to the > foreign_loc_cxt struct. But that seemed considerably messier in the > end, because we'd then have some fields that are information sourced > at one recursion level and some that are info sourced at another > level. > > I also whacked the regression test cases around a lot. They seemed > to spend a lot of time on irrelevant combinations, while failing to > check the things that matter, namely whether collation-based pushdown > decisions are made correctly. > > regards, tom lane Hi. Overall looks good. The only thing I'm confused about is in T_CaseTestExpr case - how can it be that CaseTestExpr collation doesn't match case_arg_cxt->collation ? Do we we need to inspect only case_arg_cxt->state? Can we assert that collation == case_arg_cxt->collation? -- Best regards, Alexander Pyhalov, Postgres Professional
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: > The only thing I'm confused about is in T_CaseTestExpr case - how can it > be that CaseTestExpr collation doesn't match case_arg_cxt->collation ? > Do we we need to inspect only case_arg_cxt->state? Can we assert that > collation == case_arg_cxt->collation? Perhaps, but: (1) I'm disinclined to make this code look different from the otherwise- identical coding elsewhere in foreign_expr_walker. (2) That would create a hard assumption that foreign_expr_walker's conclusions about the collation of a subexpression match those of assign_query_collations. I'm not quite sure I believe that (and if it's true, why aren't we just relying on exprCollation?). Anyway, if we're to have an assertion that it's true, it should be in some place that's a lot less out-of-the-way than CaseTestExpr, because if the assumption gets violated it might be a long time till we notice. So I think we're best off to just write it the way I did, at least so far as this patch is concerned. If we want to rethink the way collation gets calculated here, that would be material for a separate patch. regards, tom lane
I wrote: > Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: >> Do we we need to inspect only case_arg_cxt->state? Can we assert that >> collation == case_arg_cxt->collation? > Perhaps, but: > ... Oh, actually there's a third point: the shakiest part of this logic is the assumption that we've correctly matched a CaseTestExpr to its source CaseExpr. Seeing that inlining and constant-folding can mash things to the point where a WHEN expression doesn't look like "CaseTestExpr = RHS", it's a little nervous-making to assume there couldn't be another CASE in between. While there's not known problems of this sort, if it did happen I'd prefer this code to react as "don't push down", not as "assertion failure". (There's been speculation in the past about whether we could find a more bulletproof representation of this kind of CaseExpr. We've not succeeded at that yet though.) regards, tom lane