Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables. - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables. |
Date | |
Msg-id | 295441.1724964783@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables. (Richard Guo <guofenglinux@gmail.com>) |
Responses |
Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
|
List | pgsql-hackers |
Richard Guo <guofenglinux@gmail.com> writes: > On Thu, Aug 29, 2024 at 4:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> In the meantime, I think this test case is mighty artificial, >> and it wouldn't bother me any to just take it out again for the >> time being. > Yeah, I think we can remove the 't1.two+t2.two' test case if we go > with your proposed patch to address this performance regression. Here's a polished-up patchset for that. I made the memoize test removal a separate patch because (a) it only applies to master and (b) it seems worth calling out as something we might be able to revert later. I found one bug in the draft patch: add_nulling_relids only processes Vars of level zero, so we have to apply it before not after adjusting the Vars' levelsup. An alternative could be to add a levelsup parameter to add_nulling_relids, but that seemed like unnecessary complication. regards, tom lane From 8fafdc4852b8f2164286e6863219eb6b4d267639 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Thu, 29 Aug 2024 16:25:23 -0400 Subject: [PATCH v1 1/2] Remove one memoize test case added by commit 069d0ff02. This test case turns out to depend on the assumption that a non-Var subquery output that's underneath an outer join will always get wrapped in a PlaceHolderVar. But that behavior causes performance regressions in some cases compared to what happened before v16. The next commit will avoid inserting a PHV in the same cases where pre-v16 did, and that causes get_memoized_path to not detect that a memoize plan could be used. Commit this separately, in hopes that we can restore the test after making get_memoized_path smarter. (It's failing to find memoize plans in adjacent cases where no PHV was ever inserted, so there is definitely room for improvement there.) Discussion: https://postgr.es/m/CAG1ps1xvnTZceKK24OUfMKLPvDP2vjT-d+F2AOCWbw_v3KeEgg@mail.gmail.com --- src/test/regress/expected/memoize.out | 30 --------------------------- src/test/regress/sql/memoize.sql | 11 ---------- 2 files changed, 41 deletions(-) diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out index df2ca5ba4e..9ee09fe2f5 100644 --- a/src/test/regress/expected/memoize.out +++ b/src/test/regress/expected/memoize.out @@ -160,36 +160,6 @@ WHERE s.c1 = s.c2 AND t1.unique1 < 1000; 1000 | 9.5000000000000000 (1 row) --- Try with LATERAL references within PlaceHolderVars -SELECT explain_memoize(' -SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN -LATERAL (SELECT t1.two+t2.two AS c1, t2.unique1 AS c2 FROM tenk1 t2) s ON TRUE -WHERE s.c1 = s.c2 AND t1.unique1 < 1000;', false); - explain_memoize --------------------------------------------------------------------------------------- - Aggregate (actual rows=1 loops=N) - -> Nested Loop (actual rows=1000 loops=N) - -> Seq Scan on tenk1 t1 (actual rows=1000 loops=N) - Filter: (unique1 < 1000) - Rows Removed by Filter: 9000 - -> Memoize (actual rows=1 loops=N) - Cache Key: t1.two - Cache Mode: binary - Hits: 998 Misses: 2 Evictions: Zero Overflows: 0 Memory Usage: NkB - -> Seq Scan on tenk1 t2 (actual rows=1 loops=N) - Filter: ((t1.two + two) = unique1) - Rows Removed by Filter: 9999 -(12 rows) - --- And check we get the expected results. -SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN -LATERAL (SELECT t1.two+t2.two AS c1, t2.unique1 AS c2 FROM tenk1 t2) s ON TRUE -WHERE s.c1 = s.c2 AND t1.unique1 < 1000; - count | avg --------+-------------------- - 1000 | 9.0000000000000000 -(1 row) - -- Ensure we do not omit the cache keys from PlaceHolderVars SELECT explain_memoize(' SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql index 059bec5f4f..2eaeb1477a 100644 --- a/src/test/regress/sql/memoize.sql +++ b/src/test/regress/sql/memoize.sql @@ -85,17 +85,6 @@ SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN LATERAL (SELECT t1.two+1 AS c1, t2.unique1 AS c2 FROM tenk1 t2) s ON TRUE WHERE s.c1 = s.c2 AND t1.unique1 < 1000; --- Try with LATERAL references within PlaceHolderVars -SELECT explain_memoize(' -SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN -LATERAL (SELECT t1.two+t2.two AS c1, t2.unique1 AS c2 FROM tenk1 t2) s ON TRUE -WHERE s.c1 = s.c2 AND t1.unique1 < 1000;', false); - --- And check we get the expected results. -SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN -LATERAL (SELECT t1.two+t2.two AS c1, t2.unique1 AS c2 FROM tenk1 t2) s ON TRUE -WHERE s.c1 = s.c2 AND t1.unique1 < 1000; - -- Ensure we do not omit the cache keys from PlaceHolderVars SELECT explain_memoize(' SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN -- 2.43.5 From 7b48394838797489e7ab869f97ca06449fdbcee3 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Thu, 29 Aug 2024 16:43:35 -0400 Subject: [PATCH v1 2/2] Avoid inserting PlaceHolderVars in cases where pre-v16 PG did not. Commit 2489d76c4 removed some logic from pullup_replace_vars() that avoided wrapping a PlaceHolderVar around a pulled-up subquery output expression if the expression could be proven to go to NULL anyway (because it contained Vars or PHVs of the pulled-up relation and did not contain non-strict constructs). But removing that logic turns out to cause performance regressions in some cases, because the extra PHV prevents subexpression folding, and will do so even if outer-join reduction later turns it into a no-op with no phnullingrels bits. The reason for always adding a PHV was to ensure we had someplace to put the varnullingrels marker bits of the Var being replaced. However, it turns out we can optimize in exactly the same cases that the previous code did, because we can attach the needed varnullingrels bits to the contained Var(s)/PHV(s). This is not a complete solution --- it would be even better if we could remove PHVs after reducing them to no-ops. It doesn't look practical to back-patch such an improvement, but this change seems safe and at least gets rid of the performance-regression cases. Per complaint from Nikhil Raj. Back-patch to v16 where the problem appeared. Discussion: https://postgr.es/m/CAG1ps1xvnTZceKK24OUfMKLPvDP2vjT-d+F2AOCWbw_v3KeEgg@mail.gmail.com --- src/backend/optimizer/prep/prepjointree.c | 66 ++++++++++++++++++----- src/backend/rewrite/rewriteManip.c | 9 ++-- src/test/regress/expected/subselect.out | 29 ++++++++++ src/test/regress/sql/subselect.sql | 18 +++++++ 4 files changed, 107 insertions(+), 15 deletions(-) diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index 969e257f70..34fbf8ee23 100644 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -2490,14 +2490,48 @@ pullup_replace_vars_callback(Var *var, else wrap = false; } + else if (rcon->wrap_non_vars) + { + /* Caller told us to wrap all non-Vars in a PlaceHolderVar */ + wrap = true; + } else { /* - * Must wrap, either because we need a place to insert - * varnullingrels or because caller told us to wrap - * everything. + * If the node contains Var(s) or PlaceHolderVar(s) of the + * subquery being pulled up, and does not contain any + * non-strict constructs, then instead of adding a PHV on top + * we can add the required nullingrels to those Vars/PHVs. + * (This is fundamentally a generalization of the above cases + * for bare Vars and PHVs.) + * + * This test is somewhat expensive, but it avoids pessimizing + * the plan in cases where the nullingrels get removed again + * later by outer join reduction. + * + * This analysis could be tighter: in particular, a non-strict + * construct hidden within a lower-level PlaceHolderVar is not + * reason to add another PHV. But for now it doesn't seem + * worth the code to be more exact. + * + * For a LATERAL subquery, we have to check the actual var + * membership of the node, but if it's non-lateral then any + * level-zero var must belong to the subquery. */ - wrap = true; + if ((rcon->target_rte->lateral ? + bms_overlap(pull_varnos(rcon->root, newnode), + rcon->relids) : + contain_vars_of_level(newnode, 0)) && + !contain_nonstrict_functions(newnode)) + { + /* No wrap needed */ + wrap = false; + } + else + { + /* Else wrap it in a PlaceHolderVar */ + wrap = true; + } } if (wrap) @@ -2518,18 +2552,14 @@ pullup_replace_vars_callback(Var *var, } } - /* Must adjust varlevelsup if replaced Var is within a subquery */ - if (var->varlevelsup > 0) - IncrementVarSublevelsUp(newnode, var->varlevelsup, 0); - - /* Propagate any varnullingrels into the replacement Var or PHV */ + /* Propagate any varnullingrels into the replacement expression */ if (var->varnullingrels != NULL) { if (IsA(newnode, Var)) { Var *newvar = (Var *) newnode; - Assert(newvar->varlevelsup == var->varlevelsup); + Assert(newvar->varlevelsup == 0); newvar->varnullingrels = bms_add_members(newvar->varnullingrels, var->varnullingrels); } @@ -2537,14 +2567,26 @@ pullup_replace_vars_callback(Var *var, { PlaceHolderVar *newphv = (PlaceHolderVar *) newnode; - Assert(newphv->phlevelsup == var->varlevelsup); + Assert(newphv->phlevelsup == 0); newphv->phnullingrels = bms_add_members(newphv->phnullingrels, var->varnullingrels); } else - elog(ERROR, "failed to wrap a non-Var"); + { + /* There should be lower-level Vars/PHVs we can modify */ + newnode = add_nulling_relids(newnode, + NULL, /* modify all Vars/PHVs */ + var->varnullingrels); + /* Assert we did put the varnullingrels into the expression */ + Assert(bms_is_subset(var->varnullingrels, + pull_varnos(rcon->root, newnode))); + } } + /* Must adjust varlevelsup if replaced Var is within a subquery */ + if (var->varlevelsup > 0) + IncrementVarSublevelsUp(newnode, var->varlevelsup, 0); + return newnode; } diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c index 191f2dc0b1..b20625fbd2 100644 --- a/src/backend/rewrite/rewriteManip.c +++ b/src/backend/rewrite/rewriteManip.c @@ -1141,7 +1141,8 @@ AddInvertedQual(Query *parsetree, Node *qual) /* * add_nulling_relids() finds Vars and PlaceHolderVars that belong to any * of the target_relids, and adds added_relids to their varnullingrels - * and phnullingrels fields. + * and phnullingrels fields. If target_relids is NULL, all level-zero + * Vars and PHVs are modified. */ Node * add_nulling_relids(Node *node, @@ -1170,7 +1171,8 @@ add_nulling_relids_mutator(Node *node, Var *var = (Var *) node; if (var->varlevelsup == context->sublevels_up && - bms_is_member(var->varno, context->target_relids)) + (context->target_relids == NULL || + bms_is_member(var->varno, context->target_relids))) { Relids newnullingrels = bms_union(var->varnullingrels, context->added_relids); @@ -1188,7 +1190,8 @@ add_nulling_relids_mutator(Node *node, PlaceHolderVar *phv = (PlaceHolderVar *) node; if (phv->phlevelsup == context->sublevels_up && - bms_overlap(phv->phrels, context->target_relids)) + (context->target_relids == NULL || + bms_overlap(phv->phrels, context->target_relids))) { Relids newnullingrels = bms_union(phv->phnullingrels, context->added_relids); diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out index 9eecdc1e92..2d35de3fad 100644 --- a/src/test/regress/expected/subselect.out +++ b/src/test/regress/expected/subselect.out @@ -1721,6 +1721,35 @@ fetch backward all in c1; (2 rows) commit; +-- +-- Verify that we correctly flatten cases involving a subquery output +-- expression that doesn't need to be wrapped in a PlaceHolderVar +-- +explain (costs off) +select tname, attname from ( +select relname::information_schema.sql_identifier as tname, * from + (select * from pg_class c) ss1) ss2 + right join pg_attribute a on a.attrelid = ss2.oid +where tname = 'tenk1' and attnum = 1; + QUERY PLAN +-------------------------------------------------------------------------- + Nested Loop + -> Index Scan using pg_class_relname_nsp_index on pg_class c + Index Cond: (relname = 'tenk1'::name) + -> Index Scan using pg_attribute_relid_attnum_index on pg_attribute a + Index Cond: ((attrelid = c.oid) AND (attnum = 1)) +(5 rows) + +select tname, attname from ( +select relname::information_schema.sql_identifier as tname, * from + (select * from pg_class c) ss1) ss2 + right join pg_attribute a on a.attrelid = ss2.oid +where tname = 'tenk1' and attnum = 1; + tname | attname +-------+--------- + tenk1 | unique1 +(1 row) + -- -- Tests for CTE inlining behavior -- diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql index 75a9b718b2..af6e157aca 100644 --- a/src/test/regress/sql/subselect.sql +++ b/src/test/regress/sql/subselect.sql @@ -890,6 +890,24 @@ fetch backward all in c1; commit; +-- +-- Verify that we correctly flatten cases involving a subquery output +-- expression that doesn't need to be wrapped in a PlaceHolderVar +-- + +explain (costs off) +select tname, attname from ( +select relname::information_schema.sql_identifier as tname, * from + (select * from pg_class c) ss1) ss2 + right join pg_attribute a on a.attrelid = ss2.oid +where tname = 'tenk1' and attnum = 1; + +select tname, attname from ( +select relname::information_schema.sql_identifier as tname, * from + (select * from pg_class c) ss1) ss2 + right join pg_attribute a on a.attrelid = ss2.oid +where tname = 'tenk1' and attnum = 1; + -- -- Tests for CTE inlining behavior -- -- 2.43.5
pgsql-hackers by date: