Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
Date
Msg-id 295441.1724964783@sss.pgh.pa.us
Whole thread Raw
In response to Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
List pgsql-hackers
Richard Guo <guofenglinux@gmail.com> writes:
> On Thu, Aug 29, 2024 at 4:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In the meantime, I think this test case is mighty artificial,
>> and it wouldn't bother me any to just take it out again for the
>> time being.

> Yeah, I think we can remove the 't1.two+t2.two' test case if we go
> with your proposed patch to address this performance regression.

Here's a polished-up patchset for that.  I made the memoize test
removal a separate patch because (a) it only applies to master
and (b) it seems worth calling out as something we might be able
to revert later.

I found one bug in the draft patch: add_nulling_relids only processes
Vars of level zero, so we have to apply it before not after adjusting
the Vars' levelsup.  An alternative could be to add a levelsup
parameter to add_nulling_relids, but that seemed like unnecessary
complication.

            regards, tom lane

From 8fafdc4852b8f2164286e6863219eb6b4d267639 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 29 Aug 2024 16:25:23 -0400
Subject: [PATCH v1 1/2] Remove one memoize test case added by commit
 069d0ff02.

This test case turns out to depend on the assumption that a non-Var
subquery output that's underneath an outer join will always get
wrapped in a PlaceHolderVar.  But that behavior causes performance
regressions in some cases compared to what happened before v16.
The next commit will avoid inserting a PHV in the same cases where
pre-v16 did, and that causes get_memoized_path to not detect that
a memoize plan could be used.

Commit this separately, in hopes that we can restore the test after
making get_memoized_path smarter.  (It's failing to find memoize
plans in adjacent cases where no PHV was ever inserted, so there
is definitely room for improvement there.)

Discussion: https://postgr.es/m/CAG1ps1xvnTZceKK24OUfMKLPvDP2vjT-d+F2AOCWbw_v3KeEgg@mail.gmail.com
---
 src/test/regress/expected/memoize.out | 30 ---------------------------
 src/test/regress/sql/memoize.sql      | 11 ----------
 2 files changed, 41 deletions(-)

diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index df2ca5ba4e..9ee09fe2f5 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -160,36 +160,6 @@ WHERE s.c1 = s.c2 AND t1.unique1 < 1000;
   1000 | 9.5000000000000000
 (1 row)

--- Try with LATERAL references within PlaceHolderVars
-SELECT explain_memoize('
-SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
-LATERAL (SELECT t1.two+t2.two AS c1, t2.unique1 AS c2 FROM tenk1 t2) s ON TRUE
-WHERE s.c1 = s.c2 AND t1.unique1 < 1000;', false);
-                                   explain_memoize
---------------------------------------------------------------------------------------
- Aggregate (actual rows=1 loops=N)
-   ->  Nested Loop (actual rows=1000 loops=N)
-         ->  Seq Scan on tenk1 t1 (actual rows=1000 loops=N)
-               Filter: (unique1 < 1000)
-               Rows Removed by Filter: 9000
-         ->  Memoize (actual rows=1 loops=N)
-               Cache Key: t1.two
-               Cache Mode: binary
-               Hits: 998  Misses: 2  Evictions: Zero  Overflows: 0  Memory Usage: NkB
-               ->  Seq Scan on tenk1 t2 (actual rows=1 loops=N)
-                     Filter: ((t1.two + two) = unique1)
-                     Rows Removed by Filter: 9999
-(12 rows)
-
--- And check we get the expected results.
-SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
-LATERAL (SELECT t1.two+t2.two AS c1, t2.unique1 AS c2 FROM tenk1 t2) s ON TRUE
-WHERE s.c1 = s.c2 AND t1.unique1 < 1000;
- count |        avg
--------+--------------------
-  1000 | 9.0000000000000000
-(1 row)
-
 -- Ensure we do not omit the cache keys from PlaceHolderVars
 SELECT explain_memoize('
 SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index 059bec5f4f..2eaeb1477a 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -85,17 +85,6 @@ SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
 LATERAL (SELECT t1.two+1 AS c1, t2.unique1 AS c2 FROM tenk1 t2) s ON TRUE
 WHERE s.c1 = s.c2 AND t1.unique1 < 1000;

--- Try with LATERAL references within PlaceHolderVars
-SELECT explain_memoize('
-SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
-LATERAL (SELECT t1.two+t2.two AS c1, t2.unique1 AS c2 FROM tenk1 t2) s ON TRUE
-WHERE s.c1 = s.c2 AND t1.unique1 < 1000;', false);
-
--- And check we get the expected results.
-SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
-LATERAL (SELECT t1.two+t2.two AS c1, t2.unique1 AS c2 FROM tenk1 t2) s ON TRUE
-WHERE s.c1 = s.c2 AND t1.unique1 < 1000;
-
 -- Ensure we do not omit the cache keys from PlaceHolderVars
 SELECT explain_memoize('
 SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
--
2.43.5

From 7b48394838797489e7ab869f97ca06449fdbcee3 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 29 Aug 2024 16:43:35 -0400
Subject: [PATCH v1 2/2] Avoid inserting PlaceHolderVars in cases where pre-v16
 PG did not.

Commit 2489d76c4 removed some logic from pullup_replace_vars()
that avoided wrapping a PlaceHolderVar around a pulled-up
subquery output expression if the expression could be proven
to go to NULL anyway (because it contained Vars or PHVs of the
pulled-up relation and did not contain non-strict constructs).
But removing that logic turns out to cause performance regressions
in some cases, because the extra PHV prevents subexpression folding,
and will do so even if outer-join reduction later turns it into a
no-op with no phnullingrels bits.

The reason for always adding a PHV was to ensure we had someplace
to put the varnullingrels marker bits of the Var being replaced.
However, it turns out we can optimize in exactly the same cases that
the previous code did, because we can attach the needed varnullingrels
bits to the contained Var(s)/PHV(s).

This is not a complete solution --- it would be even better if we
could remove PHVs after reducing them to no-ops.  It doesn't look
practical to back-patch such an improvement, but this change seems
safe and at least gets rid of the performance-regression cases.

Per complaint from Nikhil Raj.  Back-patch to v16 where the
problem appeared.

Discussion: https://postgr.es/m/CAG1ps1xvnTZceKK24OUfMKLPvDP2vjT-d+F2AOCWbw_v3KeEgg@mail.gmail.com
---
 src/backend/optimizer/prep/prepjointree.c | 66 ++++++++++++++++++-----
 src/backend/rewrite/rewriteManip.c        |  9 ++--
 src/test/regress/expected/subselect.out   | 29 ++++++++++
 src/test/regress/sql/subselect.sql        | 18 +++++++
 4 files changed, 107 insertions(+), 15 deletions(-)

diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 969e257f70..34fbf8ee23 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -2490,14 +2490,48 @@ pullup_replace_vars_callback(Var *var,
                 else
                     wrap = false;
             }
+            else if (rcon->wrap_non_vars)
+            {
+                /* Caller told us to wrap all non-Vars in a PlaceHolderVar */
+                wrap = true;
+            }
             else
             {
                 /*
-                 * Must wrap, either because we need a place to insert
-                 * varnullingrels or because caller told us to wrap
-                 * everything.
+                 * If the node contains Var(s) or PlaceHolderVar(s) of the
+                 * subquery being pulled up, and does not contain any
+                 * non-strict constructs, then instead of adding a PHV on top
+                 * we can add the required nullingrels to those Vars/PHVs.
+                 * (This is fundamentally a generalization of the above cases
+                 * for bare Vars and PHVs.)
+                 *
+                 * This test is somewhat expensive, but it avoids pessimizing
+                 * the plan in cases where the nullingrels get removed again
+                 * later by outer join reduction.
+                 *
+                 * This analysis could be tighter: in particular, a non-strict
+                 * construct hidden within a lower-level PlaceHolderVar is not
+                 * reason to add another PHV.  But for now it doesn't seem
+                 * worth the code to be more exact.
+                 *
+                 * For a LATERAL subquery, we have to check the actual var
+                 * membership of the node, but if it's non-lateral then any
+                 * level-zero var must belong to the subquery.
                  */
-                wrap = true;
+                if ((rcon->target_rte->lateral ?
+                     bms_overlap(pull_varnos(rcon->root, newnode),
+                                 rcon->relids) :
+                     contain_vars_of_level(newnode, 0)) &&
+                    !contain_nonstrict_functions(newnode))
+                {
+                    /* No wrap needed */
+                    wrap = false;
+                }
+                else
+                {
+                    /* Else wrap it in a PlaceHolderVar */
+                    wrap = true;
+                }
             }

             if (wrap)
@@ -2518,18 +2552,14 @@ pullup_replace_vars_callback(Var *var,
         }
     }

-    /* Must adjust varlevelsup if replaced Var is within a subquery */
-    if (var->varlevelsup > 0)
-        IncrementVarSublevelsUp(newnode, var->varlevelsup, 0);
-
-    /* Propagate any varnullingrels into the replacement Var or PHV */
+    /* Propagate any varnullingrels into the replacement expression */
     if (var->varnullingrels != NULL)
     {
         if (IsA(newnode, Var))
         {
             Var           *newvar = (Var *) newnode;

-            Assert(newvar->varlevelsup == var->varlevelsup);
+            Assert(newvar->varlevelsup == 0);
             newvar->varnullingrels = bms_add_members(newvar->varnullingrels,
                                                      var->varnullingrels);
         }
@@ -2537,14 +2567,26 @@ pullup_replace_vars_callback(Var *var,
         {
             PlaceHolderVar *newphv = (PlaceHolderVar *) newnode;

-            Assert(newphv->phlevelsup == var->varlevelsup);
+            Assert(newphv->phlevelsup == 0);
             newphv->phnullingrels = bms_add_members(newphv->phnullingrels,
                                                     var->varnullingrels);
         }
         else
-            elog(ERROR, "failed to wrap a non-Var");
+        {
+            /* There should be lower-level Vars/PHVs we can modify */
+            newnode = add_nulling_relids(newnode,
+                                         NULL,    /* modify all Vars/PHVs */
+                                         var->varnullingrels);
+            /* Assert we did put the varnullingrels into the expression */
+            Assert(bms_is_subset(var->varnullingrels,
+                                 pull_varnos(rcon->root, newnode)));
+        }
     }

+    /* Must adjust varlevelsup if replaced Var is within a subquery */
+    if (var->varlevelsup > 0)
+        IncrementVarSublevelsUp(newnode, var->varlevelsup, 0);
+
     return newnode;
 }

diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c
index 191f2dc0b1..b20625fbd2 100644
--- a/src/backend/rewrite/rewriteManip.c
+++ b/src/backend/rewrite/rewriteManip.c
@@ -1141,7 +1141,8 @@ AddInvertedQual(Query *parsetree, Node *qual)
 /*
  * add_nulling_relids() finds Vars and PlaceHolderVars that belong to any
  * of the target_relids, and adds added_relids to their varnullingrels
- * and phnullingrels fields.
+ * and phnullingrels fields.  If target_relids is NULL, all level-zero
+ * Vars and PHVs are modified.
  */
 Node *
 add_nulling_relids(Node *node,
@@ -1170,7 +1171,8 @@ add_nulling_relids_mutator(Node *node,
         Var           *var = (Var *) node;

         if (var->varlevelsup == context->sublevels_up &&
-            bms_is_member(var->varno, context->target_relids))
+            (context->target_relids == NULL ||
+             bms_is_member(var->varno, context->target_relids)))
         {
             Relids        newnullingrels = bms_union(var->varnullingrels,
                                                    context->added_relids);
@@ -1188,7 +1190,8 @@ add_nulling_relids_mutator(Node *node,
         PlaceHolderVar *phv = (PlaceHolderVar *) node;

         if (phv->phlevelsup == context->sublevels_up &&
-            bms_overlap(phv->phrels, context->target_relids))
+            (context->target_relids == NULL ||
+             bms_overlap(phv->phrels, context->target_relids)))
         {
             Relids        newnullingrels = bms_union(phv->phnullingrels,
                                                    context->added_relids);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index 9eecdc1e92..2d35de3fad 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1721,6 +1721,35 @@ fetch backward all in c1;
 (2 rows)

 commit;
+--
+-- Verify that we correctly flatten cases involving a subquery output
+-- expression that doesn't need to be wrapped in a PlaceHolderVar
+--
+explain (costs off)
+select tname, attname from (
+select relname::information_schema.sql_identifier as tname, * from
+  (select * from pg_class c) ss1) ss2
+  right join pg_attribute a on a.attrelid = ss2.oid
+where tname = 'tenk1' and attnum = 1;
+                                QUERY PLAN
+--------------------------------------------------------------------------
+ Nested Loop
+   ->  Index Scan using pg_class_relname_nsp_index on pg_class c
+         Index Cond: (relname = 'tenk1'::name)
+   ->  Index Scan using pg_attribute_relid_attnum_index on pg_attribute a
+         Index Cond: ((attrelid = c.oid) AND (attnum = 1))
+(5 rows)
+
+select tname, attname from (
+select relname::information_schema.sql_identifier as tname, * from
+  (select * from pg_class c) ss1) ss2
+  right join pg_attribute a on a.attrelid = ss2.oid
+where tname = 'tenk1' and attnum = 1;
+ tname | attname
+-------+---------
+ tenk1 | unique1
+(1 row)
+
 --
 -- Tests for CTE inlining behavior
 --
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 75a9b718b2..af6e157aca 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -890,6 +890,24 @@ fetch backward all in c1;

 commit;

+--
+-- Verify that we correctly flatten cases involving a subquery output
+-- expression that doesn't need to be wrapped in a PlaceHolderVar
+--
+
+explain (costs off)
+select tname, attname from (
+select relname::information_schema.sql_identifier as tname, * from
+  (select * from pg_class c) ss1) ss2
+  right join pg_attribute a on a.attrelid = ss2.oid
+where tname = 'tenk1' and attnum = 1;
+
+select tname, attname from (
+select relname::information_schema.sql_identifier as tname, * from
+  (select * from pg_class c) ss1) ss2
+  right join pg_attribute a on a.attrelid = ss2.oid
+where tname = 'tenk1' and attnum = 1;
+
 --
 -- Tests for CTE inlining behavior
 --
--
2.43.5


pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Next
From: Andrew Dunstan
Date:
Subject: Re: pl/pgperl Patch for adding $_FN detail just like triggers have for $_TD