Re: Early WIP/PoC for inlining CTEs - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Early WIP/PoC for inlining CTEs
Date
Msg-id 11140.1548454956@sss.pgh.pa.us
Whole thread Raw
In response to Re: Early WIP/PoC for inlining CTEs  (Andreas Karlsson <andreas@proxel.se>)
Responses Re: Early WIP/PoC for inlining CTEs  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Early WIP/PoC for inlining CTEs  (Marko Tiikkaja <marko@joh.to>)
List pgsql-hackers
Andreas Karlsson <andreas@proxel.se> writes:
> [ inlining-ctes-v8.patch ]

I went ahead and pushed the stuff about QTW_EXAMINE_RTES_BEFORE/_AFTER,
because that seems like an independent change with other possible uses.

Attached is an updated version of the rest of the patch, with mostly
cosmetic changes.  I've not touched the documentation, but I think this
is otherwise committable if we are satisfied with the semantics.

However ... after thinking about it more, I'm not really satisfied
with that.  In particular I don't like the fact that by default this
will inline regardless of the number of references to the CTE.  I doubt
that inlining when there are multiple references is so likely to be a
win as to justify it being the default, especially given that it flies
in the face of what our documentation has said for as long as we've
had CTEs.

Therefore, I'm reversing my previous opinion that we should not have
an explicit NOT MATERIALIZED option.  I think we should add that, and
the behavior ought to be:

* No option given: inline if there's exactly one reference.

* With MATERIALIZED: never inline.

* With NOT MATERIALIZED: inline regardless of the number of references.

(Obviously, we should not inline if there's RECURSIVE or the CTE
potentially has side-effects, regardless of the user option;
I don't think those cases are up for debate.)

I haven't done anything about that here, but the changes would be pretty
minor.

            regards, tom lane

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index f177eba..6d456f6 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2925,6 +2925,7 @@ JumbleExpr(pgssJumbleState *jstate, Node *node)

                 /* we store the string name because RTE_CTE RTEs need it */
                 APP_JUMB_STRING(cte->ctename);
+                APP_JUMB(cte->ctematerialized);
                 JumbleQuery(jstate, castNode(Query, cte->ctequery));
             }
             break;
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bb92d9d..8c26dd1 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -1888,7 +1888,7 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t

 -- join in CTE
 EXPLAIN (VERBOSE, COSTS OFF)
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1
FROMt ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10; 
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1))
SELECTc1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10; 
                                                              QUERY PLAN
              

-------------------------------------------------------------------------------------------------------------------------------------
  Limit
@@ -1905,7 +1905,7 @@ WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2
                Output: t.c1_1, t.c2_1, t.c1_3
 (12 rows)

-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1
FROMt ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10; 
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1))
SELECTc1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10; 
  c1_1 | c2_1
 ------+------
   101 |  101
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index f438165..56602a1 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -493,8 +493,8 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE;
 -- join in CTE
 EXPLAIN (VERBOSE, COSTS OFF)
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1
FROMt ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10; 
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1
FROMt ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10; 
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1))
SELECTc1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10; 
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1))
SELECTc1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10; 
 -- ctid with whole-row reference
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index 88bc189..92f180c 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -2199,6 +2199,8 @@ SELECT n FROM t LIMIT 100;
   </para>

   <para>
+   TODO: Update this for inlining and <literal>MATERIALIZED</literal>.
+
    A useful property of <literal>WITH</literal> queries is that they are evaluated
    only once per execution of the parent query, even if they are referred to
    more than once by the parent query or sibling <literal>WITH</literal> queries.
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 4db8142..92ede4f 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -72,7 +72,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac

 <phrase>and <replaceable class="parameter">with_query</replaceable> is:</phrase>

-    <replaceable class="parameter">with_query_name</replaceable> [ ( <replaceable
class="parameter">column_name</replaceable>[, ...] ) ] AS ( <replaceable class="parameter">select</replaceable> |
<replaceableclass="parameter">values</replaceable> | <replaceable class="parameter">insert</replaceable> | <replaceable
class="parameter">update</replaceable>| <replaceable class="parameter">delete</replaceable> ) 
+    <replaceable class="parameter">with_query_name</replaceable> [ ( <replaceable
class="parameter">column_name</replaceable>[, ...] ) ] AS [ MATERIALIZED ] ( <replaceable
class="parameter">select</replaceable>| <replaceable class="parameter">values</replaceable> | <replaceable
class="parameter">insert</replaceable>| <replaceable class="parameter">update</replaceable> | <replaceable
class="parameter">delete</replaceable>) 

 TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
 </synopsis>
@@ -273,6 +273,8 @@ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
    </para>

    <para>
+    TODO: Update this for inlining and <literal>MATERIALIZED</literal>.
+
     A key property of <literal>WITH</literal> queries is that they
     are evaluated only once per execution of the primary query,
     even if the primary query refers to them more than once.
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 3eb7e95..3b6e206 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2536,6 +2536,7 @@ _copyCommonTableExpr(const CommonTableExpr *from)

     COPY_STRING_FIELD(ctename);
     COPY_NODE_FIELD(aliascolnames);
+    COPY_SCALAR_FIELD(ctematerialized);
     COPY_NODE_FIELD(ctequery);
     COPY_LOCATION_FIELD(location);
     COPY_SCALAR_FIELD(cterecursive);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 5c4fa7d..e7b0fae 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2791,6 +2791,7 @@ _equalCommonTableExpr(const CommonTableExpr *a, const CommonTableExpr *b)
 {
     COMPARE_STRING_FIELD(ctename);
     COMPARE_NODE_FIELD(aliascolnames);
+    COMPARE_SCALAR_FIELD(ctematerialized);
     COMPARE_NODE_FIELD(ctequery);
     COMPARE_LOCATION_FIELD(location);
     COMPARE_SCALAR_FIELD(cterecursive);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 0fde876..b8467cd 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2980,6 +2980,7 @@ _outCommonTableExpr(StringInfo str, const CommonTableExpr *node)

     WRITE_STRING_FIELD(ctename);
     WRITE_NODE_FIELD(aliascolnames);
+    WRITE_BOOL_FIELD(ctematerialized);
     WRITE_NODE_FIELD(ctequery);
     WRITE_LOCATION_FIELD(location);
     WRITE_BOOL_FIELD(cterecursive);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index ec6f256..c653ee3 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -418,6 +418,7 @@ _readCommonTableExpr(void)

     READ_STRING_FIELD(ctename);
     READ_NODE_FIELD(aliascolnames);
+    READ_BOOL_FIELD(ctematerialized);
     READ_NODE_FIELD(ctequery);
     READ_LOCATION_FIELD(location);
     READ_BOOL_FIELD(cterecursive);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 4465f00..67f276a 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -645,8 +645,8 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
     root->partColsUpdated = false;

     /*
-     * If there is a WITH list, process each WITH query and build an initplan
-     * SubPlan structure for it.
+     * If there is a WITH list, process each WITH query and either convert it
+     * to RTE_SUBQUERY RTE(s) or build an initplan SubPlan structure for it.
      */
     if (parse->cteList)
         SS_process_ctes(root);
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 64272dd..8dd6a20 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -57,6 +57,14 @@ typedef struct finalize_primnode_context
     Bitmapset  *paramids;        /* Non-local PARAM_EXEC paramids found */
 } finalize_primnode_context;

+typedef struct inline_cte_walker_context
+{
+    const char *ctename;        /* name and relative level of target CTE */
+    int            levelsup;
+    int            refcount;        /* number of remaining references */
+    Query       *ctequery;        /* query to substitute */
+} inline_cte_walker_context;
+

 static Node *build_subplan(PlannerInfo *root, Plan *plan, PlannerInfo *subroot,
               List *plan_params,
@@ -75,6 +83,10 @@ static Node *convert_testexpr_mutator(Node *node,
 static bool subplan_is_hashable(Plan *plan);
 static bool testexpr_is_hashable(Node *testexpr);
 static bool hash_ok_operator(OpExpr *expr);
+static bool contain_dml(Node *node);
+static bool contain_dml_walker(Node *node, void *context);
+static void inline_cte(PlannerInfo *root, CommonTableExpr *cte);
+static bool inline_cte_walker(Node *node, inline_cte_walker_context *context);
 static bool simplify_EXISTS_query(PlannerInfo *root, Query *query);
 static Query *convert_EXISTS_to_ANY(PlannerInfo *root, Query *subselect,
                       Node **testexpr, List **paramIds);
@@ -804,10 +816,13 @@ hash_ok_operator(OpExpr *expr)
 /*
  * SS_process_ctes: process a query's WITH list
  *
- * We plan each interesting WITH item and convert it to an initplan.
+ * Consider each CTE in the WITH list and either ignore it (if it's an
+ * unreferenced SELECT), "inline" it to create a regular sub-SELECT-in-FROM,
+ * or convert it to an initplan.
+ *
  * A side effect is to fill in root->cte_plan_ids with a list that
  * parallels root->parse->cteList and provides the subplan ID for
- * each CTE's initplan.
+ * each CTE's initplan, or a dummy ID (-1) if we didn't make an initplan.
  */
 void
 SS_process_ctes(PlannerInfo *root)
@@ -839,6 +854,39 @@ SS_process_ctes(PlannerInfo *root)
         }

         /*
+         * Consider inlining the CTE (creating RTE_SUBQUERY RTE(s)) instead of
+         * implementing it as a separately-planned CTE.
+         *
+         * We cannot inline if any of these conditions hold:
+         *
+         * 1. The user said not to (the MATERIALIZED option).
+         *
+         * 2. The CTE is recursive.
+         *
+         * 3. The CTE has side-effects; this includes either not being a plain
+         * SELECT, or containing volatile functions.  Inlining might change
+         * the side-effects, which is not good.
+         *
+         * Otherwise, we always inline.  This could be a net loss if the CTE
+         * is multiply referenced, but not necessarily so: the ability to
+         * absorb restrictions from the outer query level could outweigh the
+         * need for duplicate computations.  We do not have nearly enough
+         * information at this point to tell if that's true, so we make the
+         * user say MATERIALIZED if it's not.
+         */
+        if (!cte->ctematerialized &&
+            !cte->cterecursive &&
+            cmdType == CMD_SELECT &&
+            !contain_dml(cte->ctequery) &&
+            !contain_volatile_functions(cte->ctequery))
+        {
+            inline_cte(root, cte);
+            /* Make a dummy entry in cte_plan_ids */
+            root->cte_plan_ids = lappend_int(root->cte_plan_ids, -1);
+            continue;
+        }
+
+        /*
          * Copy the source Query node.  Probably not necessary, but let's keep
          * this similar to make_subplan.
          */
@@ -935,6 +983,127 @@ SS_process_ctes(PlannerInfo *root)
 }

 /*
+ * contain_dml: is any subquery not a plain SELECT?
+ *
+ * We reject SELECT FOR UPDATE/SHARE as well as INSERT etc.
+ */
+static bool
+contain_dml(Node *node)
+{
+    return contain_dml_walker(node, NULL);
+}
+
+static bool
+contain_dml_walker(Node *node, void *context)
+{
+    if (node == NULL)
+        return false;
+    if (IsA(node, Query))
+    {
+        Query       *query = (Query *) node;
+
+        if (query->commandType != CMD_SELECT ||
+            query->rowMarks != NIL)
+            return true;
+
+        return query_tree_walker(query, contain_dml_walker, context, 0);
+    }
+    return expression_tree_walker(node, contain_dml_walker, context);
+}
+
+/*
+ * inline_cte: convert RTE_CTE references to given CTE into RTE_SUBQUERYs
+ */
+static void
+inline_cte(PlannerInfo *root, CommonTableExpr *cte)
+{
+    struct inline_cte_walker_context context;
+
+    context.ctename = cte->ctename;
+    /* Start at levelsup = -1 because we'll immediately increment it */
+    context.levelsup = -1;
+    context.refcount = cte->cterefcount;
+    context.ctequery = castNode(Query, cte->ctequery);
+
+    (void) inline_cte_walker((Node *) root->parse, &context);
+
+    /* Assert we replaced all references */
+    Assert(context.refcount == 0);
+}
+
+static bool
+inline_cte_walker(Node *node, inline_cte_walker_context *context)
+{
+    if (node == NULL)
+        return false;
+    if (IsA(node, Query))
+    {
+        Query       *query = (Query *) node;
+
+        context->levelsup++;
+
+        /*
+         * Visit the query's RTE nodes after their contents; otherwise
+         * query_tree_walker would descend into the newly inlined CTE query,
+         * which we don't want.
+         */
+        (void) query_tree_walker(query, inline_cte_walker, context,
+                                 QTW_EXAMINE_RTES_AFTER);
+
+        context->levelsup--;
+
+        return false;
+    }
+    else if (IsA(node, RangeTblEntry))
+    {
+        RangeTblEntry *rte = (RangeTblEntry *) node;
+
+        if (rte->rtekind == RTE_CTE &&
+            strcmp(rte->ctename, context->ctename) == 0 &&
+            rte->ctelevelsup == context->levelsup)
+        {
+            /*
+             * Found a reference to replace.  Generate a copy of the CTE query
+             * with appropriate level adjustment for outer references (e.g.,
+             * to other CTEs).
+             */
+            Query       *newquery = copyObject(context->ctequery);
+
+            if (context->levelsup > 0)
+                IncrementVarSublevelsUp((Node *) newquery, context->levelsup, 1);
+
+            /*
+             * Convert the RTE_CTE RTE into a RTE_SUBQUERY.
+             *
+             * Historically, a FOR UPDATE clause has been treated as extending
+             * into views and subqueries, but not into CTEs.  We preserve this
+             * distinction by not trying to push rowmarks into the new
+             * subquery.
+             */
+            rte->rtekind = RTE_SUBQUERY;
+            rte->subquery = newquery;
+            rte->security_barrier = false;
+
+            /* Zero out CTE-specific fields */
+            rte->ctename = NULL;
+            rte->ctelevelsup = 0;
+            rte->self_reference = false;
+            rte->coltypes = NIL;
+            rte->coltypmods = NIL;
+            rte->colcollations = NIL;
+
+            /* Count the number of replacements we've done */
+            context->refcount--;
+        }
+
+        return false;
+    }
+
+    return expression_tree_walker(node, inline_cte_walker, context);
+}
+
+
+/*
  * convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join
  *
  * The caller has found an ANY SubLink at the top level of one of the query's
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c1faf41..5ffdd8f 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -518,7 +518,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
                 Bit ConstBit BitWithLength BitWithoutLength
 %type <str>        character
 %type <str>        extract_arg
-%type <boolean> opt_varying opt_timezone opt_no_inherit
+%type <boolean> opt_varying opt_timezone opt_no_inherit opt_materialized

 %type <ival>    Iconst SignedIconst
 %type <str>        Sconst comment_text notify_payload
@@ -11322,17 +11322,23 @@ cte_list:
         | cte_list ',' common_table_expr        { $$ = lappend($1, $3); }
         ;

-common_table_expr:  name opt_name_list AS '(' PreparableStmt ')'
+common_table_expr:  name opt_name_list AS opt_materialized '(' PreparableStmt ')'
             {
                 CommonTableExpr *n = makeNode(CommonTableExpr);
                 n->ctename = $1;
                 n->aliascolnames = $2;
-                n->ctequery = $5;
+                n->ctematerialized = $4;
+                n->ctequery = $6;
                 n->location = @1;
                 $$ = (Node *) n;
             }
         ;

+opt_materialized:
+        MATERIALIZED                            { $$ = true; }
+        | /*EMPTY*/                                { $$ = false; }
+        ;
+
 opt_with_clause:
         with_clause                                { $$ = $1; }
         | /*EMPTY*/                                { $$ = NULL; }
@@ -16214,6 +16220,7 @@ makeRecursiveViewSelect(char *relname, List *aliases, Node *query)
     /* create common table expression */
     cte->ctename = relname;
     cte->aliascolnames = aliases;
+    cte->ctematerialized = false;
     cte->ctequery = query;
     cte->location = -1;

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 302df16..eb0010d 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -5166,7 +5166,10 @@ get_with_clause(Query *query, deparse_context *context)
             }
             appendStringInfoChar(buf, ')');
         }
-        appendStringInfoString(buf, " AS (");
+        appendStringInfoString(buf, " AS ");
+        if (cte->ctematerialized)
+            appendStringInfoString(buf, "MATERIALIZED ");
+        appendStringInfoChar(buf, '(');
         if (PRETTY_INDENT(context))
             appendContextKeyword(context, "", 0, 0, 0);
         get_query_def((Query *) cte->ctequery, buf, context->namespaces, NULL,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index addc2c2..136f72c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1404,6 +1404,7 @@ typedef struct CommonTableExpr
     NodeTag        type;
     char       *ctename;        /* query name (never qualified) */
     List       *aliascolnames;    /* optional list of column names */
+    bool        ctematerialized;    /* is this an optimization fence? */
     /* SelectStmt/InsertStmt/etc before parse analysis, Query afterwards: */
     Node       *ctequery;        /* the CTE's subquery */
     int            location;        /* token location, or -1 if unknown */
diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out
index 1d12b01..2e17049 100644
--- a/src/test/regress/expected/rowsecurity.out
+++ b/src/test/regress/expected/rowsecurity.out
@@ -2179,7 +2179,7 @@ EXPLAIN (COSTS OFF) EXECUTE plancache_test;
    Filter: (((a % 2) = 0) AND f_leak(b))
 (2 rows)

-PREPARE plancache_test2 AS WITH q AS (SELECT * FROM z1 WHERE f_leak(b)) SELECT * FROM q,z2;
+PREPARE plancache_test2 AS WITH q AS MATERIALIZED (SELECT * FROM z1 WHERE f_leak(b)) SELECT * FROM q,z2;
 EXPLAIN (COSTS OFF) EXECUTE plancache_test2;
                    QUERY PLAN
 -------------------------------------------------
@@ -2192,7 +2192,7 @@ EXPLAIN (COSTS OFF) EXECUTE plancache_test2;
          ->  Seq Scan on z2
 (7 rows)

-PREPARE plancache_test3 AS WITH q AS (SELECT * FROM z2) SELECT * FROM q,z1 WHERE f_leak(z1.b);
+PREPARE plancache_test3 AS WITH q AS MATERIALIZED (SELECT * FROM z2) SELECT * FROM q,z1 WHERE f_leak(z1.b);
 EXPLAIN (COSTS OFF) EXECUTE plancache_test3;
                      QUERY PLAN
 -----------------------------------------------------
@@ -2826,7 +2826,7 @@ ALTER TABLE t1 ENABLE ROW LEVEL SECURITY;
 GRANT ALL ON t1 TO regress_rls_bob;
 INSERT INTO t1 (SELECT x, md5(x::text) FROM generate_series(0,20) x);
 SET SESSION AUTHORIZATION regress_rls_bob;
-WITH cte1 AS (SELECT * FROM t1 WHERE f_leak(b)) SELECT * FROM cte1;
+WITH cte1 AS MATERIALIZED (SELECT * FROM t1 WHERE f_leak(b)) SELECT * FROM cte1;
 NOTICE:  f_leak => cfcd208495d565ef66e7dff9f98764da
 NOTICE:  f_leak => c81e728d9d4c2f636f067f89cc14862c
 NOTICE:  f_leak => a87ff679a2f3e71d9181a67b7542122c
@@ -2853,7 +2853,8 @@ NOTICE:  f_leak => 98f13708210194c475687be6106a3b84
  20 | 98f13708210194c475687be6106a3b84
 (11 rows)

-EXPLAIN (COSTS OFF) WITH cte1 AS (SELECT * FROM t1 WHERE f_leak(b)) SELECT * FROM cte1;
+EXPLAIN (COSTS OFF)
+WITH cte1 AS MATERIALIZED (SELECT * FROM t1 WHERE f_leak(b)) SELECT * FROM cte1;
                    QUERY PLAN
 -------------------------------------------------
  CTE Scan on cte1
diff --git a/src/test/regress/expected/rowtypes.out b/src/test/regress/expected/rowtypes.out
index d6a1a33..14757b4 100644
--- a/src/test/regress/expected/rowtypes.out
+++ b/src/test/regress/expected/rowtypes.out
@@ -1037,7 +1037,7 @@ from (values (1,row(1,2)), (1,row(null,null)), (1,null),
 (6 rows)

 explain (verbose, costs off)
-with r(a,b) as
+with r(a,b) as materialized
   (values (1,row(1,2)), (1,row(null,null)), (1,null),
           (null,row(1,2)), (null,row(null,null)), (null,null) )
 select r, r is null as isnull, r is not null as isnotnull from r;
@@ -1050,7 +1050,7 @@ select r, r is null as isnull, r is not null as isnotnull from r;
            Output: "*VALUES*".column1, "*VALUES*".column2
 (5 rows)

-with r(a,b) as
+with r(a,b) as materialized
   (values (1,row(1,2)), (1,row(null,null)), (1,null),
           (null,row(1,2)), (null,row(null,null)), (null,null) )
 select r, r is null as isnull, r is not null as isnotnull from r;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e384cd2..bb10e27 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -3108,7 +3108,7 @@ explain (costs off) INSERT INTO hats VALUES ('h8', 'forbidden') RETURNING *;
 (5 rows)

 -- ensure upserting into a rule, with a CTE (different offsets!) works
-WITH data(hat_name, hat_color) AS (
+WITH data(hat_name, hat_color) AS MATERIALIZED (
     VALUES ('h8', 'green'),
         ('h9', 'blue'),
         ('h7', 'forbidden')
@@ -3122,7 +3122,8 @@ RETURNING *;
  h9         | blue
 (2 rows)

-EXPLAIN (costs off) WITH data(hat_name, hat_color) AS (
+EXPLAIN (costs off)
+WITH data(hat_name, hat_color) AS MATERIALIZED (
     VALUES ('h8', 'green'),
         ('h9', 'blue'),
         ('h7', 'forbidden')
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index 588d069..e2afd6f 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1154,3 +1154,117 @@ fetch backward all in c1;
 (2 rows)

 commit;
+--
+-- Tests for CTE inlining behavior
+--
+-- Basic subquery that can be inlined
+explain (verbose, costs off)
+with x as (select * from (select f1 from subselect_tbl) ss)
+select * from x where f1 = 1;
+            QUERY PLAN
+----------------------------------
+ Seq Scan on public.subselect_tbl
+   Output: subselect_tbl.f1
+   Filter: (subselect_tbl.f1 = 1)
+(3 rows)
+
+-- Explicitly request materialization
+explain (verbose, costs off)
+with x as materialized (select * from (select f1 from subselect_tbl) ss)
+select * from x where f1 = 1;
+                QUERY PLAN
+------------------------------------------
+ CTE Scan on x
+   Output: x.f1
+   Filter: (x.f1 = 1)
+   CTE x
+     ->  Seq Scan on public.subselect_tbl
+           Output: subselect_tbl.f1
+(6 rows)
+
+-- Stable functions are safe to inline
+explain (verbose, costs off)
+with x as (select * from (select f1, now() from subselect_tbl) ss)
+select * from x where f1 = 1;
+            QUERY PLAN
+-----------------------------------
+ Seq Scan on public.subselect_tbl
+   Output: subselect_tbl.f1, now()
+   Filter: (subselect_tbl.f1 = 1)
+(3 rows)
+
+-- Volatile functions prevent inlining
+explain (verbose, costs off)
+with x as (select * from (select f1, random() from subselect_tbl) ss)
+select * from x where f1 = 1;
+                  QUERY PLAN
+----------------------------------------------
+ CTE Scan on x
+   Output: x.f1, x.random
+   Filter: (x.f1 = 1)
+   CTE x
+     ->  Seq Scan on public.subselect_tbl
+           Output: subselect_tbl.f1, random()
+(6 rows)
+
+-- SELECT FOR UPDATE cannot be inlined
+explain (verbose, costs off)
+with x as (select * from (select f1 from subselect_tbl for update) ss)
+select * from x where f1 = 1;
+                             QUERY PLAN
+--------------------------------------------------------------------
+ CTE Scan on x
+   Output: x.f1
+   Filter: (x.f1 = 1)
+   CTE x
+     ->  Subquery Scan on ss
+           Output: ss.f1
+           ->  LockRows
+                 Output: subselect_tbl.f1, subselect_tbl.ctid
+                 ->  Seq Scan on public.subselect_tbl
+                       Output: subselect_tbl.f1, subselect_tbl.ctid
+(10 rows)
+
+-- Check handling of outer references
+explain (verbose, costs off)
+with x as (select * from int4_tbl)
+select * from (with y as (select * from x) select * from y) ss;
+         QUERY PLAN
+-----------------------------
+ Seq Scan on public.int4_tbl
+   Output: int4_tbl.f1
+(2 rows)
+
+explain (verbose, costs off)
+with x as materialized (select * from int4_tbl)
+select * from (with y as (select * from x) select * from y) ss;
+             QUERY PLAN
+-------------------------------------
+ CTE Scan on x
+   Output: x.f1
+   CTE x
+     ->  Seq Scan on public.int4_tbl
+           Output: int4_tbl.f1
+(5 rows)
+
+-- Ensure that we inline the currect CTE when there are
+-- multiple CTEs with the same name
+explain (verbose, costs off)
+with x as (select 1 as y)
+select * from (with x as (select 2 as y) select * from x) ss;
+ QUERY PLAN
+-------------
+ Result
+   Output: 2
+(2 rows)
+
+-- Row marks are not pushed into CTEs
+explain (verbose, costs off)
+with x as (select * from subselect_tbl)
+select * from x for update;
+                           QUERY PLAN
+----------------------------------------------------------------
+ Seq Scan on public.subselect_tbl
+   Output: subselect_tbl.f1, subselect_tbl.f2, subselect_tbl.f3
+(2 rows)
+
diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql
index 38e9b38..52da276 100644
--- a/src/test/regress/sql/rowsecurity.sql
+++ b/src/test/regress/sql/rowsecurity.sql
@@ -840,10 +840,10 @@ EXPLAIN (COSTS OFF) SELECT * FROM z1 WHERE f_leak(b);
 PREPARE plancache_test AS SELECT * FROM z1 WHERE f_leak(b);
 EXPLAIN (COSTS OFF) EXECUTE plancache_test;

-PREPARE plancache_test2 AS WITH q AS (SELECT * FROM z1 WHERE f_leak(b)) SELECT * FROM q,z2;
+PREPARE plancache_test2 AS WITH q AS MATERIALIZED (SELECT * FROM z1 WHERE f_leak(b)) SELECT * FROM q,z2;
 EXPLAIN (COSTS OFF) EXECUTE plancache_test2;

-PREPARE plancache_test3 AS WITH q AS (SELECT * FROM z2) SELECT * FROM q,z1 WHERE f_leak(z1.b);
+PREPARE plancache_test3 AS WITH q AS MATERIALIZED (SELECT * FROM z2) SELECT * FROM q,z1 WHERE f_leak(z1.b);
 EXPLAIN (COSTS OFF) EXECUTE plancache_test3;

 SET ROLE regress_rls_group1;
@@ -1071,8 +1071,9 @@ INSERT INTO t1 (SELECT x, md5(x::text) FROM generate_series(0,20) x);

 SET SESSION AUTHORIZATION regress_rls_bob;

-WITH cte1 AS (SELECT * FROM t1 WHERE f_leak(b)) SELECT * FROM cte1;
-EXPLAIN (COSTS OFF) WITH cte1 AS (SELECT * FROM t1 WHERE f_leak(b)) SELECT * FROM cte1;
+WITH cte1 AS MATERIALIZED (SELECT * FROM t1 WHERE f_leak(b)) SELECT * FROM cte1;
+EXPLAIN (COSTS OFF)
+WITH cte1 AS MATERIALIZED (SELECT * FROM t1 WHERE f_leak(b)) SELECT * FROM cte1;

 WITH cte1 AS (UPDATE t1 SET a = a + 1 RETURNING *) SELECT * FROM cte1; --fail
 WITH cte1 AS (UPDATE t1 SET a = a RETURNING *) SELECT * FROM cte1; --ok
diff --git a/src/test/regress/sql/rowtypes.sql b/src/test/regress/sql/rowtypes.sql
index e6d3898..b3b0774 100644
--- a/src/test/regress/sql/rowtypes.sql
+++ b/src/test/regress/sql/rowtypes.sql
@@ -428,12 +428,12 @@ from (values (1,row(1,2)), (1,row(null,null)), (1,null),
              (null,row(1,2)), (null,row(null,null)), (null,null) ) r(a,b);

 explain (verbose, costs off)
-with r(a,b) as
+with r(a,b) as materialized
   (values (1,row(1,2)), (1,row(null,null)), (1,null),
           (null,row(1,2)), (null,row(null,null)), (null,null) )
 select r, r is null as isnull, r is not null as isnotnull from r;

-with r(a,b) as
+with r(a,b) as materialized
   (values (1,row(1,2)), (1,row(null,null)), (1,null),
           (null,row(1,2)), (null,row(null,null)), (null,null) )
 select r, r is null as isnull, r is not null as isnotnull from r;
diff --git a/src/test/regress/sql/rules.sql b/src/test/regress/sql/rules.sql
index f4ee30e..6666951 100644
--- a/src/test/regress/sql/rules.sql
+++ b/src/test/regress/sql/rules.sql
@@ -1132,7 +1132,7 @@ SELECT tablename, rulename, definition FROM pg_rules
 explain (costs off) INSERT INTO hats VALUES ('h8', 'forbidden') RETURNING *;

 -- ensure upserting into a rule, with a CTE (different offsets!) works
-WITH data(hat_name, hat_color) AS (
+WITH data(hat_name, hat_color) AS MATERIALIZED (
     VALUES ('h8', 'green'),
         ('h9', 'blue'),
         ('h7', 'forbidden')
@@ -1140,7 +1140,8 @@ WITH data(hat_name, hat_color) AS (
 INSERT INTO hats
     SELECT * FROM data
 RETURNING *;
-EXPLAIN (costs off) WITH data(hat_name, hat_color) AS (
+EXPLAIN (costs off)
+WITH data(hat_name, hat_color) AS MATERIALIZED (
     VALUES ('h8', 'green'),
         ('h9', 'blue'),
         ('h7', 'forbidden')
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 843f511..3dba35e 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -625,3 +625,52 @@ move forward all in c1;
 fetch backward all in c1;

 commit;
+
+--
+-- Tests for CTE inlining behavior
+--
+
+-- Basic subquery that can be inlined
+explain (verbose, costs off)
+with x as (select * from (select f1 from subselect_tbl) ss)
+select * from x where f1 = 1;
+
+-- Explicitly request materialization
+explain (verbose, costs off)
+with x as materialized (select * from (select f1 from subselect_tbl) ss)
+select * from x where f1 = 1;
+
+-- Stable functions are safe to inline
+explain (verbose, costs off)
+with x as (select * from (select f1, now() from subselect_tbl) ss)
+select * from x where f1 = 1;
+
+-- Volatile functions prevent inlining
+explain (verbose, costs off)
+with x as (select * from (select f1, random() from subselect_tbl) ss)
+select * from x where f1 = 1;
+
+-- SELECT FOR UPDATE cannot be inlined
+explain (verbose, costs off)
+with x as (select * from (select f1 from subselect_tbl for update) ss)
+select * from x where f1 = 1;
+
+-- Check handling of outer references
+explain (verbose, costs off)
+with x as (select * from int4_tbl)
+select * from (with y as (select * from x) select * from y) ss;
+
+explain (verbose, costs off)
+with x as materialized (select * from int4_tbl)
+select * from (with y as (select * from x) select * from y) ss;
+
+-- Ensure that we inline the currect CTE when there are
+-- multiple CTEs with the same name
+explain (verbose, costs off)
+with x as (select 1 as y)
+select * from (with x as (select 2 as y) select * from x) ss;
+
+-- Row marks are not pushed into CTEs
+explain (verbose, costs off)
+with x as (select * from subselect_tbl)
+select * from x for update;

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: ATTACH/DETACH PARTITION CONCURRENTLY
Next
From: Merlin Moncure
Date:
Subject: Re: crosstab/repivot...any interest?