Re: Using Expanded Objects other than Arrays from plpgsql - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Using Expanded Objects other than Arrays from plpgsql
Date
Msg-id 256915.1738533419@sss.pgh.pa.us
Whole thread Raw
In response to Re: Using Expanded Objects other than Arrays from plpgsql  (Michel Pelletier <pelletier.michel@gmail.com>)
Responses Re: Using Expanded Objects other than Arrays from plpgsql
Re: Using Expanded Objects other than Arrays from plpgsql
List pgsql-hackers
I wrote:
> Hmm, it seemed to still apply for me.  But anyway, I needed to make
> the other changes, so here's v4.

I decided to see what would happen if we tried to avoid the code
duplication in pl_funcs.c by making some "walker" infrastructure
akin to expression_tree_walker.  While that doesn't seem useful
for the dump_xxx functions, it works very nicely for the free_xxx
functions and now for the mark_xxx ones as well.  pl_funcs.c
nets out about 400 lines shorter than in the v4 patch.  The
code coverage score for the file is still awful :-(, but that's
because we're not testing the dump_xxx functions at all.

PFA v5.  The new 0001 patch refactors the free_xxx infrastructure
to create plpgsql_statement_tree_walker(), and then in what's now
0003 we can use that instead of writing a lot of duplicate code.

            regards, tom lane

From bec67b472a0f9b237c5ed1feffd01ee4428b0688 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 2 Feb 2025 16:05:01 -0500
Subject: [PATCH v5 1/5] Refactor pl_funcs.c to provide a usage-independent
 tree walker.

We haven't done this up to now because there was only one use-case,
namely plpgsql_free_function_memory's search for expressions to clean
up.  However an upcoming patch has another need for walking plpgsql
functions' statement trees, so let's create sharable tree-walker
infrastructure in the same style as expression_tree_walker().

This patch actually makes the code shorter, although that's
mainly down to having used a more compact coding style.  (I didn't
write a separate subroutine for each statement type, and I made
use of some newer notations like foreach_ptr.)

Discussion: https://postgr.es/m/CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com
---
 src/pl/plpgsql/src/pl_funcs.c | 582 ++++++++++++++--------------------
 1 file changed, 244 insertions(+), 338 deletions(-)

diff --git a/src/pl/plpgsql/src/pl_funcs.c b/src/pl/plpgsql/src/pl_funcs.c
index 8c827fe5cc..88e25b54bc 100644
--- a/src/pl/plpgsql/src/pl_funcs.c
+++ b/src/pl/plpgsql/src/pl_funcs.c
@@ -334,387 +334,291 @@ plpgsql_getdiag_kindname(PLpgSQL_getdiag_kind kind)


 /**********************************************************************
- * Release memory when a PL/pgSQL function is no longer needed
+ * Support for recursing through a PL/pgSQL statement tree
  *
- * The code for recursing through the function tree is really only
- * needed to locate PLpgSQL_expr nodes, which may contain references
- * to saved SPI Plans that must be freed.  The function tree itself,
- * along with subsidiary data, is freed in one swoop by freeing the
- * function's permanent memory context.
+ * The point of this code is to encapsulate knowledge of where the
+ * sub-statements and expressions are in a statement tree, avoiding
+ * duplication of code.  The caller supplies two callbacks, one to
+ * be invoked on statements and one to be invoked on expressions.
+ * (The recursion should be started by invoking the statement callback
+ * on function->action.)  The statement callback should do any
+ * statement-type-specific action it needs, then recurse by calling
+ * plpgsql_statement_tree_walker().  The expression callback can be a
+ * no-op if no per-expression behavior is needed.
  **********************************************************************/
-static void free_stmt(PLpgSQL_stmt *stmt);
-static void free_block(PLpgSQL_stmt_block *block);
-static void free_assign(PLpgSQL_stmt_assign *stmt);
-static void free_if(PLpgSQL_stmt_if *stmt);
-static void free_case(PLpgSQL_stmt_case *stmt);
-static void free_loop(PLpgSQL_stmt_loop *stmt);
-static void free_while(PLpgSQL_stmt_while *stmt);
-static void free_fori(PLpgSQL_stmt_fori *stmt);
-static void free_fors(PLpgSQL_stmt_fors *stmt);
-static void free_forc(PLpgSQL_stmt_forc *stmt);
-static void free_foreach_a(PLpgSQL_stmt_foreach_a *stmt);
-static void free_exit(PLpgSQL_stmt_exit *stmt);
-static void free_return(PLpgSQL_stmt_return *stmt);
-static void free_return_next(PLpgSQL_stmt_return_next *stmt);
-static void free_return_query(PLpgSQL_stmt_return_query *stmt);
-static void free_raise(PLpgSQL_stmt_raise *stmt);
-static void free_assert(PLpgSQL_stmt_assert *stmt);
-static void free_execsql(PLpgSQL_stmt_execsql *stmt);
-static void free_dynexecute(PLpgSQL_stmt_dynexecute *stmt);
-static void free_dynfors(PLpgSQL_stmt_dynfors *stmt);
-static void free_getdiag(PLpgSQL_stmt_getdiag *stmt);
-static void free_open(PLpgSQL_stmt_open *stmt);
-static void free_fetch(PLpgSQL_stmt_fetch *stmt);
-static void free_close(PLpgSQL_stmt_close *stmt);
-static void free_perform(PLpgSQL_stmt_perform *stmt);
-static void free_call(PLpgSQL_stmt_call *stmt);
-static void free_commit(PLpgSQL_stmt_commit *stmt);
-static void free_rollback(PLpgSQL_stmt_rollback *stmt);
-static void free_expr(PLpgSQL_expr *expr);
+typedef void (*plpgsql_stmt_walker_callback) (PLpgSQL_stmt *stmt,
+                                              void *context);
+typedef void (*plpgsql_expr_walker_callback) (PLpgSQL_expr *expr,
+                                              void *context);

+/*
+ * As in nodeFuncs.h, we respectfully decline to support the C standard's
+ * position that a pointer to struct is incompatible with "void *".  Instead,
+ * silence related compiler warnings using casts in this macro wrapper.
+ */
+#define plpgsql_statement_tree_walker(s, sw, ew, c) \
+    plpgsql_statement_tree_walker_impl(s, (plpgsql_stmt_walker_callback) (sw), \
+                                       (plpgsql_expr_walker_callback) (ew), c)

 static void
-free_stmt(PLpgSQL_stmt *stmt)
+plpgsql_statement_tree_walker_impl(PLpgSQL_stmt *stmt,
+                                   plpgsql_stmt_walker_callback stmt_callback,
+                                   plpgsql_expr_walker_callback expr_callback,
+                                   void *context)
 {
+#define S_WALK(st) stmt_callback(st, context)
+#define E_WALK(ex) expr_callback(ex, context)
+#define S_LIST_WALK(lst) foreach_ptr(PLpgSQL_stmt, st, lst) S_WALK(st)
+#define E_LIST_WALK(lst) foreach_ptr(PLpgSQL_expr, ex, lst) E_WALK(ex)
+
     switch (stmt->cmd_type)
     {
         case PLPGSQL_STMT_BLOCK:
-            free_block((PLpgSQL_stmt_block *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_block *bstmt = (PLpgSQL_stmt_block *) stmt;
+
+                S_LIST_WALK(bstmt->body);
+                if (bstmt->exceptions)
+                {
+                    foreach_ptr(PLpgSQL_exception, exc, bstmt->exceptions->exc_list)
+                    {
+                        /* conditions list has no interesting sub-structure */
+                        S_LIST_WALK(exc->action);
+                    }
+                }
+                break;
+            }
         case PLPGSQL_STMT_ASSIGN:
-            free_assign((PLpgSQL_stmt_assign *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_assign *astmt = (PLpgSQL_stmt_assign *) stmt;
+
+                E_WALK(astmt->expr);
+                break;
+            }
         case PLPGSQL_STMT_IF:
-            free_if((PLpgSQL_stmt_if *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_if *ifstmt = (PLpgSQL_stmt_if *) stmt;
+
+                E_WALK(ifstmt->cond);
+                S_LIST_WALK(ifstmt->then_body);
+                foreach_ptr(PLpgSQL_if_elsif, elif, ifstmt->elsif_list)
+                {
+                    E_WALK(elif->cond);
+                    S_LIST_WALK(elif->stmts);
+                }
+                S_LIST_WALK(ifstmt->else_body);
+                break;
+            }
         case PLPGSQL_STMT_CASE:
-            free_case((PLpgSQL_stmt_case *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_case *cstmt = (PLpgSQL_stmt_case *) stmt;
+
+                E_WALK(cstmt->t_expr);
+                foreach_ptr(PLpgSQL_case_when, cwt, cstmt->case_when_list)
+                {
+                    E_WALK(cwt->expr);
+                    S_LIST_WALK(cwt->stmts);
+                }
+                S_LIST_WALK(cstmt->else_stmts);
+                break;
+            }
         case PLPGSQL_STMT_LOOP:
-            free_loop((PLpgSQL_stmt_loop *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_loop *lstmt = (PLpgSQL_stmt_loop *) stmt;
+
+                S_LIST_WALK(lstmt->body);
+                break;
+            }
         case PLPGSQL_STMT_WHILE:
-            free_while((PLpgSQL_stmt_while *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_while *wstmt = (PLpgSQL_stmt_while *) stmt;
+
+                E_WALK(wstmt->cond);
+                S_LIST_WALK(wstmt->body);
+                break;
+            }
         case PLPGSQL_STMT_FORI:
-            free_fori((PLpgSQL_stmt_fori *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_fori *fori = (PLpgSQL_stmt_fori *) stmt;
+
+                E_WALK(fori->lower);
+                E_WALK(fori->upper);
+                E_WALK(fori->step);
+                S_LIST_WALK(fori->body);
+                break;
+            }
         case PLPGSQL_STMT_FORS:
-            free_fors((PLpgSQL_stmt_fors *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_fors *fors = (PLpgSQL_stmt_fors *) stmt;
+
+                S_LIST_WALK(fors->body);
+                E_WALK(fors->query);
+                break;
+            }
         case PLPGSQL_STMT_FORC:
-            free_forc((PLpgSQL_stmt_forc *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_forc *forc = (PLpgSQL_stmt_forc *) stmt;
+
+                S_LIST_WALK(forc->body);
+                E_WALK(forc->argquery);
+                break;
+            }
         case PLPGSQL_STMT_FOREACH_A:
-            free_foreach_a((PLpgSQL_stmt_foreach_a *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_foreach_a *fstmt = (PLpgSQL_stmt_foreach_a *) stmt;
+
+                E_WALK(fstmt->expr);
+                S_LIST_WALK(fstmt->body);
+                break;
+            }
         case PLPGSQL_STMT_EXIT:
-            free_exit((PLpgSQL_stmt_exit *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_exit *estmt = (PLpgSQL_stmt_exit *) stmt;
+
+                E_WALK(estmt->cond);
+                break;
+            }
         case PLPGSQL_STMT_RETURN:
-            free_return((PLpgSQL_stmt_return *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_return *rstmt = (PLpgSQL_stmt_return *) stmt;
+
+                E_WALK(rstmt->expr);
+                break;
+            }
         case PLPGSQL_STMT_RETURN_NEXT:
-            free_return_next((PLpgSQL_stmt_return_next *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_return_next *rstmt = (PLpgSQL_stmt_return_next *) stmt;
+
+                E_WALK(rstmt->expr);
+                break;
+            }
         case PLPGSQL_STMT_RETURN_QUERY:
-            free_return_query((PLpgSQL_stmt_return_query *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_return_query *rstmt = (PLpgSQL_stmt_return_query *) stmt;
+
+                E_WALK(rstmt->query);
+                E_WALK(rstmt->dynquery);
+                E_LIST_WALK(rstmt->params);
+                break;
+            }
         case PLPGSQL_STMT_RAISE:
-            free_raise((PLpgSQL_stmt_raise *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_raise *rstmt = (PLpgSQL_stmt_raise *) stmt;
+
+                E_LIST_WALK(rstmt->params);
+                foreach_ptr(PLpgSQL_raise_option, opt, rstmt->options)
+                {
+                    E_WALK(opt->expr);
+                }
+                break;
+            }
         case PLPGSQL_STMT_ASSERT:
-            free_assert((PLpgSQL_stmt_assert *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_assert *astmt = (PLpgSQL_stmt_assert *) stmt;
+
+                E_WALK(astmt->cond);
+                E_WALK(astmt->message);
+                break;
+            }
         case PLPGSQL_STMT_EXECSQL:
-            free_execsql((PLpgSQL_stmt_execsql *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_execsql *xstmt = (PLpgSQL_stmt_execsql *) stmt;
+
+                E_WALK(xstmt->sqlstmt);
+                break;
+            }
         case PLPGSQL_STMT_DYNEXECUTE:
-            free_dynexecute((PLpgSQL_stmt_dynexecute *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_dynexecute *dstmt = (PLpgSQL_stmt_dynexecute *) stmt;
+
+                E_WALK(dstmt->query);
+                E_LIST_WALK(dstmt->params);
+                break;
+            }
         case PLPGSQL_STMT_DYNFORS:
-            free_dynfors((PLpgSQL_stmt_dynfors *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_dynfors *dstmt = (PLpgSQL_stmt_dynfors *) stmt;
+
+                S_LIST_WALK(dstmt->body);
+                E_WALK(dstmt->query);
+                E_LIST_WALK(dstmt->params);
+                break;
+            }
         case PLPGSQL_STMT_GETDIAG:
-            free_getdiag((PLpgSQL_stmt_getdiag *) stmt);
-            break;
+            {
+                /* no interesting sub-structure */
+                break;
+            }
         case PLPGSQL_STMT_OPEN:
-            free_open((PLpgSQL_stmt_open *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_open *ostmt = (PLpgSQL_stmt_open *) stmt;
+
+                E_WALK(ostmt->argquery);
+                E_WALK(ostmt->query);
+                E_WALK(ostmt->dynquery);
+                E_LIST_WALK(ostmt->params);
+                break;
+            }
         case PLPGSQL_STMT_FETCH:
-            free_fetch((PLpgSQL_stmt_fetch *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_fetch *fstmt = (PLpgSQL_stmt_fetch *) stmt;
+
+                E_WALK(fstmt->expr);
+                break;
+            }
         case PLPGSQL_STMT_CLOSE:
-            free_close((PLpgSQL_stmt_close *) stmt);
-            break;
+            {
+                /* no interesting sub-structure */
+                break;
+            }
         case PLPGSQL_STMT_PERFORM:
-            free_perform((PLpgSQL_stmt_perform *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_perform *pstmt = (PLpgSQL_stmt_perform *) stmt;
+
+                E_WALK(pstmt->expr);
+                break;
+            }
         case PLPGSQL_STMT_CALL:
-            free_call((PLpgSQL_stmt_call *) stmt);
-            break;
+            {
+                PLpgSQL_stmt_call *cstmt = (PLpgSQL_stmt_call *) stmt;
+
+                E_WALK(cstmt->expr);
+                break;
+            }
         case PLPGSQL_STMT_COMMIT:
-            free_commit((PLpgSQL_stmt_commit *) stmt);
-            break;
         case PLPGSQL_STMT_ROLLBACK:
-            free_rollback((PLpgSQL_stmt_rollback *) stmt);
-            break;
+            {
+                /* no interesting sub-structure */
+                break;
+            }
         default:
             elog(ERROR, "unrecognized cmd_type: %d", stmt->cmd_type);
             break;
     }
 }

-static void
-free_stmts(List *stmts)
-{
-    ListCell   *s;
-
-    foreach(s, stmts)
-    {
-        free_stmt((PLpgSQL_stmt *) lfirst(s));
-    }
-}
-
-static void
-free_block(PLpgSQL_stmt_block *block)
-{
-    free_stmts(block->body);
-    if (block->exceptions)
-    {
-        ListCell   *e;
-
-        foreach(e, block->exceptions->exc_list)
-        {
-            PLpgSQL_exception *exc = (PLpgSQL_exception *) lfirst(e);
-
-            free_stmts(exc->action);
-        }
-    }
-}
-
-static void
-free_assign(PLpgSQL_stmt_assign *stmt)
-{
-    free_expr(stmt->expr);
-}
-
-static void
-free_if(PLpgSQL_stmt_if *stmt)
-{
-    ListCell   *l;
-
-    free_expr(stmt->cond);
-    free_stmts(stmt->then_body);
-    foreach(l, stmt->elsif_list)
-    {
-        PLpgSQL_if_elsif *elif = (PLpgSQL_if_elsif *) lfirst(l);
-
-        free_expr(elif->cond);
-        free_stmts(elif->stmts);
-    }
-    free_stmts(stmt->else_body);
-}
-
-static void
-free_case(PLpgSQL_stmt_case *stmt)
-{
-    ListCell   *l;
-
-    free_expr(stmt->t_expr);
-    foreach(l, stmt->case_when_list)
-    {
-        PLpgSQL_case_when *cwt = (PLpgSQL_case_when *) lfirst(l);
-
-        free_expr(cwt->expr);
-        free_stmts(cwt->stmts);
-    }
-    free_stmts(stmt->else_stmts);
-}
-
-static void
-free_loop(PLpgSQL_stmt_loop *stmt)
-{
-    free_stmts(stmt->body);
-}
-
-static void
-free_while(PLpgSQL_stmt_while *stmt)
-{
-    free_expr(stmt->cond);
-    free_stmts(stmt->body);
-}
-
-static void
-free_fori(PLpgSQL_stmt_fori *stmt)
-{
-    free_expr(stmt->lower);
-    free_expr(stmt->upper);
-    free_expr(stmt->step);
-    free_stmts(stmt->body);
-}
-
-static void
-free_fors(PLpgSQL_stmt_fors *stmt)
-{
-    free_stmts(stmt->body);
-    free_expr(stmt->query);
-}
-
-static void
-free_forc(PLpgSQL_stmt_forc *stmt)
-{
-    free_stmts(stmt->body);
-    free_expr(stmt->argquery);
-}
-
-static void
-free_foreach_a(PLpgSQL_stmt_foreach_a *stmt)
-{
-    free_expr(stmt->expr);
-    free_stmts(stmt->body);
-}
-
-static void
-free_open(PLpgSQL_stmt_open *stmt)
-{
-    ListCell   *lc;
-
-    free_expr(stmt->argquery);
-    free_expr(stmt->query);
-    free_expr(stmt->dynquery);
-    foreach(lc, stmt->params)
-    {
-        free_expr((PLpgSQL_expr *) lfirst(lc));
-    }
-}
-
-static void
-free_fetch(PLpgSQL_stmt_fetch *stmt)
-{
-    free_expr(stmt->expr);
-}

-static void
-free_close(PLpgSQL_stmt_close *stmt)
-{
-}
-
-static void
-free_perform(PLpgSQL_stmt_perform *stmt)
-{
-    free_expr(stmt->expr);
-}
-
-static void
-free_call(PLpgSQL_stmt_call *stmt)
-{
-    free_expr(stmt->expr);
-}
-
-static void
-free_commit(PLpgSQL_stmt_commit *stmt)
-{
-}
-
-static void
-free_rollback(PLpgSQL_stmt_rollback *stmt)
-{
-}
-
-static void
-free_exit(PLpgSQL_stmt_exit *stmt)
-{
-    free_expr(stmt->cond);
-}
-
-static void
-free_return(PLpgSQL_stmt_return *stmt)
-{
-    free_expr(stmt->expr);
-}
-
-static void
-free_return_next(PLpgSQL_stmt_return_next *stmt)
-{
-    free_expr(stmt->expr);
-}
-
-static void
-free_return_query(PLpgSQL_stmt_return_query *stmt)
-{
-    ListCell   *lc;
-
-    free_expr(stmt->query);
-    free_expr(stmt->dynquery);
-    foreach(lc, stmt->params)
-    {
-        free_expr((PLpgSQL_expr *) lfirst(lc));
-    }
-}
-
-static void
-free_raise(PLpgSQL_stmt_raise *stmt)
-{
-    ListCell   *lc;
-
-    foreach(lc, stmt->params)
-    {
-        free_expr((PLpgSQL_expr *) lfirst(lc));
-    }
-    foreach(lc, stmt->options)
-    {
-        PLpgSQL_raise_option *opt = (PLpgSQL_raise_option *) lfirst(lc);
-
-        free_expr(opt->expr);
-    }
-}
-
-static void
-free_assert(PLpgSQL_stmt_assert *stmt)
-{
-    free_expr(stmt->cond);
-    free_expr(stmt->message);
-}
-
-static void
-free_execsql(PLpgSQL_stmt_execsql *stmt)
-{
-    free_expr(stmt->sqlstmt);
-}
-
-static void
-free_dynexecute(PLpgSQL_stmt_dynexecute *stmt)
-{
-    ListCell   *lc;
-
-    free_expr(stmt->query);
-    foreach(lc, stmt->params)
-    {
-        free_expr((PLpgSQL_expr *) lfirst(lc));
-    }
-}
-
-static void
-free_dynfors(PLpgSQL_stmt_dynfors *stmt)
-{
-    ListCell   *lc;
-
-    free_stmts(stmt->body);
-    free_expr(stmt->query);
-    foreach(lc, stmt->params)
-    {
-        free_expr((PLpgSQL_expr *) lfirst(lc));
-    }
-}
+/**********************************************************************
+ * Release memory when a PL/pgSQL function is no longer needed
+ *
+ * This code only needs to deal with cleaning up PLpgSQL_expr nodes,
+ * which may contain references to saved SPI Plans that must be freed.
+ * The function tree itself, along with subsidiary data, is freed in
+ * one swoop by freeing the function's permanent memory context.
+ **********************************************************************/
+static void free_stmt(PLpgSQL_stmt *stmt, void *context);
+static void free_expr(PLpgSQL_expr *expr, void *context);

 static void
-free_getdiag(PLpgSQL_stmt_getdiag *stmt)
+free_stmt(PLpgSQL_stmt *stmt, void *context)
 {
+    if (stmt == NULL)
+        return;
+    plpgsql_statement_tree_walker(stmt, free_stmt, free_expr, NULL);
 }

 static void
-free_expr(PLpgSQL_expr *expr)
+free_expr(PLpgSQL_expr *expr, void *context)
 {
     if (expr && expr->plan)
     {
@@ -743,8 +647,8 @@ plpgsql_free_function_memory(PLpgSQL_function *func)
                 {
                     PLpgSQL_var *var = (PLpgSQL_var *) d;

-                    free_expr(var->default_val);
-                    free_expr(var->cursor_explicit_expr);
+                    free_expr(var->default_val, NULL);
+                    free_expr(var->cursor_explicit_expr, NULL);
                 }
                 break;
             case PLPGSQL_DTYPE_ROW:
@@ -753,7 +657,7 @@ plpgsql_free_function_memory(PLpgSQL_function *func)
                 {
                     PLpgSQL_rec *rec = (PLpgSQL_rec *) d;

-                    free_expr(rec->default_val);
+                    free_expr(rec->default_val, NULL);
                 }
                 break;
             case PLPGSQL_DTYPE_RECFIELD:
@@ -765,8 +669,7 @@ plpgsql_free_function_memory(PLpgSQL_function *func)
     func->ndatums = 0;

     /* Release plans in statement tree */
-    if (func->action)
-        free_block(func->action);
+    free_stmt((PLpgSQL_stmt *) func->action, NULL);
     func->action = NULL;

     /*
@@ -782,6 +685,9 @@ plpgsql_free_function_memory(PLpgSQL_function *func)

 /**********************************************************************
  * Debug functions for analyzing the compiled code
+ *
+ * Sadly, there doesn't seem to be any way to let plpgsql_statement_tree_walker
+ * bear some of the burden for this.
  **********************************************************************/
 static int    dump_indent;

--
2.43.5

From 9ffc27937b8d21336f20cbe7490a0d5e2f244a84 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 2 Feb 2025 16:05:38 -0500
Subject: [PATCH v5 2/5] Preliminary refactoring.

This short and boring patch simply moves the responsibility for
initializing PLpgSQL_expr.target_param into plpgsql parsing,
rather than doing it at first execution of the expr as before.
This doesn't save anything in terms of runtime, since the work was
trivial and done only once per expr anyway.  But it makes the info
available during parsing, which will be useful for the next step.

Likewise set PLpgSQL_expr.func during parsing.  According to the
comments, this was once impossible; but it's certainly possible
since we invented the plpgsql_curr_compile variable.  Again, this
saves little runtime, but it seems far cleaner conceptually.

While at it, I reordered stuff in struct PLpgSQL_expr to make it
clearer which fields are filled when, and merged some duplicative
code in pl_gram.y.

Discussion: https://postgr.es/m/CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com
---
 src/pl/plpgsql/src/pl_exec.c | 27 ---------------
 src/pl/plpgsql/src/pl_gram.y | 65 ++++++++++++++++++++++++------------
 src/pl/plpgsql/src/plpgsql.h | 31 +++++++++--------
 3 files changed, 62 insertions(+), 61 deletions(-)

diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 35cda55cf9..fec1811ae1 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -4174,12 +4174,6 @@ exec_prepare_plan(PLpgSQL_execstate *estate,
     SPIPlanPtr    plan;
     SPIPrepareOptions options;

-    /*
-     * The grammar can't conveniently set expr->func while building the parse
-     * tree, so make sure it's set before parser hooks need it.
-     */
-    expr->func = estate->func;
-
     /*
      * Generate and save the plan
      */
@@ -5016,21 +5010,7 @@ exec_assign_expr(PLpgSQL_execstate *estate, PLpgSQL_datum *target,
      * If first time through, create a plan for this expression.
      */
     if (expr->plan == NULL)
-    {
-        /*
-         * Mark the expression as being an assignment source, if target is a
-         * simple variable.  (This is a bit messy, but it seems cleaner than
-         * modifying the API of exec_prepare_plan for the purpose.  We need to
-         * stash the target dno into the expr anyway, so that it will be
-         * available if we have to replan.)
-         */
-        if (target->dtype == PLPGSQL_DTYPE_VAR)
-            expr->target_param = target->dno;
-        else
-            expr->target_param = -1;    /* should be that already */
-
         exec_prepare_plan(estate, expr, 0);
-    }

     value = exec_eval_expr(estate, expr, &isnull, &valtype, &valtypmod);
     exec_assign_value(estate, target, value, isnull, valtype, valtypmod);
@@ -6282,13 +6262,6 @@ setup_param_list(PLpgSQL_execstate *estate, PLpgSQL_expr *expr)
          * that they are interrupting an active use of parameters.
          */
         paramLI->parserSetupArg = expr;
-
-        /*
-         * Also make sure this is set before parser hooks need it.  There is
-         * no need to save and restore, since the value is always correct once
-         * set.  (Should be set already, but let's be sure.)
-         */
-        expr->func = estate->func;
     }
     else
     {
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 64d2c362bf..f55aefb100 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -61,6 +61,10 @@ static    bool            tok_is_keyword(int token, union YYSTYPE *lval,
 static    void            word_is_not_variable(PLword *word, int location, yyscan_t yyscanner);
 static    void            cword_is_not_variable(PLcword *cword, int location, yyscan_t yyscanner);
 static    void            current_token_is_not_variable(int tok, YYSTYPE *yylvalp, YYLTYPE *yyllocp, yyscan_t
yyscanner);
+static    PLpgSQL_expr    *make_plpgsql_expr(const char *query,
+                                           RawParseMode parsemode);
+static    void            mark_expr_as_assignment_source(PLpgSQL_expr *expr,
+                                                       PLpgSQL_datum *target);
 static    PLpgSQL_expr    *read_sql_construct(int until,
                                             int until2,
                                             int until3,
@@ -536,6 +540,10 @@ decl_statement    : decl_varname decl_const decl_datatype decl_collate decl_notnull
                                      errmsg("variable \"%s\" must have a default value, since it's declared NOT NULL",
                                             var->refname),
                                      parser_errposition(@5)));
+
+                        if (var->default_val != NULL)
+                            mark_expr_as_assignment_source(var->default_val,
+                                                           (PLpgSQL_datum *) var);
                     }
                 | decl_varname K_ALIAS K_FOR decl_aliasitem ';'
                     {
@@ -996,6 +1004,7 @@ stmt_assign        : T_DATUM
                                                        false, true,
                                                        NULL, NULL,
                                                        &yylval, &yylloc, yyscanner);
+                        mark_expr_as_assignment_source(new->expr, $1.datum);

                         $$ = (PLpgSQL_stmt *) new;
                     }
@@ -2651,6 +2660,38 @@ current_token_is_not_variable(int tok, YYSTYPE *yylvalp, YYLTYPE *yyllocp, yysca
         yyerror(yyllocp, NULL, yyscanner, "syntax error");
 }

+/* Convenience routine to construct a PLpgSQL_expr struct */
+static PLpgSQL_expr *
+make_plpgsql_expr(const char *query,
+                  RawParseMode parsemode)
+{
+    PLpgSQL_expr *expr = palloc0(sizeof(PLpgSQL_expr));
+
+    expr->query = pstrdup(query);
+    expr->parseMode = parsemode;
+    expr->func = plpgsql_curr_compile;
+    expr->ns = plpgsql_ns_top();
+    /* might get changed later during parsing: */
+    expr->target_param = -1;
+    /* other fields are left as zeroes until first execution */
+    return expr;
+}
+
+/* Mark a PLpgSQL_expr as being the source of an assignment to target */
+static void
+mark_expr_as_assignment_source(PLpgSQL_expr *expr, PLpgSQL_datum *target)
+{
+    /*
+     * Mark the expression as being an assignment source, if target is a
+     * simple variable.  We don't currently support optimized assignments to
+     * other DTYPEs, so no need to mark in other cases.
+     */
+    if (target->dtype == PLPGSQL_DTYPE_VAR)
+        expr->target_param = target->dno;
+    else
+        expr->target_param = -1;    /* should be that already */
+}
+
 /* Convenience routine to read an expression with one possible terminator */
 static PLpgSQL_expr *
 read_sql_expression(int until, const char *expected, YYSTYPE *yylvalp, YYLTYPE *yyllocp, yyscan_t yyscanner)
@@ -2794,13 +2835,7 @@ read_sql_construct(int until,
      */
     plpgsql_append_source_text(&ds, startlocation, endlocation, yyscanner);

-    expr = palloc0(sizeof(PLpgSQL_expr));
-    expr->query = pstrdup(ds.data);
-    expr->parseMode = parsemode;
-    expr->plan = NULL;
-    expr->paramnos = NULL;
-    expr->target_param = -1;
-    expr->ns = plpgsql_ns_top();
+    expr = make_plpgsql_expr(ds.data, parsemode);
     pfree(ds.data);

     if (valid_sql)
@@ -3122,13 +3157,7 @@ make_execsql_stmt(int firsttoken, int location, PLword *word, YYSTYPE *yylvalp,
     while (ds.len > 0 && scanner_isspace(ds.data[ds.len - 1]))
         ds.data[--ds.len] = '\0';

-    expr = palloc0(sizeof(PLpgSQL_expr));
-    expr->query = pstrdup(ds.data);
-    expr->parseMode = RAW_PARSE_DEFAULT;
-    expr->plan = NULL;
-    expr->paramnos = NULL;
-    expr->target_param = -1;
-    expr->ns = plpgsql_ns_top();
+    expr = make_plpgsql_expr(ds.data, RAW_PARSE_DEFAULT);
     pfree(ds.data);

     check_sql_expr(expr->query, expr->parseMode, location, yyscanner);
@@ -4006,13 +4035,7 @@ read_cursor_args(PLpgSQL_var *cursor, int until, YYSTYPE *yylvalp, YYLTYPE *yyll
             appendStringInfoString(&ds, ", ");
     }

-    expr = palloc0(sizeof(PLpgSQL_expr));
-    expr->query = pstrdup(ds.data);
-    expr->parseMode = RAW_PARSE_PLPGSQL_EXPR;
-    expr->plan = NULL;
-    expr->paramnos = NULL;
-    expr->target_param = -1;
-    expr->ns = plpgsql_ns_top();
+    expr = make_plpgsql_expr(ds.data, RAW_PARSE_PLPGSQL_EXPR);
     pfree(ds.data);

     /* Next we'd better find the until token */
diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index 441df5354e..b0052167ee 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -219,14 +219,22 @@ typedef struct PLpgSQL_expr
 {
     char       *query;            /* query string, verbatim from function body */
     RawParseMode parseMode;        /* raw_parser() mode to use */
-    SPIPlanPtr    plan;            /* plan, or NULL if not made yet */
-    Bitmapset  *paramnos;        /* all dnos referenced by this query */
+    struct PLpgSQL_function *func;    /* function containing this expr */
+    struct PLpgSQL_nsitem *ns;    /* namespace chain visible to this expr */

-    /* function containing this expr (not set until we first parse query) */
-    struct PLpgSQL_function *func;
+    /*
+     * These fields are used to help optimize assignments to expanded-datum
+     * variables.  If this expression is the source of an assignment to a
+     * simple variable, target_param holds that variable's dno (else it's -1).
+     */
+    int            target_param;    /* dno of assign target, or -1 if none */

-    /* namespace chain visible to this expr */
-    struct PLpgSQL_nsitem *ns;
+    /*
+     * Fields above are set during plpgsql parsing.  Remaining fields are left
+     * as zeroes/NULLs until we first parse/plan the query.
+     */
+    SPIPlanPtr    plan;            /* plan, or NULL if not made yet */
+    Bitmapset  *paramnos;        /* all dnos referenced by this query */

     /* fields for "simple expression" fast-path execution: */
     Expr       *expr_simple_expr;    /* NULL means not a simple expr */
@@ -235,14 +243,11 @@ typedef struct PLpgSQL_expr
     bool        expr_simple_mutable;    /* true if simple expr is mutable */

     /*
-     * These fields are used to optimize assignments to expanded-datum
-     * variables.  If this expression is the source of an assignment to a
-     * simple variable, target_param holds that variable's dno; else it's -1.
-     * If we match a Param within expr_simple_expr to such a variable, that
-     * Param's address is stored in expr_rw_param; then expression code
-     * generation will allow the value for that Param to be passed read/write.
+     * If we match a Param within expr_simple_expr to the variable identified
+     * by target_param, that Param's address is stored in expr_rw_param; then
+     * expression code generation will allow the value for that Param to be
+     * passed as a read/write expanded-object pointer.
      */
-    int            target_param;    /* dno of assign target, or -1 if none */
     Param       *expr_rw_param;    /* read/write Param within expr, if any */

     /*
--
2.43.5

From 6f39d4631c211109904627accd64da2e3102bef2 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 2 Feb 2025 16:40:31 -0500
Subject: [PATCH v5 3/5] Detect whether plpgsql assignment targets are "local"
 variables.

Mark whether the target of a potentially optimizable assignment
is "local", in the sense of being declared inside any exception
block that could trap an error thrown from the assignment.
(This implies that we needn't preserve the variable's value
in case of an error.  This patch doesn't do anything with the
knowledge, but the next one will.)

Normally, this requires a post-parsing scan of the function's
parse tree, since we don't know while parsing a BEGIN ...
construct whether we will find EXCEPTION at its end.  However,
if there are no BEGIN ... EXCEPTION blocks in the function at
all, then all assignments are local, even those to variables
representing function arguments.  We optimize that common case
by initializing the target_is_local flags to "true", and fixing
them up with a post-scan only if we found EXCEPTION.

Note that variables' default-value expressions are never interesting
for expanded-variable optimization, since they couldn't contain a
reference to the target variable anyway.  But the code is set up
to compute their target_param and target_is_local correctly anyway,
for consistency and in case someone thinks of a use for that data.

I added a bit of plpgsql_dumptree support to help verify that
this code sets the flags as expected.  I'm not set on keeping
that, but I do want to keep the addition of a plpgsql_dumptree
call in plpgsql_compile_inline.  It's at best an oversight that
"#option dump" doesn't work in a DO block.

Discussion: https://postgr.es/m/CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com
---
 src/pl/plpgsql/src/pl_comp.c  | 12 +++++
 src/pl/plpgsql/src/pl_funcs.c | 88 +++++++++++++++++++++++++++++++++++
 src/pl/plpgsql/src/pl_gram.y  | 15 ++++++
 src/pl/plpgsql/src/plpgsql.h  |  7 ++-
 4 files changed, 121 insertions(+), 1 deletion(-)

diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index a2de0880fb..f36a244140 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -371,6 +371,7 @@ do_compile(FunctionCallInfo fcinfo,

     function->nstatements = 0;
     function->requires_procedure_resowner = false;
+    function->has_exception_block = false;

     /*
      * Initialize the compiler, particularly the namespace stack.  The
@@ -811,6 +812,9 @@ do_compile(FunctionCallInfo fcinfo,

     plpgsql_finish_datums(function);

+    if (function->has_exception_block)
+        plpgsql_mark_local_assignment_targets(function);
+
     /* Debug dump for completed functions */
     if (plpgsql_DumpExecTree)
         plpgsql_dumptree(function);
@@ -906,6 +910,7 @@ plpgsql_compile_inline(char *proc_source)

     function->nstatements = 0;
     function->requires_procedure_resowner = false;
+    function->has_exception_block = false;

     plpgsql_ns_init();
     plpgsql_ns_push(func_name, PLPGSQL_LABEL_BLOCK);
@@ -962,6 +967,13 @@ plpgsql_compile_inline(char *proc_source)

     plpgsql_finish_datums(function);

+    if (function->has_exception_block)
+        plpgsql_mark_local_assignment_targets(function);
+
+    /* Debug dump for completed functions */
+    if (plpgsql_DumpExecTree)
+        plpgsql_dumptree(function);
+
     /*
      * Pop the error context stack
      */
diff --git a/src/pl/plpgsql/src/pl_funcs.c b/src/pl/plpgsql/src/pl_funcs.c
index 88e25b54bc..6b5394fc5f 100644
--- a/src/pl/plpgsql/src/pl_funcs.c
+++ b/src/pl/plpgsql/src/pl_funcs.c
@@ -598,6 +598,91 @@ plpgsql_statement_tree_walker_impl(PLpgSQL_stmt *stmt,
 }


+/**********************************************************************
+ * Mark assignment source expressions that have local target variables,
+ * that is, the target variable is declared within the exception block
+ * most closely containing the assignment itself.  (Such target variables
+ * need not be preserved if the assignment's source expression raises an
+ * error, since the variable will no longer be accessible afterwards.
+ * Detecting this allows better optimization.)
+ *
+ * This code need not be called if the plpgsql function contains no exception
+ * blocks, because mark_expr_as_assignment_source will have set all the flags
+ * to true already.  Also, we need not reconsider default-value expressions
+ * for variables, because variable declarations are necessarily within the
+ * nearest exception block.  (In DECLARE ... BEGIN ... EXCEPTION ... END, the
+ * variable initializations are done before entering the exception scope.)
+ *
+ * Within the recursion, local_dnos is a Bitmapset of dnos of variables
+ * known to be declared within the current exception level.
+ **********************************************************************/
+static void mark_stmt(PLpgSQL_stmt *stmt, Bitmapset *local_dnos);
+static void mark_expr(PLpgSQL_expr *expr, Bitmapset *local_dnos);
+
+static void
+mark_stmt(PLpgSQL_stmt *stmt, Bitmapset *local_dnos)
+{
+    if (stmt == NULL)
+        return;
+    if (stmt->cmd_type == PLPGSQL_STMT_BLOCK)
+    {
+        PLpgSQL_stmt_block *block = (PLpgSQL_stmt_block *) stmt;
+
+        if (block->exceptions)
+        {
+            /*
+             * The block creates a new exception scope, so variables declared
+             * at outer levels are nonlocal.  For that matter, so are any
+             * variables declared in the block's DECLARE section.  Hence, we
+             * must pass down empty local_dnos.
+             */
+            plpgsql_statement_tree_walker(stmt, mark_stmt, mark_expr, NULL);
+        }
+        else
+        {
+            /*
+             * Otherwise, the block does not create a new exception scope, and
+             * any variables it declares can also be considered local within
+             * it.  Note that only initializable datum types (VAR, REC) are
+             * included in initvarnos; but that's sufficient for our purposes.
+             */
+            local_dnos = bms_copy(local_dnos);
+            for (int i = 0; i < block->n_initvars; i++)
+                local_dnos = bms_add_member(local_dnos, block->initvarnos[i]);
+            plpgsql_statement_tree_walker(stmt, mark_stmt, mark_expr,
+                                          local_dnos);
+            bms_free(local_dnos);
+        }
+    }
+    else
+        plpgsql_statement_tree_walker(stmt, mark_stmt, mark_expr, local_dnos);
+}
+
+static void
+mark_expr(PLpgSQL_expr *expr, Bitmapset *local_dnos)
+{
+    /*
+     * If this expression has an assignment target, check whether the target
+     * is local, and mark the expression accordingly.
+     */
+    if (expr && expr->target_param >= 0)
+        expr->target_is_local = bms_is_member(expr->target_param, local_dnos);
+}
+
+void
+plpgsql_mark_local_assignment_targets(PLpgSQL_function *func)
+{
+    Bitmapset  *local_dnos;
+
+    /* Function parameters can be treated as local targets at outer level */
+    local_dnos = NULL;
+    for (int i = 0; i < func->fn_nargs; i++)
+        local_dnos = bms_add_member(local_dnos, func->fn_argvarnos[i]);
+    mark_stmt((PLpgSQL_stmt *) func->action, local_dnos);
+    bms_free(local_dnos);
+}
+
+
 /**********************************************************************
  * Release memory when a PL/pgSQL function is no longer needed
  *
@@ -1500,6 +1585,9 @@ static void
 dump_expr(PLpgSQL_expr *expr)
 {
     printf("'%s'", expr->query);
+    if (expr->target_param >= 0)
+        printf(" target %d%s", expr->target_param,
+               expr->target_is_local ? " (local)" : "");
 }

 void
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index f55aefb100..8048e040f8 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -2328,6 +2328,8 @@ exception_sect    :
                         PLpgSQL_exception_block *new = palloc(sizeof(PLpgSQL_exception_block));
                         PLpgSQL_variable *var;

+                        plpgsql_curr_compile->has_exception_block = true;
+
                         var = plpgsql_build_variable("sqlstate", lineno,
                                                      plpgsql_build_datatype(TEXTOID,
                                                                             -1,
@@ -2673,6 +2675,7 @@ make_plpgsql_expr(const char *query,
     expr->ns = plpgsql_ns_top();
     /* might get changed later during parsing: */
     expr->target_param = -1;
+    expr->target_is_local = false;
     /* other fields are left as zeroes until first execution */
     return expr;
 }
@@ -2687,9 +2690,21 @@ mark_expr_as_assignment_source(PLpgSQL_expr *expr, PLpgSQL_datum *target)
      * other DTYPEs, so no need to mark in other cases.
      */
     if (target->dtype == PLPGSQL_DTYPE_VAR)
+    {
         expr->target_param = target->dno;
+
+        /*
+         * For now, assume the target is local to the nearest enclosing
+         * exception block.  That's correct if the function contains no
+         * exception blocks; otherwise we'll update this later.
+         */
+        expr->target_is_local = true;
+    }
     else
+    {
         expr->target_param = -1;    /* should be that already */
+        expr->target_is_local = false; /* ditto */
+    }
 }

 /* Convenience routine to read an expression with one possible terminator */
diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index b0052167ee..2fa6d73cab 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -225,9 +225,12 @@ typedef struct PLpgSQL_expr
     /*
      * These fields are used to help optimize assignments to expanded-datum
      * variables.  If this expression is the source of an assignment to a
-     * simple variable, target_param holds that variable's dno (else it's -1).
+     * simple variable, target_param holds that variable's dno (else it's -1),
+     * and target_is_local indicates whether the target is declared inside the
+     * closest exception block containing the assignment.
      */
     int            target_param;    /* dno of assign target, or -1 if none */
+    bool        target_is_local;    /* is it within nearest exception block? */

     /*
      * Fields above are set during plpgsql parsing.  Remaining fields are left
@@ -1014,6 +1017,7 @@ typedef struct PLpgSQL_function
     /* data derived while parsing body */
     unsigned int nstatements;    /* counter for assigning stmtids */
     bool        requires_procedure_resowner;    /* contains CALL or DO? */
+    bool        has_exception_block;    /* contains BEGIN...EXCEPTION? */

     /* these fields change when the function is used */
     struct PLpgSQL_execstate *cur_estate;
@@ -1312,6 +1316,7 @@ extern PLpgSQL_nsitem *plpgsql_ns_find_nearest_loop(PLpgSQL_nsitem *ns_cur);
  */
 extern PGDLLEXPORT const char *plpgsql_stmt_typename(PLpgSQL_stmt *stmt);
 extern const char *plpgsql_getdiag_kindname(PLpgSQL_getdiag_kind kind);
+extern void plpgsql_mark_local_assignment_targets(PLpgSQL_function *func);
 extern void plpgsql_free_function_memory(PLpgSQL_function *func);
 extern void plpgsql_dumptree(PLpgSQL_function *func);

--
2.43.5

From eb5f318fb43e61c958a35802c214820f7ff7711d Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 2 Feb 2025 16:41:45 -0500
Subject: [PATCH v5 4/5] Implement new optimization rule for updates of
 expanded variables.

If a read/write expanded variable is declared locally to the
assignment statement that is updating it, and it is referenced
exactly once in the assignment RHS, then we can optimize the
operation as a direct update of the expanded value, whether
or not the function(s) operating on it can be trusted not to
modify the value before throwing an error.  This works because
if an error does get thrown, we no longer care what value the
variable has.

In cases where that doesn't work, fall back to the previous
rule that checks for safety of the top-level function.

In any case, postpone determination of whether these optimizations
are feasible until we are executing a Param referencing the target
variable and that variable holds a R/W expanded object.  While the
previous incarnation of exec_check_rw_parameter was pretty cheap,
this is a bit less so, and our plan to invoke support functions
will make it even less so.  So avoiding the check for variables
where it couldn't be useful should be a win.

Discussion: https://postgr.es/m/CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com
---
 src/include/executor/execExpr.h               |   1 +
 src/pl/plpgsql/src/expected/plpgsql_array.out |   9 +
 src/pl/plpgsql/src/pl_exec.c                  | 383 +++++++++++++++---
 src/pl/plpgsql/src/plpgsql.h                  |  22 +-
 src/pl/plpgsql/src/sql/plpgsql_array.sql      |   9 +
 src/tools/pgindent/typedefs.list              |   2 +
 6 files changed, 364 insertions(+), 62 deletions(-)

diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index 51bd35dcb0..191d8fe34d 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -425,6 +425,7 @@ typedef struct ExprEvalStep
         {
             ExecEvalSubroutine paramfunc;    /* add-on evaluation subroutine */
             void       *paramarg;    /* private data for same */
+            void       *paramarg2;    /* more private data for same */
             int            paramid;    /* numeric ID for parameter */
             Oid            paramtype;    /* OID of parameter's datatype */
         }            cparam;
diff --git a/src/pl/plpgsql/src/expected/plpgsql_array.out b/src/pl/plpgsql/src/expected/plpgsql_array.out
index ad60e0e8be..e5db6d6087 100644
--- a/src/pl/plpgsql/src/expected/plpgsql_array.out
+++ b/src/pl/plpgsql/src/expected/plpgsql_array.out
@@ -52,6 +52,15 @@ NOTICE:  a = ("{""(,11)""}",), a.c1[1].i = 11
 do $$ declare a int[];
 begin a := array_agg(x) from (values(1),(2),(3)) v(x); raise notice 'a = %', a; end$$;
 NOTICE:  a = {1,2,3}
+do $$ declare a int[] := array[1,2,3];
+begin
+  -- test scenarios for optimization of updates of R/W expanded objects
+  a := array_append(a, 42);  -- optimizable using "transfer" method
+  a := a || a[3];  -- optimizable using "inplace" method
+  a := a || a;     -- not optimizable
+  raise notice 'a = %', a;
+end$$;
+NOTICE:  a = {1,2,3,42,3,1,2,3,42,3}
 create temp table onecol as select array[1,2] as f1;
 do $$ declare a int[];
 begin a := f1 from onecol; raise notice 'a = %', a; end$$;
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index fec1811ae1..28b6c85d8d 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -251,6 +251,15 @@ static HTAB *shared_cast_hash = NULL;
     else \
         Assert(rc == PLPGSQL_RC_OK)

+/* State struct for count_param_references */
+typedef struct count_param_references_context
+{
+    int            paramid;
+    int            count;
+    Param       *last_param;
+} count_param_references_context;
+
+
 /************************************************************
  * Local function forward declarations
  ************************************************************/
@@ -336,7 +345,9 @@ static void exec_prepare_plan(PLpgSQL_execstate *estate,
 static void exec_simple_check_plan(PLpgSQL_execstate *estate, PLpgSQL_expr *expr);
 static bool exec_is_simple_query(PLpgSQL_expr *expr);
 static void exec_save_simple_expr(PLpgSQL_expr *expr, CachedPlan *cplan);
-static void exec_check_rw_parameter(PLpgSQL_expr *expr);
+static void exec_check_rw_parameter(PLpgSQL_expr *expr, int paramid);
+static bool count_param_references(Node *node,
+                                   count_param_references_context *context);
 static void exec_check_assignable(PLpgSQL_execstate *estate, int dno);
 static bool exec_eval_simple_expr(PLpgSQL_execstate *estate,
                                   PLpgSQL_expr *expr,
@@ -384,6 +395,10 @@ static ParamExternData *plpgsql_param_fetch(ParamListInfo params,
 static void plpgsql_param_compile(ParamListInfo params, Param *param,
                                   ExprState *state,
                                   Datum *resv, bool *resnull);
+static void plpgsql_param_eval_var_check(ExprState *state, ExprEvalStep *op,
+                                         ExprContext *econtext);
+static void plpgsql_param_eval_var_transfer(ExprState *state, ExprEvalStep *op,
+                                            ExprContext *econtext);
 static void plpgsql_param_eval_var(ExprState *state, ExprEvalStep *op,
                                    ExprContext *econtext);
 static void plpgsql_param_eval_var_ro(ExprState *state, ExprEvalStep *op,
@@ -6078,10 +6093,13 @@ exec_eval_simple_expr(PLpgSQL_execstate *estate,

         /*
          * Reset to "not simple" to leave sane state (with no dangling
-         * pointers) in case we fail while replanning.  expr_simple_plansource
-         * can be left alone however, as that cannot move.
+         * pointers) in case we fail while replanning.  We'll need to
+         * re-determine simplicity and R/W optimizability anyway, since those
+         * could change with the new plan.  expr_simple_plansource can be left
+         * alone however, as that cannot move.
          */
         expr->expr_simple_expr = NULL;
+        expr->expr_rwopt = PLPGSQL_RWOPT_UNKNOWN;
         expr->expr_rw_param = NULL;
         expr->expr_simple_plan = NULL;
         expr->expr_simple_plan_lxid = InvalidLocalTransactionId;
@@ -6439,16 +6457,27 @@ plpgsql_param_compile(ParamListInfo params, Param *param,
     scratch.resnull = resnull;

     /*
-     * Select appropriate eval function.  It seems worth special-casing
-     * DTYPE_VAR and DTYPE_RECFIELD for performance.  Also, we can determine
-     * in advance whether MakeExpandedObjectReadOnly() will be required.
-     * Currently, only VAR/PROMISE and REC datums could contain read/write
-     * expanded objects.
+     * Select appropriate eval function.
+     *
+     * First, if this Param references the same varlena-type DTYPE_VAR datum
+     * that is the target of the assignment containing this simple expression,
+     * then it's possible we will be able to optimize handling of R/W expanded
+     * datums.  We don't want to do the work needed to determine that unless
+     * we actually see a R/W expanded datum at runtime, so install a checking
+     * function that will figure that out when needed.
+     *
+     * Otherwise, it seems worth special-casing DTYPE_VAR and DTYPE_RECFIELD
+     * for performance.  Also, we can determine in advance whether
+     * MakeExpandedObjectReadOnly() will be required.  Currently, only
+     * VAR/PROMISE and REC datums could contain read/write expanded objects.
      */
     if (datum->dtype == PLPGSQL_DTYPE_VAR)
     {
-        if (param != expr->expr_rw_param &&
-            ((PLpgSQL_var *) datum)->datatype->typlen == -1)
+        bool        isvarlena = (((PLpgSQL_var *) datum)->datatype->typlen == -1);
+
+        if (isvarlena && dno == expr->target_param && expr->expr_simple_expr)
+            scratch.d.cparam.paramfunc = plpgsql_param_eval_var_check;
+        else if (isvarlena)
             scratch.d.cparam.paramfunc = plpgsql_param_eval_var_ro;
         else
             scratch.d.cparam.paramfunc = plpgsql_param_eval_var;
@@ -6457,14 +6486,12 @@ plpgsql_param_compile(ParamListInfo params, Param *param,
         scratch.d.cparam.paramfunc = plpgsql_param_eval_recfield;
     else if (datum->dtype == PLPGSQL_DTYPE_PROMISE)
     {
-        if (param != expr->expr_rw_param &&
-            ((PLpgSQL_var *) datum)->datatype->typlen == -1)
+        if (((PLpgSQL_var *) datum)->datatype->typlen == -1)
             scratch.d.cparam.paramfunc = plpgsql_param_eval_generic_ro;
         else
             scratch.d.cparam.paramfunc = plpgsql_param_eval_generic;
     }
-    else if (datum->dtype == PLPGSQL_DTYPE_REC &&
-             param != expr->expr_rw_param)
+    else if (datum->dtype == PLPGSQL_DTYPE_REC)
         scratch.d.cparam.paramfunc = plpgsql_param_eval_generic_ro;
     else
         scratch.d.cparam.paramfunc = plpgsql_param_eval_generic;
@@ -6473,14 +6500,177 @@ plpgsql_param_compile(ParamListInfo params, Param *param,
      * Note: it's tempting to use paramarg to store the estate pointer and
      * thereby save an indirection or two in the eval functions.  But that
      * doesn't work because the compiled expression might be used with
-     * different estates for the same PL/pgSQL function.
+     * different estates for the same PL/pgSQL function.  Instead, store
+     * pointers to the PLpgSQL_expr as well as this specific Param, to support
+     * plpgsql_param_eval_var_check().
      */
-    scratch.d.cparam.paramarg = NULL;
+    scratch.d.cparam.paramarg = expr;
+    scratch.d.cparam.paramarg2 = param;
     scratch.d.cparam.paramid = param->paramid;
     scratch.d.cparam.paramtype = param->paramtype;
     ExprEvalPushStep(state, &scratch);
 }

+/*
+ * plpgsql_param_eval_var_check        evaluation of EEOP_PARAM_CALLBACK step
+ *
+ * This is specialized to the case of DTYPE_VAR variables for which
+ * we may need to determine the applicability of a read/write optimization,
+ * but we've not done that yet.  The work to determine applicability will
+ * be done at most once (per construction of the PL/pgSQL function's cache
+ * entry) when we first see that the target variable's old value is a R/W
+ * expanded object.  If we never do see that, nothing is lost: the amount
+ * of work done by this function in that case is just about the same as
+ * what would be done by plpgsql_param_eval_var_ro, which is what we'd
+ * have used otherwise.
+ */
+static void
+plpgsql_param_eval_var_check(ExprState *state, ExprEvalStep *op,
+                             ExprContext *econtext)
+{
+    ParamListInfo params;
+    PLpgSQL_execstate *estate;
+    int            dno = op->d.cparam.paramid - 1;
+    PLpgSQL_var *var;
+
+    /* fetch back the hook data */
+    params = econtext->ecxt_param_list_info;
+    estate = (PLpgSQL_execstate *) params->paramFetchArg;
+    Assert(dno >= 0 && dno < estate->ndatums);
+
+    /* now we can access the target datum */
+    var = (PLpgSQL_var *) estate->datums[dno];
+    Assert(var->dtype == PLPGSQL_DTYPE_VAR);
+
+    /*
+     * If the variable's current value is a R/W expanded object, it's time to
+     * decide whether/how to optimize the assignment.
+     */
+    if (!var->isnull &&
+        VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(var->value)))
+    {
+        PLpgSQL_expr *expr = (PLpgSQL_expr *) op->d.cparam.paramarg;
+        Param       *param = (Param *) op->d.cparam.paramarg2;
+
+        /*
+         * We might have already figured this out while evaluating some other
+         * Param referencing the same variable, so check expr_rwopt first.
+         */
+        if (expr->expr_rwopt == PLPGSQL_RWOPT_UNKNOWN)
+            exec_check_rw_parameter(expr, op->d.cparam.paramid);
+
+        /*
+         * Update the callback pointer to match what we decided to do, so that
+         * this function will not be called again.  Then pass off this
+         * execution to the newly-selected function.
+         */
+        switch (expr->expr_rwopt)
+        {
+            case PLPGSQL_RWOPT_UNKNOWN:
+                Assert(false);
+                break;
+            case PLPGSQL_RWOPT_NOPE:
+                /* Force the value to read-only in all future executions */
+                op->d.cparam.paramfunc = plpgsql_param_eval_var_ro;
+                plpgsql_param_eval_var_ro(state, op, econtext);
+                break;
+            case PLPGSQL_RWOPT_TRANSFER:
+                /* There can be only one matching Param in this case */
+                Assert(param == expr->expr_rw_param);
+                /* When the value is read/write, transfer to exec context */
+                op->d.cparam.paramfunc = plpgsql_param_eval_var_transfer;
+                plpgsql_param_eval_var_transfer(state, op, econtext);
+                break;
+            case PLPGSQL_RWOPT_INPLACE:
+                if (param == expr->expr_rw_param)
+                {
+                    /* When the value is read/write, deliver it as-is */
+                    op->d.cparam.paramfunc = plpgsql_param_eval_var;
+                    plpgsql_param_eval_var(state, op, econtext);
+                }
+                else
+                {
+                    /* Not the optimizable reference, so force to read-only */
+                    op->d.cparam.paramfunc = plpgsql_param_eval_var_ro;
+                    plpgsql_param_eval_var_ro(state, op, econtext);
+                }
+                break;
+        }
+        return;
+    }
+
+    /*
+     * Otherwise, continue to postpone that decision, and execute an inlined
+     * version of exec_eval_datum().  Although this value could potentially
+     * need MakeExpandedObjectReadOnly, we know it doesn't right now.
+     */
+    *op->resvalue = var->value;
+    *op->resnull = var->isnull;
+
+    /* safety check -- an assertion should be sufficient */
+    Assert(var->datatype->typoid == op->d.cparam.paramtype);
+}
+
+/*
+ * plpgsql_param_eval_var_transfer        evaluation of EEOP_PARAM_CALLBACK step
+ *
+ * This is specialized to the case of DTYPE_VAR variables for which
+ * we have determined that a read/write expanded value can be handed off
+ * into execution of the expression (and then possibly returned to our
+ * function's ownership afterwards).  We have to test though, because the
+ * variable might not contain a read/write expanded value during this
+ * execution.
+ */
+static void
+plpgsql_param_eval_var_transfer(ExprState *state, ExprEvalStep *op,
+                                ExprContext *econtext)
+{
+    ParamListInfo params;
+    PLpgSQL_execstate *estate;
+    int            dno = op->d.cparam.paramid - 1;
+    PLpgSQL_var *var;
+
+    /* fetch back the hook data */
+    params = econtext->ecxt_param_list_info;
+    estate = (PLpgSQL_execstate *) params->paramFetchArg;
+    Assert(dno >= 0 && dno < estate->ndatums);
+
+    /* now we can access the target datum */
+    var = (PLpgSQL_var *) estate->datums[dno];
+    Assert(var->dtype == PLPGSQL_DTYPE_VAR);
+
+    /*
+     * If the variable's current value is a R/W expanded object, transfer its
+     * ownership into the expression execution context, then drop our own
+     * reference to the value by setting the variable to NULL.  That'll be
+     * overwritten (perhaps with this same object) when control comes back
+     * from the expression.
+     */
+    if (!var->isnull &&
+        VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(var->value)))
+    {
+        *op->resvalue = TransferExpandedObject(var->value,
+                                               get_eval_mcontext(estate));
+        *op->resnull = false;
+
+        var->value = (Datum) 0;
+        var->isnull = true;
+        var->freeval = false;
+    }
+    else
+    {
+        /*
+         * Otherwise we can pass the variable's value directly; we now know
+         * that MakeExpandedObjectReadOnly isn't needed.
+         */
+        *op->resvalue = var->value;
+        *op->resnull = var->isnull;
+    }
+
+    /* safety check -- an assertion should be sufficient */
+    Assert(var->datatype->typoid == op->d.cparam.paramtype);
+}
+
 /*
  * plpgsql_param_eval_var        evaluation of EEOP_PARAM_CALLBACK step
  *
@@ -7957,9 +8147,10 @@ exec_simple_check_plan(PLpgSQL_execstate *estate, PLpgSQL_expr *expr)
     MemoryContext oldcontext;

     /*
-     * Initialize to "not simple".
+     * Initialize to "not simple", and reset R/W optimizability.
      */
     expr->expr_simple_expr = NULL;
+    expr->expr_rwopt = PLPGSQL_RWOPT_UNKNOWN;
     expr->expr_rw_param = NULL;

     /*
@@ -8164,88 +8355,133 @@ exec_save_simple_expr(PLpgSQL_expr *expr, CachedPlan *cplan)
     expr->expr_simple_typmod = exprTypmod((Node *) tle_expr);
     /* We also want to remember if it is immutable or not */
     expr->expr_simple_mutable = contain_mutable_functions((Node *) tle_expr);
-
-    /*
-     * Lastly, check to see if there's a possibility of optimizing a
-     * read/write parameter.
-     */
-    exec_check_rw_parameter(expr);
 }

 /*
  * exec_check_rw_parameter --- can we pass expanded object as read/write param?
  *
- * If we have an assignment like "x := array_append(x, foo)" in which the
+ * There are two separate cases in which we can optimize an update to a
+ * variable that has a read/write expanded value by letting the called
+ * expression operate directly on the expanded value.  In both cases we
+ * are considering assignments like "var := array_append(var, foo)" where
+ * the assignment target is also an input to the RHS expression.
+ *
+ * Case 1 (RWOPT_TRANSFER rule): if the variable is "local" in the sense that
+ * its declaration is not outside any BEGIN...EXCEPTION block surrounding the
+ * assignment, then we do not need to worry about preserving its value if the
+ * RHS expression throws an error.  If in addition the variable is referenced
+ * exactly once in the RHS expression, then we can optimize by converting the
+ * read/write expanded value into a transient value within the expression
+ * evaluation context, and then setting the variable's recorded value to NULL
+ * to prevent double-free attempts.  This works regardless of any other
+ * details of the RHS expression.  If the expression eventually returns that
+ * same expanded object (possibly modified) then the variable will re-acquire
+ * ownership; while if it returns something else or throws an error, the
+ * expanded object will be discarded as part of cleanup of the evaluation
+ * context.
+ *
+ * Case 2 (RWOPT_INPLACE rule): if we have a non-local assignment or if
+ * it looks like "var := array_append(var, var[1])" with multiple references
+ * to the target variable, then we can't use case 1.  Nonetheless, if the
  * top-level function is trusted not to corrupt its argument in case of an
- * error, then when x has an expanded object as value, it is safe to pass the
- * value as a read/write pointer and let the function modify the value
- * in-place.
+ * error, then when the var has an expanded object as value, it is safe to
+ * pass the value as a read/write pointer to the top-level function and let
+ * the function modify the value in-place.  (Any other references have to be
+ * passed as read-only pointers as usual.)  Only the top-level function has to
+ * be trusted, since if anything further down fails, the object hasn't been
+ * modified yet.
  *
- * This function checks for a safe expression, and sets expr->expr_rw_param
- * to the address of any Param within the expression that can be passed as
- * read/write (there can be only one); or to NULL when there is no safe Param.
+ * This function checks to see if the assignment is optimizable according
+ * to either rule, and updates expr->expr_rwopt accordingly.  In addition,
+ * it sets expr->expr_rw_param to the address of the Param within the
+ * expression that can be passed as read/write (there can be only one);
+ * or to NULL when there is no safe Param.
  *
- * Note that this mechanism intentionally applies the safety labeling to just
- * one Param; the expression could contain other Params referencing the target
- * variable, but those must still be treated as read-only.
+ * Note that this mechanism intentionally allows just one Param to emit a
+ * read/write pointer; in case 2, the expression could contain other Params
+ * referencing the target variable, but those must be treated as read-only.
  *
  * Also note that we only apply this optimization within simple expressions.
  * There's no point in it for non-simple expressions, because the
  * exec_run_select code path will flatten any expanded result anyway.
- * Also, it's safe to assume that an expr_simple_expr tree won't get copied
- * somewhere before it gets compiled, so that looking for pointer equality
- * to expr_rw_param will work for matching the target Param.  That'd be much
- * shakier in the general case.
  */
 static void
-exec_check_rw_parameter(PLpgSQL_expr *expr)
+exec_check_rw_parameter(PLpgSQL_expr *expr, int paramid)
 {
-    int            target_dno;
+    Expr       *sexpr = expr->expr_simple_expr;
     Oid            funcid;
     List       *fargs;
     ListCell   *lc;

     /* Assume unsafe */
+    expr->expr_rwopt = PLPGSQL_RWOPT_NOPE;
     expr->expr_rw_param = NULL;

-    /* Done if expression isn't an assignment source */
-    target_dno = expr->target_param;
-    if (target_dno < 0)
-        return;
+    /* Shouldn't be here for non-simple expression */
+    Assert(sexpr != NULL);
+
+    /* Param should match the expression's assignment target, too */
+    Assert(paramid == expr->target_param + 1);

     /*
-     * If target variable isn't referenced by expression, no need to look
-     * further.
+     * If the assignment is to a "local" variable (one whose value won't
+     * matter anymore if expression evaluation fails), and this Param is the
+     * only reference to that variable in the expression, then we can
+     * unconditionally optimize using the "transfer" method.
      */
-    if (!bms_is_member(target_dno, expr->paramnos))
-        return;
+    if (expr->target_is_local)
+    {
+        count_param_references_context context;

-    /* Shouldn't be here for non-simple expression */
-    Assert(expr->expr_simple_expr != NULL);
+        /* See how many references there are, and find one of them */
+        context.paramid = paramid;
+        context.count = 0;
+        context.last_param = NULL;
+        (void) count_param_references((Node *) sexpr, &context);
+
+        /* If we're here, the expr must contain some reference to the var */
+        Assert(context.count > 0);
+
+        /* If exactly one reference, success! */
+        if (context.count == 1)
+        {
+            expr->expr_rwopt = PLPGSQL_RWOPT_TRANSFER;
+            expr->expr_rw_param = context.last_param;
+            return;
+        }
+    }

     /*
+     * Otherwise, see if we can trust the expression's top-level function to
+     * apply the "inplace" method.
+     *
      * Top level of expression must be a simple FuncExpr, OpExpr, or
-     * SubscriptingRef, else we can't optimize.
+     * SubscriptingRef, else we can't identify which function is relevant. But
+     * it's okay to look through any RelabelType above that, since that can't
+     * fail.
      */
-    if (IsA(expr->expr_simple_expr, FuncExpr))
+    if (IsA(sexpr, RelabelType))
+        sexpr = ((RelabelType *) sexpr)->arg;
+    if (IsA(sexpr, FuncExpr))
     {
-        FuncExpr   *fexpr = (FuncExpr *) expr->expr_simple_expr;
+        FuncExpr   *fexpr = (FuncExpr *) sexpr;

         funcid = fexpr->funcid;
         fargs = fexpr->args;
     }
-    else if (IsA(expr->expr_simple_expr, OpExpr))
+    else if (IsA(sexpr, OpExpr))
     {
-        OpExpr       *opexpr = (OpExpr *) expr->expr_simple_expr;
+        OpExpr       *opexpr = (OpExpr *) sexpr;

         funcid = opexpr->opfuncid;
         fargs = opexpr->args;
     }
-    else if (IsA(expr->expr_simple_expr, SubscriptingRef))
+    else if (IsA(sexpr, SubscriptingRef))
     {
-        SubscriptingRef *sbsref = (SubscriptingRef *) expr->expr_simple_expr;
+        SubscriptingRef *sbsref = (SubscriptingRef *) sexpr;

         /* We only trust standard varlena arrays to be safe */
+        /* TODO: install some extensibility here */
         if (get_typsubscript(sbsref->refcontainertype, NULL) !=
             F_ARRAY_SUBSCRIPT_HANDLER)
             return;
@@ -8256,9 +8492,10 @@ exec_check_rw_parameter(PLpgSQL_expr *expr)
             Param       *param = (Param *) sbsref->refexpr;

             if (param->paramkind == PARAM_EXTERN &&
-                param->paramid == target_dno + 1)
+                param->paramid == paramid)
             {
                 /* Found the Param we want to pass as read/write */
+                expr->expr_rwopt = PLPGSQL_RWOPT_INPLACE;
                 expr->expr_rw_param = param;
                 return;
             }
@@ -8293,9 +8530,10 @@ exec_check_rw_parameter(PLpgSQL_expr *expr)
             Param       *param = (Param *) arg;

             if (param->paramkind == PARAM_EXTERN &&
-                param->paramid == target_dno + 1)
+                param->paramid == paramid)
             {
                 /* Found the Param we want to pass as read/write */
+                expr->expr_rwopt = PLPGSQL_RWOPT_INPLACE;
                 expr->expr_rw_param = param;
                 return;
             }
@@ -8303,6 +8541,35 @@ exec_check_rw_parameter(PLpgSQL_expr *expr)
     }
 }

+/*
+ * Count Params referencing the specified paramid, and return one of them
+ * if there are any.
+ *
+ * We actually only need to distinguish 0, 1, and N references; so we can
+ * abort the tree traversal as soon as we've found two.
+ */
+static bool
+count_param_references(Node *node, count_param_references_context *context)
+{
+    if (node == NULL)
+        return false;
+    else if (IsA(node, Param))
+    {
+        Param       *param = (Param *) node;
+
+        if (param->paramkind == PARAM_EXTERN &&
+            param->paramid == context->paramid)
+        {
+            context->last_param = param;
+            if (++(context->count) > 1)
+                return true;    /* abort tree traversal */
+        }
+        return false;
+    }
+    else
+        return expression_tree_walker(node, count_param_references, context);
+}
+
 /*
  * exec_check_assignable --- is it OK to assign to the indicated datum?
  *
diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index 2fa6d73cab..d73996e09c 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -187,6 +187,17 @@ typedef enum PLpgSQL_resolve_option
     PLPGSQL_RESOLVE_COLUMN,        /* prefer table column to plpgsql var */
 } PLpgSQL_resolve_option;

+/*
+ * Status of optimization of assignment to a read/write expanded object
+ */
+typedef enum PLpgSQL_rwopt
+{
+    PLPGSQL_RWOPT_UNKNOWN = 0,    /* applicability not determined yet */
+    PLPGSQL_RWOPT_NOPE,            /* cannot do any optimization */
+    PLPGSQL_RWOPT_TRANSFER,        /* transfer the old value into expr state */
+    PLPGSQL_RWOPT_INPLACE,        /* pass value as R/W to top-level function */
+} PLpgSQL_rwopt;
+

 /**********************************************************************
  * Node and structure definitions
@@ -246,11 +257,14 @@ typedef struct PLpgSQL_expr
     bool        expr_simple_mutable;    /* true if simple expr is mutable */

     /*
-     * If we match a Param within expr_simple_expr to the variable identified
-     * by target_param, that Param's address is stored in expr_rw_param; then
-     * expression code generation will allow the value for that Param to be
-     * passed as a read/write expanded-object pointer.
+     * expr_rwopt tracks whether we have determined that assignment to a
+     * read/write expanded object (stored in the target_param datum) can be
+     * optimized by passing it to the expr as a read/write expanded-object
+     * pointer.  If so, expr_rw_param identifies the specific Param that
+     * should emit a read/write pointer; any others will emit read-only
+     * pointers.
      */
+    PLpgSQL_rwopt expr_rwopt;    /* can we apply R/W optimization? */
     Param       *expr_rw_param;    /* read/write Param within expr, if any */

     /*
diff --git a/src/pl/plpgsql/src/sql/plpgsql_array.sql b/src/pl/plpgsql/src/sql/plpgsql_array.sql
index 4b9ff51594..4a346203dc 100644
--- a/src/pl/plpgsql/src/sql/plpgsql_array.sql
+++ b/src/pl/plpgsql/src/sql/plpgsql_array.sql
@@ -48,6 +48,15 @@ begin a.c1[1].i := 11; raise notice 'a = %, a.c1[1].i = %', a, a.c1[1].i; end$$;
 do $$ declare a int[];
 begin a := array_agg(x) from (values(1),(2),(3)) v(x); raise notice 'a = %', a; end$$;

+do $$ declare a int[] := array[1,2,3];
+begin
+  -- test scenarios for optimization of updates of R/W expanded objects
+  a := array_append(a, 42);  -- optimizable using "transfer" method
+  a := a || a[3];  -- optimizable using "inplace" method
+  a := a || a;     -- not optimizable
+  raise notice 'a = %', a;
+end$$;
+
 create temp table onecol as select array[1,2] as f1;

 do $$ declare a int[];
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9a3bee93de..4d4bf62b6e 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1873,6 +1873,7 @@ PLpgSQL_rec
 PLpgSQL_recfield
 PLpgSQL_resolve_option
 PLpgSQL_row
+PLpgSQL_rwopt
 PLpgSQL_stmt
 PLpgSQL_stmt_assert
 PLpgSQL_stmt_assign
@@ -3414,6 +3415,7 @@ core_yy_extra_type
 core_yyscan_t
 corrupt_items
 cost_qual_eval_context
+count_param_references_context
 cp_hash_func
 create_upper_paths_hook_type
 createdb_failure_params
--
2.43.5

From 02955d5e539888f0da1b92ce273f9ea049361d4a Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 2 Feb 2025 16:42:12 -0500
Subject: [PATCH v5 5/5] Allow extension functions to participate in in-place
 updates.

Commit 1dc5ebc90 allowed PL/pgSQL to perform in-place updates
of expanded-object variables that are being updated with
assignments like "x := f(x, ...)".  However this was allowed
only for a hard-wired list of functions f(), since we need to
be sure that f() will not modify the variable if it fails.
It was always envisioned that we should make that extensible,
but at the time we didn't have a good way to do so.  Since
then we've invented the idea of "support functions" to allow
attaching specialized optimization knowledge to functions,
and that is a perfect mechanism for doing this.

Hence, adjust PL/pgSQL to use a support function request instead
of hard-wired logic to decide if in-place update is safe.
Preserve the previous optimizations by creating support functions
for the three functions that were previously hard-wired.

Discussion: https://postgr.es/m/CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com
---
 src/backend/utils/adt/array_userfuncs.c       | 61 +++++++++++++
 src/backend/utils/adt/arraysubs.c             | 34 ++++++++
 src/include/catalog/pg_proc.dat               | 20 +++--
 src/include/nodes/supportnodes.h              | 55 +++++++++++-
 src/pl/plpgsql/src/expected/plpgsql_array.out |  3 +-
 src/pl/plpgsql/src/pl_exec.c                  | 86 ++++++++-----------
 src/pl/plpgsql/src/sql/plpgsql_array.sql      |  1 +
 src/tools/pgindent/typedefs.list              |  1 +
 8 files changed, 202 insertions(+), 59 deletions(-)

diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
index 0b02fe3744..2aae2f8ed9 100644
--- a/src/backend/utils/adt/array_userfuncs.c
+++ b/src/backend/utils/adt/array_userfuncs.c
@@ -16,6 +16,7 @@
 #include "common/int.h"
 #include "common/pg_prng.h"
 #include "libpq/pqformat.h"
+#include "nodes/supportnodes.h"
 #include "port/pg_bitutils.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
@@ -167,6 +168,36 @@ array_append(PG_FUNCTION_ARGS)
     PG_RETURN_DATUM(result);
 }

+/*
+ * array_append_support()
+ *
+ * Planner support function for array_append()
+ */
+Datum
+array_append_support(PG_FUNCTION_ARGS)
+{
+    Node       *rawreq = (Node *) PG_GETARG_POINTER(0);
+    Node       *ret = NULL;
+
+    if (IsA(rawreq, SupportRequestModifyInPlace))
+    {
+        /*
+         * We can optimize in-place appends if the function's array argument
+         * is the array being assigned to.  We don't need to worry about array
+         * references within the other argument.
+         */
+        SupportRequestModifyInPlace *req = (SupportRequestModifyInPlace *) rawreq;
+        Param       *arg = (Param *) linitial(req->args);
+
+        if (arg && IsA(arg, Param) &&
+            arg->paramkind == PARAM_EXTERN &&
+            arg->paramid == req->paramid)
+            ret = (Node *) arg;
+    }
+
+    PG_RETURN_POINTER(ret);
+}
+
 /*-----------------------------------------------------------------------------
  * array_prepend :
  *        push an element onto the front of a one-dimensional array
@@ -230,6 +261,36 @@ array_prepend(PG_FUNCTION_ARGS)
     PG_RETURN_DATUM(result);
 }

+/*
+ * array_prepend_support()
+ *
+ * Planner support function for array_prepend()
+ */
+Datum
+array_prepend_support(PG_FUNCTION_ARGS)
+{
+    Node       *rawreq = (Node *) PG_GETARG_POINTER(0);
+    Node       *ret = NULL;
+
+    if (IsA(rawreq, SupportRequestModifyInPlace))
+    {
+        /*
+         * We can optimize in-place prepends if the function's array argument
+         * is the array being assigned to.  We don't need to worry about array
+         * references within the other argument.
+         */
+        SupportRequestModifyInPlace *req = (SupportRequestModifyInPlace *) rawreq;
+        Param       *arg = (Param *) lsecond(req->args);
+
+        if (arg && IsA(arg, Param) &&
+            arg->paramkind == PARAM_EXTERN &&
+            arg->paramid == req->paramid)
+            ret = (Node *) arg;
+    }
+
+    PG_RETURN_POINTER(ret);
+}
+
 /*-----------------------------------------------------------------------------
  * array_cat :
  *        concatenate two nD arrays to form an nD array, or
diff --git a/src/backend/utils/adt/arraysubs.c b/src/backend/utils/adt/arraysubs.c
index 562179b379..2940fb8e8d 100644
--- a/src/backend/utils/adt/arraysubs.c
+++ b/src/backend/utils/adt/arraysubs.c
@@ -18,6 +18,7 @@
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/subscripting.h"
+#include "nodes/supportnodes.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_expr.h"
 #include "utils/array.h"
@@ -575,3 +576,36 @@ raw_array_subscript_handler(PG_FUNCTION_ARGS)

     PG_RETURN_POINTER(&sbsroutines);
 }
+
+/*
+ * array_subscript_handler_support()
+ *
+ * Planner support function for array_subscript_handler()
+ */
+Datum
+array_subscript_handler_support(PG_FUNCTION_ARGS)
+{
+    Node       *rawreq = (Node *) PG_GETARG_POINTER(0);
+    Node       *ret = NULL;
+
+    if (IsA(rawreq, SupportRequestModifyInPlace))
+    {
+        /*
+         * We can optimize in-place subscripted assignment if the refexpr is
+         * the array being assigned to.  We don't need to worry about array
+         * references within the refassgnexpr or the subscripts; however, if
+         * there's no refassgnexpr then it's a fetch which there's no need to
+         * optimize.
+         */
+        SupportRequestModifyInPlace *req = (SupportRequestModifyInPlace *) rawreq;
+        Param       *refexpr = (Param *) linitial(req->args);
+
+        if (refexpr && IsA(refexpr, Param) &&
+            refexpr->paramkind == PARAM_EXTERN &&
+            refexpr->paramid == req->paramid &&
+            lsecond(req->args) != NULL)
+            ret = (Node *) refexpr;
+    }
+
+    PG_RETURN_POINTER(ret);
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 5b8c2ad2a5..9e803d610d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1598,14 +1598,20 @@
   proname => 'cardinality', prorettype => 'int4', proargtypes => 'anyarray',
   prosrc => 'array_cardinality' },
 { oid => '378', descr => 'append element onto end of array',
-  proname => 'array_append', proisstrict => 'f',
-  prorettype => 'anycompatiblearray',
+  proname => 'array_append', prosupport => 'array_append_support',
+  proisstrict => 'f', prorettype => 'anycompatiblearray',
   proargtypes => 'anycompatiblearray anycompatible', prosrc => 'array_append' },
+{ oid => '8680', descr => 'planner support for array_append',
+  proname => 'array_append_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'array_append_support' },
 { oid => '379', descr => 'prepend element onto front of array',
-  proname => 'array_prepend', proisstrict => 'f',
-  prorettype => 'anycompatiblearray',
+  proname => 'array_prepend', prosupport => 'array_prepend_support',
+  proisstrict => 'f', prorettype => 'anycompatiblearray',
   proargtypes => 'anycompatible anycompatiblearray',
   prosrc => 'array_prepend' },
+{ oid => '8681', descr => 'planner support for array_prepend',
+  proname => 'array_prepend_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'array_prepend_support' },
 { oid => '383',
   proname => 'array_cat', proisstrict => 'f',
   prorettype => 'anycompatiblearray',
@@ -12207,8 +12213,12 @@

 # subscripting support for built-in types
 { oid => '6179', descr => 'standard array subscripting support',
-  proname => 'array_subscript_handler', prorettype => 'internal',
+  proname => 'array_subscript_handler',
+  prosupport => 'array_subscript_handler_support', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'array_subscript_handler' },
+{ oid => '8682', descr => 'planner support for array_subscript_handler',
+  proname => 'array_subscript_handler_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'array_subscript_handler_support' },
 { oid => '6180', descr => 'raw array subscripting support',
   proname => 'raw_array_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'raw_array_subscript_handler' },
diff --git a/src/include/nodes/supportnodes.h b/src/include/nodes/supportnodes.h
index ad5d43a2a7..9c047cc401 100644
--- a/src/include/nodes/supportnodes.h
+++ b/src/include/nodes/supportnodes.h
@@ -6,10 +6,10 @@
  * This file defines the API for "planner support functions", which
  * are SQL functions (normally written in C) that can be attached to
  * another "target" function to give the system additional knowledge
- * about the target function.  All the current capabilities have to do
- * with planning queries that use the target function, though it is
- * possible that future extensions will add functionality to be invoked
- * by the parser or executor.
+ * about the target function.  The name is now something of a misnomer,
+ * since some of the call sites are in the executor not the planner,
+ * but "function support function" would be a confusing name so we
+ * stick with "planner support function".
  *
  * A support function must have the SQL signature
  *        supportfn(internal) returns internal
@@ -343,4 +343,51 @@ typedef struct SupportRequestOptimizeWindowClause
                                  * optimizations are possible. */
 } SupportRequestOptimizeWindowClause;

+/*
+ * The ModifyInPlace request allows the support function to detect whether
+ * a call to its target function can be allowed to modify a read/write
+ * expanded object in-place.  The context is that we are considering a
+ * PL/pgSQL (or similar PL) assignment of the form "x := f(x, ...)" where
+ * the variable x is of a type that can be represented as an expanded object
+ * (see utils/expandeddatum.h).  If f() can usefully optimize by modifying
+ * the passed-in object in-place, then this request can be implemented to
+ * instruct PL/pgSQL to pass a read-write expanded pointer to the variable's
+ * value.  (Note that there is no guarantee that later calls to f() will
+ * actually do so.  If f() receives a read-only pointer, or a pointer to a
+ * non-expanded object, it must follow the usual convention of not modifying
+ * the pointed-to object.)  There are two requirements that must be met
+ * to make this safe:
+ * 1. f() must guarantee that it will not have modified the object if it
+ * fails.  Otherwise the variable's value might change unexpectedly.
+ * 2. If the other arguments to f() ("..." in the above example) contain
+ * references to x, f() must be able to cope with that; or if that's not
+ * safe, the support function must scan the other arguments to verify that
+ * there are no other references to x.  An example of the concern here is
+ * that in "arr := array_append(arr, arr[1])", if the array element type
+ * is pass-by-reference then array_append would receive a second argument
+ * that points into the array object it intends to modify.  array_append is
+ * coded to make that safe, but other functions might not be able to cope.
+ *
+ * "args" is a node tree list representing the function's arguments.
+ * One or more nodes within the node tree will be PARAM_EXTERN Params
+ * with ID "paramid", which represent the assignment target variable.
+ * (Note that such references are not necessarily at top level in the list,
+ * for example we might have "x := f(x, g(x))".  Generally it's only safe
+ * to optimize a reference that is at top level, else we're making promises
+ * about the behavior of g() as well as f().)
+ *
+ * If modify-in-place is safe, the support function should return the
+ * address of the Param node that is to return a read-write pointer.
+ * (At most one of the references is allowed to do so.)  Otherwise,
+ * return NULL.
+ */
+typedef struct SupportRequestModifyInPlace
+{
+    NodeTag        type;
+
+    Oid            funcid;            /* PG_PROC OID of the target function */
+    List       *args;            /* Arguments to the function */
+    int            paramid;        /* ID of Param(s) representing variable */
+} SupportRequestModifyInPlace;
+
 #endif                            /* SUPPORTNODES_H */
diff --git a/src/pl/plpgsql/src/expected/plpgsql_array.out b/src/pl/plpgsql/src/expected/plpgsql_array.out
index e5db6d6087..4c6b3ce998 100644
--- a/src/pl/plpgsql/src/expected/plpgsql_array.out
+++ b/src/pl/plpgsql/src/expected/plpgsql_array.out
@@ -57,10 +57,11 @@ begin
   -- test scenarios for optimization of updates of R/W expanded objects
   a := array_append(a, 42);  -- optimizable using "transfer" method
   a := a || a[3];  -- optimizable using "inplace" method
+  a := a[1] || a;  -- ditto, but let's test array_prepend
   a := a || a;     -- not optimizable
   raise notice 'a = %', a;
 end$$;
-NOTICE:  a = {1,2,3,42,3,1,2,3,42,3}
+NOTICE:  a = {1,1,2,3,42,3,1,1,2,3,42,3}
 create temp table onecol as select array[1,2] as f1;
 do $$ declare a int[];
 begin a := f1 from onecol; raise notice 'a = %', a; end$$;
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 28b6c85d8d..d4377ceecb 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -29,6 +29,7 @@
 #include "mb/stringinfo_mb.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_type.h"
@@ -8411,7 +8412,7 @@ exec_check_rw_parameter(PLpgSQL_expr *expr, int paramid)
     Expr       *sexpr = expr->expr_simple_expr;
     Oid            funcid;
     List       *fargs;
-    ListCell   *lc;
+    Oid            prosupport;

     /* Assume unsafe */
     expr->expr_rwopt = PLPGSQL_RWOPT_NOPE;
@@ -8480,64 +8481,51 @@ exec_check_rw_parameter(PLpgSQL_expr *expr, int paramid)
     {
         SubscriptingRef *sbsref = (SubscriptingRef *) sexpr;

-        /* We only trust standard varlena arrays to be safe */
-        /* TODO: install some extensibility here */
-        if (get_typsubscript(sbsref->refcontainertype, NULL) !=
-            F_ARRAY_SUBSCRIPT_HANDLER)
-            return;
-
-        /* We can optimize the refexpr if it's the target, otherwise not */
-        if (sbsref->refexpr && IsA(sbsref->refexpr, Param))
-        {
-            Param       *param = (Param *) sbsref->refexpr;
+        funcid = get_typsubscript(sbsref->refcontainertype, NULL);

-            if (param->paramkind == PARAM_EXTERN &&
-                param->paramid == paramid)
-            {
-                /* Found the Param we want to pass as read/write */
-                expr->expr_rwopt = PLPGSQL_RWOPT_INPLACE;
-                expr->expr_rw_param = param;
-                return;
-            }
-        }
-
-        return;
+        /*
+         * We assume that only the refexpr and refassgnexpr (if any) are
+         * relevant to the support function's decision.  If that turns out to
+         * be a bad idea, we could incorporate the subscript expressions into
+         * the fargs list somehow.
+         */
+        fargs = list_make2(sbsref->refexpr, sbsref->refassgnexpr);
     }
     else
         return;

     /*
-     * The top-level function must be one that we trust to be "safe".
-     * Currently we hard-wire the list, but it would be very desirable to
-     * allow extensions to mark their functions as safe ...
+     * The top-level function must be one that can handle in-place update
+     * safely.  We allow functions to declare their ability to do that via a
+     * support function request.
      */
-    if (!(funcid == F_ARRAY_APPEND ||
-          funcid == F_ARRAY_PREPEND))
-        return;
-
-    /*
-     * The target variable (in the form of a Param) must appear as a direct
-     * argument of the top-level function.  References further down in the
-     * tree can't be optimized; but on the other hand, they don't invalidate
-     * optimizing the top-level call, since that will be executed last.
-     */
-    foreach(lc, fargs)
+    prosupport = get_func_support(funcid);
+    if (OidIsValid(prosupport))
     {
-        Node       *arg = (Node *) lfirst(lc);
+        SupportRequestModifyInPlace req;
+        Param       *param;

-        if (arg && IsA(arg, Param))
-        {
-            Param       *param = (Param *) arg;
+        req.type = T_SupportRequestModifyInPlace;
+        req.funcid = funcid;
+        req.args = fargs;
+        req.paramid = paramid;

-            if (param->paramkind == PARAM_EXTERN &&
-                param->paramid == paramid)
-            {
-                /* Found the Param we want to pass as read/write */
-                expr->expr_rwopt = PLPGSQL_RWOPT_INPLACE;
-                expr->expr_rw_param = param;
-                return;
-            }
-        }
+        param = (Param *)
+            DatumGetPointer(OidFunctionCall1(prosupport,
+                                             PointerGetDatum(&req)));
+
+        if (param == NULL)
+            return;                /* support function fails */
+
+        /* Verify support function followed the API */
+        Assert(IsA(param, Param));
+        Assert(param->paramkind == PARAM_EXTERN);
+        Assert(param->paramid == paramid);
+
+        /* Found the Param we want to pass as read/write */
+        expr->expr_rwopt = PLPGSQL_RWOPT_INPLACE;
+        expr->expr_rw_param = param;
+        return;
     }
 }

diff --git a/src/pl/plpgsql/src/sql/plpgsql_array.sql b/src/pl/plpgsql/src/sql/plpgsql_array.sql
index 4a346203dc..da984a9941 100644
--- a/src/pl/plpgsql/src/sql/plpgsql_array.sql
+++ b/src/pl/plpgsql/src/sql/plpgsql_array.sql
@@ -53,6 +53,7 @@ begin
   -- test scenarios for optimization of updates of R/W expanded objects
   a := array_append(a, 42);  -- optimizable using "transfer" method
   a := a || a[3];  -- optimizable using "inplace" method
+  a := a[1] || a;  -- ditto, but let's test array_prepend
   a := a || a;     -- not optimizable
   raise notice 'a = %', a;
 end$$;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4d4bf62b6e..62c63e3728 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2804,6 +2804,7 @@ SubscriptionRelState
 SummarizerReadLocalXLogPrivate
 SupportRequestCost
 SupportRequestIndexCondition
+SupportRequestModifyInPlace
 SupportRequestOptimizeWindowClause
 SupportRequestRows
 SupportRequestSelectivity
--
2.43.5


pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: Vacuum statistics
Next
From: Tom Lane
Date:
Subject: Re: [PATCH] Fix incorrect range in pg_regress comment