Re: If a row-level security policy contains a set returning function, pg_dump returns an incorrect serialization of that policy if the return type of the function was altered - Mailing list pgsql-bugs

From Tom Lane
Subject Re: If a row-level security policy contains a set returning function, pg_dump returns an incorrect serialization of that policy if the return type of the function was altered
Date
Msg-id 1983818.1658350464@sss.pgh.pa.us
Whole thread Raw
In response to If a row-level security policy contains a set returning function, pg_dump returns an incorrect serialization of that policy if the return type of the function was altered  (Timo Stolz <timo.stolz@nullachtvierzehn.de>)
Responses Re: If a row-level security policy contains a set returning function, pg_dump returns an incorrect serialization of that policy if the return type of the function was altered  (Dean Rasheed <dean.a.rasheed@gmail.com>)
List pgsql-bugs
Timo Stolz <timo.stolz@nullachtvierzehn.de> writes:
> *Summary: If a row-level security policy contains a set returning 
> function, pg_dump returns an incorrect serialization of that policy if 
> the return type of the function was altered after the policy was created.*

This has nothing particularly to do with RLS policies; you can
reproduce the same problem with any stored query that selects from a
function-returning-composite, for instance a view.

We even have a regression test case illustrating the current behavior :-(.
I guess nobody thought hard about the implications for dump-and-restore,
but I agree that they're bad.

> Instead of this, I would expect a serialization without an aliased FROM 
> clause, because that's how I wrote these policies in the first place.

You might expect that, but you won't get it.  As noted in ruleutils.c,

     * For a relation RTE, we need only print the alias column names if any
     * are different from the underlying "real" names.  For a function RTE,
     * always emit a complete column alias list; this is to protect against
     * possible instability of the default column names (eg, from altering
     * parameter names).  For tablefunc RTEs, we never print aliases, ...

What we have to do here is to suppress the aliases for any since-dropped
columns, while keeping the live ones.  That's slightly finicky, but there
is existing code that can get the job done.  ruleutils just wasn't
considering the possibility that function RTEs might have this problem.

The larger issue that this touches on is that we don't prevent you from
dropping the composite type's column even when the query using the
dependent function has hard references to that column (e.g, it's actually
output by the view).  Maybe sometime somebody ought to work on tightening
that up.  In the meantime though, it's bad for EXPLAIN or pg_dump to fail
altogether on such cases, so I propose the behavior shown in the attached
patch.

(Even if somebody does add the necessary dependencies later, we'd still
need to cope with the situation in released branches, which might already
have broken views to cope with.)

            regards, tom lane

diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 4e635561aa..8d832efc62 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -3198,6 +3198,9 @@ expandNSItemAttrs(ParseState *pstate, ParseNamespaceItem *nsitem,
  *
  * "*" is returned if the given attnum is InvalidAttrNumber --- this case
  * occurs when a Var represents a whole tuple of a relation.
+ *
+ * It is caller's responsibility to not call this on a dropped attribute.
+ * (You will get some answer for such cases, but it might not be sensible.)
  */
 char *
 get_rte_attribute_name(RangeTblEntry *rte, AttrNumber attnum)
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 513bf0f026..ac75704363 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -53,6 +53,7 @@
 #include "parser/parse_func.h"
 #include "parser/parse_node.h"
 #include "parser/parse_oper.h"
+#include "parser/parse_relation.h"
 #include "parser/parser.h"
 #include "parser/parsetree.h"
 #include "rewrite/rewriteHandler.h"
@@ -4323,9 +4324,9 @@ set_relation_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
     int            j;

     /*
-     * Extract the RTE's "real" column names.  This is comparable to
-     * get_rte_attribute_name, except that it's important to disregard dropped
-     * columns.  We put NULL into the array for a dropped column.
+     * Construct an array of the current "real" column names of the RTE.
+     * real_colnames[] will be indexed by physical column number, with NULL
+     * entries for dropped columns.
      */
     if (rte->rtekind == RTE_RELATION)
     {
@@ -4352,19 +4353,42 @@ set_relation_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
     }
     else
     {
-        /* Otherwise use the column names from eref */
+        /* Otherwise get the column names from eref or expandRTE() */
+        List       *colnames;
         ListCell   *lc;

-        ncolumns = list_length(rte->eref->colnames);
+        /*
+         * Functions returning composites have the annoying property that some
+         * of the composite type's columns might have been dropped since the
+         * query was parsed.  If possible, use expandRTE() to handle that
+         * case, since it has the tedious logic needed to find out about
+         * dropped columns.  However, if we're explaining a plan, then we
+         * don't have rte->functions because the planner thinks that won't be
+         * needed later, and that breaks expandRTE().  So in that case we have
+         * to rely on rte->eref, which may lead us to report a dropped
+         * column's old name; that seems close enough for EXPLAIN's purposes.
+         *
+         * For non-RELATION, non-FUNCTION RTEs, we can just look at rte->eref,
+         * which should be sufficiently up-to-date.
+         */
+        if (rte->rtekind == RTE_FUNCTION && rte->functions != NIL)
+        {
+            /* Since we're not creating Vars, rtindex etc. don't matter */
+            expandRTE(rte, 1, 0, -1, true /* include dropped */ ,
+                      &colnames, NULL);
+        }
+        else
+            colnames = rte->eref->colnames;
+
+        ncolumns = list_length(colnames);
         real_colnames = (char **) palloc(ncolumns * sizeof(char *));

         i = 0;
-        foreach(lc, rte->eref->colnames)
+        foreach(lc, colnames)
         {
             /*
-             * If the column name shown in eref is an empty string, then it's
-             * a column that was dropped at the time of parsing the query, so
-             * treat it as dropped.
+             * If the column name we find here is an empty string, then it's a
+             * dropped column, so change to NULL.
              */
             char       *cname = strVal(lfirst(lc));

@@ -7301,9 +7325,16 @@ get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
             elog(ERROR, "invalid attnum %d for relation \"%s\"",
                  attnum, rte->eref->aliasname);
         attname = colinfo->colnames[attnum - 1];
-        if (attname == NULL)    /* dropped column? */
-            elog(ERROR, "invalid attnum %d for relation \"%s\"",
-                 attnum, rte->eref->aliasname);
+
+        /*
+         * If we find a Var referencing a dropped column, it seems better to
+         * print something (anything) than to fail.  In general this should
+         * not happen, but there are specific cases involving functions
+         * returning named composite types where we don't sufficiently enforce
+         * that you can't drop a column that's referenced in some view.
+         */
+        if (attname == NULL)
+            attname = "?dropped?column?";
     }
     else
     {
diff --git a/src/test/regress/expected/create_view.out b/src/test/regress/expected/create_view.out
index 9d4f9011a8..6ee49249fc 100644
--- a/src/test/regress/expected/create_view.out
+++ b/src/test/regress/expected/create_view.out
@@ -1600,17 +1600,26 @@ select * from tt14v;
 begin;
 -- this perhaps should be rejected, but it isn't:
 alter table tt14t drop column f3;
--- f3 is still in the view ...
+-- column f3 is still in the view, sort of ...
 select pg_get_viewdef('tt14v', true);
-         pg_get_viewdef
---------------------------------
-  SELECT t.f1,                 +
-     t.f3,                     +
-     t.f4                      +
-    FROM tt14f() t(f1, f3, f4);
+         pg_get_viewdef
+---------------------------------
+  SELECT t.f1,                  +
+     t."?dropped?column?" AS f3,+
+     t.f4                       +
+    FROM tt14f() t(f1, f4);
 (1 row)

--- but will fail at execution
+-- ... and you can even EXPLAIN it ...
+explain (verbose, costs off) select * from tt14v;
+               QUERY PLAN
+----------------------------------------
+ Function Scan on testviewschm2.tt14f t
+   Output: t.f1, t.f3, t.f4
+   Function Call: tt14f()
+(3 rows)
+
+-- but it will fail at execution
 select f1, f4 from tt14v;
  f1  | f4
 -----+----
diff --git a/src/test/regress/sql/create_view.sql b/src/test/regress/sql/create_view.sql
index 50acfe96e6..949b116625 100644
--- a/src/test/regress/sql/create_view.sql
+++ b/src/test/regress/sql/create_view.sql
@@ -580,9 +580,11 @@ begin;
 -- this perhaps should be rejected, but it isn't:
 alter table tt14t drop column f3;

--- f3 is still in the view ...
+-- column f3 is still in the view, sort of ...
 select pg_get_viewdef('tt14v', true);
--- but will fail at execution
+-- ... and you can even EXPLAIN it ...
+explain (verbose, costs off) select * from tt14v;
+-- but it will fail at execution
 select f1, f4 from tt14v;
 select * from tt14v;


pgsql-bugs by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Trusted extension cannot be dropped by the owner of the extension
Next
From: Dean Rasheed
Date:
Subject: Re: If a row-level security policy contains a set returning function, pg_dump returns an incorrect serialization of that policy if the return type of the function was altered