Thread: A problem with dump/restore of views containing whole row references

A problem with dump/restore of views containing whole row references

From
Abbas Butt
Date:
Hi,

This is the version I used to run the following commands

select version();
                                                    version                                                    
----------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.2devel on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5, 64-bit
(1 row)

Run these commands

  CREATE TABLE price (id INT PRIMARY KEY, active BOOLEAN NOT NULL, price NUMERIC);
  insert into price values (1,false,42), (10,false,100), (11,true,17.99);
  create view v2 as select price.*::price from price;
  select * from v2;
      price    
  --------------
   (1,f,42)
   (10,f,100)
   (11,t,17.99)
  (3 rows)

  \d+ v2;
                    View "public.v2"
   Column | Type  | Modifiers | Storage  | Description
  --------+-------+-----------+----------+-------------
   price  | price |           | extended |
  View definition:
   SELECT price AS price
     FROM price;

Note the output from the view, also note the "Type" in view defination.

Now take dump of this database.

./pg_dump --file=/home/user_name/d.sql --format=p --inserts -p 4444 test

The dump file is attached with the mail. (d.sql)

Now lets restore this dump.

./createdb test2 -p 4444
./psql -p 4444 -f /home/user_name/d.sql test2
./psql test2 -p 4444
psql (9.2devel)
Type "help" for help.

test2=# select * from v2;
 price
-------
    42
   100
 17.99
(3 rows)

test2=# \d+ v2
                   View "public.v2"
 Column |  Type   | Modifiers | Storage | Description
--------+---------+-----------+---------+-------------
 price  | numeric |           | main    |
View definition:
 SELECT price.price
   FROM price;

In the database test2 the view was not restored correctly.
The output of the view as well as the Type in its defination is wrong.

The cause of the problem is as follows

The notation "relation.*" represents a whole-row reference.
While parsing a whole-row reference is transformed into a Var with varno set to the correct range table entry,
and varattno == 0 to signal that it references the whole tuple. (For reference see comments of function makeWholeRowVar)
While deparsing we need to take care of this case.
The attached patch provides deparsing of a whole-row reference.
A whole row reference will be deparsed either into alias.*::relation or relation.*::relation depending on alias

--
Abbas
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Attachment

Re: A problem with dump/restore of views containing whole row references

From
Andrew Dunstan
Date:

On 04/27/2012 08:25 AM, Abbas Butt wrote:
>
> The notation "relation.*" represents a whole-row reference.
> While parsing a whole-row reference is transformed into a Var with 
> varno set to the correct range table entry,
> and varattno == 0 to signal that it references the whole tuple. (For 
> reference see comments of function makeWholeRowVar)
> While deparsing we need to take care of this case.
> The attached patch provides deparsing of a whole-row reference.
> A whole row reference will be deparsed either into alias.*::relation 
> or relation.*::relation depending on alias
>

I agree there's a bug, although it's easily worked around: in the case 
of your example:
   CREATE VIEW v2 AS        SELECT p AS price FROM price p;

would do the trick.

However, is this a change we really want to make?:
   pg_get_triggerdef
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 - CREATE TRIGGER modified_any BEFORE UPDATE OF a ON main_table FOR EACH ROW WHEN (old.* IS DISTINCT FROM new.*)
EXECUTEPROCEDURE trigger_func('modified_any')   +
          pg_get_triggerdef
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 + CREATE TRIGGER modified_any BEFORE UPDATE OF a ON main_table FOR EACH ROW WHEN (old.*::main_table IS DISTINCT FROM
new.*::main_table)EXECUTE PROCEDURE trigger_func('modified_any')
 


Maybe we need to be a bit more selective about when the cast is 
supplied. It's not adding any extra disambiguation (or clarity) here.

cheers

andrew



Re: A problem with dump/restore of views containing whole row references

From
Abbas Butt
Date:


On Fri, Apr 27, 2012 at 6:25 PM, Andrew Dunstan <andrew@dunslane.net> wrote:


On 04/27/2012 08:25 AM, Abbas Butt wrote:

The notation "relation.*" represents a whole-row reference.
While parsing a whole-row reference is transformed into a Var with varno set to the correct range table entry,
and varattno == 0 to signal that it references the whole tuple. (For reference see comments of function makeWholeRowVar)
While deparsing we need to take care of this case.
The attached patch provides deparsing of a whole-row reference.
A whole row reference will be deparsed either into alias.*::relation or relation.*::relation depending on alias


I agree there's a bug, although it's easily worked around: in the case of your example:

  CREATE VIEW v2 AS
       SELECT p AS price FROM price p;

would do the trick.

However, is this a change we really want to make?:

  pg_get_triggerdef
  ---------------------------------------------------------------------------------------------------------------------------------------------------------------
  - CREATE TRIGGER modified_any BEFORE UPDATE OF a ON main_table FOR EACH ROW WHEN (old.* IS DISTINCT FROM new.*) EXECUTE PROCEDURE trigger_func('modified_any')
  +                                                                                  pg_get_triggerdef
  +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  + CREATE TRIGGER modified_any BEFORE UPDATE OF a ON main_table FOR EACH ROW WHEN (old.*::main_table IS DISTINCT FROM new.*::main_table) EXECUTE PROCEDURE trigger_func('modified_any')


Maybe we need to be a bit more selective about when the cast is supplied. It's not adding any extra disambiguation (or clarity) here.

I ran the regression and found that my patch is causing a diff in the trigger test case, thats why I changed the expected output of the test case accordingly. This is a side effect of the change I did to fix the bug.
 

cheers

andrew




--
--
Abbas
Architect
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: 92-334-5100153

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of
the individual or entity to whom it is addressed. This message
contains information from EnterpriseDB Corporation that may be
privileged, confidential, or exempt from disclosure under applicable
law. If you are not the intended recipient or authorized to receive
this for the intended recipient, any use, dissemination, distribution,
retention, archiving, or copying of this communication is strictly
prohibited. If you have received this e-mail in error, please notify
the sender immediately by reply e-mail and delete this message.

Re: A problem with dump/restore of views containing whole row references

From
Andrew Dunstan
Date:

On 04/27/2012 12:02 PM, Abbas Butt wrote:
>
>
>
>     However, is this a change we really want to make?:
>
>       pg_get_triggerdef
>      
>
---------------------------------------------------------------------------------------------------------------------------------------------------------------
>       - CREATE TRIGGER modified_any BEFORE UPDATE OF a ON main_table
>     FOR EACH ROW WHEN (old.* IS DISTINCT FROM new.*) EXECUTE PROCEDURE
>     trigger_func('modified_any')
>       +                                                              
>                        pg_get_triggerdef
>      
>
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>       + CREATE TRIGGER modified_any BEFORE UPDATE OF a ON main_table
>     FOR EACH ROW WHEN (old.*::main_table IS DISTINCT FROM
>     new.*::main_table) EXECUTE PROCEDURE trigger_func('modified_any')
>
>
>     Maybe we need to be a bit more selective about when the cast is
>     supplied. It's not adding any extra disambiguation (or clarity) here.
>
>
>
> I ran the regression and found that my patch is causing a diff in the 
> trigger test case, thats why I changed the expected output of the test 
> case accordingly. This is a side effect of the change I did to fix the 
> bug.
>

Right, what I'm asking is whether or not we actually want that side 
effect in all cases, and specifically in this case where it's clearly 
not necessary.

cheers

andrew


Andrew Dunstan <andrew@dunslane.net> writes:
> Right, what I'm asking is whether or not we actually want that side 
> effect in all cases, and specifically in this case where it's clearly 
> not necessary.

We could dodge that case by only changing the behavior when showstar is
false; there is no need to change it otherwise.  The patch has assorted
other bugs too, in particular its schema-name treatment seems completely
wrong (hint: RelationIsVisible is not the same as TypeIsVisible, and
it's at best shaky to assume that a relation's name is the same as its
rowtype's name anyway).

More generally, it seems rather inelegant to be forcibly adding a cast
when in most cases the existing notation is not wrong.  AFAICS the
plain "relname" notation is only ambiguous if there is a column of the
same name as the relation.  I wonder whether we should instead address
this by not letting the parser strip the "no op" cast in the first
place.
        regards, tom lane


Re: A problem with dump/restore of views containing whole row references

From
Abbas Butt
Date:


On Fri, Apr 27, 2012 at 11:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
> Right, what I'm asking is whether or not we actually want that side
> effect in all cases, and specifically in this case where it's clearly
> not necessary.

We could dodge that case by only changing the behavior when showstar is
false; there is no need to change it otherwise.  The patch has assorted
other bugs too, in particular its schema-name treatment seems completely
wrong (hint: RelationIsVisible is not the same as TypeIsVisible, and
it's at best shaky to assume that a relation's name is the same as its
rowtype's name anyway).

More generally, it seems rather inelegant to be forcibly adding a cast
when in most cases the existing notation is not wrong.  AFAICS the
plain "relname" notation is only ambiguous if there is a column of the
same name as the relation.  I wonder whether we should instead address
this by not letting the parser strip the "no op" cast in the first
place.

You mean that the parser should not strip the "no op" cast in all cases or in the case only when the parser somehow detects a column of the same name as the relation?
 

                       regards, tom lane
 
--
Abbas
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Abbas Butt <abbas.butt@enterprisedb.com> writes:
> On Fri, Apr 27, 2012 at 11:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> More generally, it seems rather inelegant to be forcibly adding a cast
>> when in most cases the existing notation is not wrong.  AFAICS the
>> plain "relname" notation is only ambiguous if there is a column of the
>> same name as the relation.  I wonder whether we should instead address
>> this by not letting the parser strip the "no op" cast in the first
>> place.

> You mean that the parser should not strip the "no op" cast in all cases or
> in the case only when the parser somehow detects a column of the same name
> as the relation?

On reflection that's the wrong thing anyway.  While (AFAICS) one could
only initially create this type of situation by using an explicit cast
as in your example, the ambiguity could be introduced after the fact by
a rename or ALTER TABLE ADD COLUMN, which wouldn't even have to affect
the troublesome table itself --- a column matching the table's name
anywhere in the FROM clause would create the same ambiguity.  So there's
no guarantee that there ever was a cast there.

So I think that your patch is the right approach, if wrong in detail.
What we have to do is stop using the ambiguous table-name-only syntax,
and instead always print tabname.*, and then add a cast to prevent
expansion of the "*" if we are at top level of a SELECT targetlist.

Attached is a patch that I think does this correctly.  I renamed the
flag parameter (and flipped its sense) since it is no longer controlling
whether or not a "*" gets printed.  One thing I like about this is that
whole-row Vars are no longer ever special in terms of naming; looking
at the code with a fresh eye, I wonder whether we didn't have other bugs
here in cases such as where a schema qualification is needed.  Omitting
the star is just asking for trouble ...

            regards, tom lane

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 3beed37dd26d6fc51a4365b3f4086b97f71cc16c..7ad99a0ec32579760f6e10278ec8585a0b2ab855 100644
*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
*************** static void get_rule_orderby(List *order
*** 209,215 ****
  static void get_rule_windowclause(Query *query, deparse_context *context);
  static void get_rule_windowspec(WindowClause *wc, List *targetList,
                      deparse_context *context);
! static char *get_variable(Var *var, int levelsup, bool showstar,
               deparse_context *context);
  static RangeTblEntry *find_rte_by_refname(const char *refname,
                      deparse_context *context);
--- 209,215 ----
  static void get_rule_windowclause(Query *query, deparse_context *context);
  static void get_rule_windowspec(WindowClause *wc, List *targetList,
                      deparse_context *context);
! static char *get_variable(Var *var, int levelsup, bool istoplevel,
               deparse_context *context);
  static RangeTblEntry *find_rte_by_refname(const char *refname,
                      deparse_context *context);
*************** get_target_list(List *targetList, depars
*** 3074,3084 ****
           * "foo.*", which is the preferred notation in most contexts, but at
           * the top level of a SELECT list it's not right (the parser will
           * expand that notation into multiple columns, yielding behavior
!          * different from a whole-row Var).  We want just "foo", instead.
           */
          if (tle->expr && IsA(tle->expr, Var))
          {
!             attname = get_variable((Var *) tle->expr, 0, false, context);
          }
          else
          {
--- 3074,3085 ----
           * "foo.*", which is the preferred notation in most contexts, but at
           * the top level of a SELECT list it's not right (the parser will
           * expand that notation into multiple columns, yielding behavior
!          * different from a whole-row Var).  We need to call get_variable
!          * directly so that we can tell it to do the right thing.
           */
          if (tle->expr && IsA(tle->expr, Var))
          {
!             attname = get_variable((Var *) tle->expr, 0, true, context);
          }
          else
          {
*************** get_utility_query_def(Query *query, depa
*** 3803,3815 ****
   * the Var's varlevelsup has to be interpreted with respect to a context
   * above the current one; levelsup indicates the offset.
   *
!  * If showstar is TRUE, whole-row Vars are displayed as "foo.*";
!  * if FALSE, merely as "foo".
   *
!  * Returns the attname of the Var, or NULL if not determinable.
   */
  static char *
! get_variable(Var *var, int levelsup, bool showstar, deparse_context *context)
  {
      StringInfo    buf = context->buf;
      RangeTblEntry *rte;
--- 3804,3823 ----
   * the Var's varlevelsup has to be interpreted with respect to a context
   * above the current one; levelsup indicates the offset.
   *
!  * If istoplevel is TRUE, the Var is at the top level of a SELECT's
!  * targetlist, which means we need special treatment of whole-row Vars.
!  * Instead of the normal "tab.*", we'll print "tab.*::typename", which is a
!  * dirty hack to prevent "tab.*" from being expanded into multiple columns.
!  * (The parser will strip the useless coercion, so no inefficiency is added in
!  * dump and reload.)  We used to print just "tab" in such cases, but that is
!  * ambiguous and will yield the wrong result if "tab" is also a plain column
!  * name in the query.
   *
!  * Returns the attname of the Var, or NULL if the Var has no attname (because
!  * it is a whole-row Var).
   */
  static char *
! get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
  {
      StringInfo    buf = context->buf;
      RangeTblEntry *rte;
*************** get_variable(Var *var, int levelsup, boo
*** 3998,4007 ****
                  if (IsA(aliasvar, Var))
                  {
                      return get_variable(aliasvar, var->varlevelsup + levelsup,
!                                         showstar, context);
                  }
              }
!             /* Unnamed join has neither schemaname nor refname */
              refname = NULL;
          }
      }
--- 4006,4021 ----
                  if (IsA(aliasvar, Var))
                  {
                      return get_variable(aliasvar, var->varlevelsup + levelsup,
!                                         istoplevel, context);
                  }
              }
!
!             /*
!              * Unnamed join has neither schemaname nor refname.  (Note: since
!              * it's unnamed, there is no way the user could have referenced it
!              * to create a whole-row Var for it.  So we don't have to cover
!              * that case below.)
!              */
              refname = NULL;
          }
      }
*************** get_variable(Var *var, int levelsup, boo
*** 4017,4029 ****
              appendStringInfo(buf, "%s.",
                               quote_identifier(schemaname));
          appendStringInfoString(buf, quote_identifier(refname));
!         if (attname || showstar)
!             appendStringInfoChar(buf, '.');
      }
      if (attname)
          appendStringInfoString(buf, quote_identifier(attname));
!     else if (showstar)
          appendStringInfoChar(buf, '*');

      return attname;
  }
--- 4031,4048 ----
              appendStringInfo(buf, "%s.",
                               quote_identifier(schemaname));
          appendStringInfoString(buf, quote_identifier(refname));
!         appendStringInfoChar(buf, '.');
      }
      if (attname)
          appendStringInfoString(buf, quote_identifier(attname));
!     else
!     {
          appendStringInfoChar(buf, '*');
+         if (istoplevel)
+             appendStringInfo(buf, "::%s",
+                              format_type_with_typemod(var->vartype,
+                                                       var->vartypmod));
+     }

      return attname;
  }
*************** get_rule_expr(Node *node, deparse_contex
*** 4974,4980 ****
      switch (nodeTag(node))
      {
          case T_Var:
!             (void) get_variable((Var *) node, 0, true, context);
              break;

          case T_Const:
--- 4993,4999 ----
      switch (nodeTag(node))
      {
          case T_Var:
!             (void) get_variable((Var *) node, 0, false, context);
              break;

          case T_Const: