Re: Making view dump/restore safe at the column-alias level - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Making view dump/restore safe at the column-alias level
Date
Msg-id 15151.1356144377@sss.pgh.pa.us
Whole thread Raw
In response to Re: Making view dump/restore safe at the column-alias level  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Making view dump/restore safe at the column-alias level
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> I'm having a hard time following this.  Can you provide a concrete example?

regression=# create table t1 (x int, y int);
CREATE TABLE
regression=# create table t2 (x int, z int);
CREATE TABLE
regression=# create view v1 as select * from t1 join t2 using (x);
CREATE VIEW
regression=# \d+ v1                  View "public.v1"Column |  Type   | Modifiers | Storage | Description 
--------+---------+-----------+---------+-------------x      | integer |           | plain   | y      | integer |
   | plain   | z      | integer |           | plain   | 
 
View definition:SELECT t1.x, t1.y, t2.z  FROM t1  JOIN t2 USING (x);
regression=# alter table t2 rename column x to q;
ALTER TABLE
regression=# \d+ v1                  View "public.v1"Column |  Type   | Modifiers | Storage | Description 
--------+---------+-----------+---------+-------------x      | integer |           | plain   | y      | integer |
   | plain   | z      | integer |           | plain   | 
 
View definition:SELECT t1.x, t1.y, t2.z  FROM t1  JOIN t2 USING (x);

At this point the dumped view definition is wrong: if you try to execute
it you get

regression=# SELECT t1.x, t1.y, t2.z
regression-#    FROM t1
regression-#    JOIN t2 USING (x);
ERROR:  column "x" specified in USING clause does not exist in right table

I'm suggesting that we could fix this by emitting something that forces
the right alias to be assigned to t2.q:

SELECT t1.x, t1.y, t2.z  FROM t1  JOIN t2 AS t2(x,z)  USING (x);

The implementation I have in mind is to recurse down the join tree and
have any JOIN USING item forcibly propagate the common column name as
the alias-to-use for each of the two input columns.

Also consider

regression=# create view v2 as select * from (select 1,2) as a(x,y)
regression-# union select * from (select 3,4) as b;
CREATE VIEW
regression=# \d+ v2                  View "public.v2"Column |  Type   | Modifiers | Storage | Description 
--------+---------+-----------+---------+-------------x      | integer |           | plain   | y      | integer |
   | plain   | 
 
View definition:        SELECT a.x, a.y          FROM ( SELECT 1, 2) a(x, y)
UNION         SELECT b."?column?" AS x, b."?column?" AS y          FROM ( SELECT 3, 4) b;

That view definition doesn't work either, as complained of today in
pgsql-general.  To fix this we just need to force the columns of b
to be given distinct aliases.  The minimum-new-code solution would
probably be to produce
        SELECT a.x, a.y          FROM ( SELECT 1, 2) a(x, y)
UNION         SELECT b."?column?" AS x, b."?column?_1" AS y          FROM ( SELECT 3, 4) b("?column?", "?column?_1")

using the same add-some-digits-until-unique logic we are using for
relation aliases.  This could be done by considering all the column
aliases of each RTE when we arrive at it during the recursive scan.

On further reflection I think my worry about the top-level aliases
was unfounded --- we prevent views from being created at all unless
the top-level column names are all distinct.   But we definitely
have got issues for lower-level aliases, as these examples show.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: Commits 8de72b and 5457a1 (COPY FREEZE)
Next
From: Tom Lane
Date:
Subject: Re: pgcrypto seeding problem when ssl=on