Thread: BUG #5234: ALTER TABLE ... RENAME COLUMN change view definition incorrectly
BUG #5234: ALTER TABLE ... RENAME COLUMN change view definition incorrectly
From
"Sergey Burladyan"
Date:
The following bug has been logged online: Bug reference: 5234 Logged by: Sergey Burladyan Email address: eshkinkot@gmail.com PostgreSQL version: 8.3.8 Operating system: Debian GNU/Linux 5.0.3 (lenny) + testing Description: ALTER TABLE ... RENAME COLUMN change view definition incorrectly Details: reported by Weed at http://www.sql.ru/forum/actualthread.aspx?tid=717835 (Russian) Example: create table a (i int, v text); create table b (j int, v text); create view v_using as select * from a left join b using (v); alter table a rename v to o; \d v_using CREATE TABLE CREATE TABLE CREATE VIEW ALTER TABLE View "public.v_using" Column | Type | Modifiers --------+---------+----------- v | text | i | integer | j | integer | View definition: SELECT a.o AS v, a.i, b.j FROM a LEFT JOIN b USING (v); View is still working, but it text definition is incorrect: t1=> select * from v_using ; v | i | j ---+---+--- (0 rows) t1=> SELECT a.o AS v, a.i, b.j t1-> FROM a t1-> LEFT JOIN b USING (v); ERROR: 42703: column "v" specified in USING clause does not exist in left table LOCATION: transformFromClauseItem, parse_clause.c:813 If you dump database in this state, when you cannot restore this dump without manual fix: $ pg_dump -Fc -f dump t1 $ pg_restore dump | grep -A2 VIEW -- Name: v_using; Type: VIEW; Schema: public; Owner: seb -- CREATE VIEW v_using AS SELECT a.o AS v, a.i, b.j FROM (a LEFT JOIN b USING (v)); $ LANG=C sudo -u postgres pg_restore -c -d t1 dump . . . pg_restore: [archiver (db)] could not execute query: ERROR: column "v" specified in USING clause does not exist in left table Command was: CREATE VIEW v_using AS SELECT a.o AS v, a.i, b.j FROM (a LEFT JOIN b USING (v)); . . .
Re: BUG #5234: ALTER TABLE ... RENAME COLUMN change view definition incorrectly
From
Robert Haas
Date:
On Sun, Dec 6, 2009 at 5:41 PM, Sergey Burladyan <eshkinkot@gmail.com> wrote: > > The following bug has been logged online: > > Bug reference: 5234 > Logged by: Sergey Burladyan > Email address: eshkinkot@gmail.com > PostgreSQL version: 8.3.8 > Operating system: Debian GNU/Linux 5.0.3 (lenny) + testing > Description: ALTER TABLE ... RENAME COLUMN change view definition > incorrectly > Details: > > reported by Weed at http://www.sql.ru/forum/actualthread.aspx?tid=717835 > (Russian) > > Example: > create table a (i int, v text); > create table b (j int, v text); > create view v_using as select * from a left join b using (v); > alter table a rename v to o; > \d v_using > > CREATE TABLE > CREATE TABLE > CREATE VIEW > ALTER TABLE > View "public.v_using" > Column | Type | Modifiers > --------+---------+----------- > v | text | > i | integer | > j | integer | > View definition: > SELECT a.o AS v, a.i, b.j > FROM a > LEFT JOIN b USING (v); > > View is still working, but it text definition is incorrect: > t1=> select * from v_using ; > v | i | j > ---+---+--- > (0 rows) > > t1=> SELECT a.o AS v, a.i, b.j > t1-> FROM a > t1-> LEFT JOIN b USING (v); > ERROR: 42703: column "v" specified in USING clause does not exist in left > table > LOCATION: transformFromClauseItem, parse_clause.c:813 > > If you dump database in this state, when you cannot restore this dump > without manual fix: > $ pg_dump -Fc -f dump t1 > $ pg_restore dump | grep -A2 VIEW > -- Name: v_using; Type: VIEW; Schema: public; Owner: seb > -- > > CREATE VIEW v_using AS > SELECT a.o AS v, a.i, b.j FROM (a LEFT JOIN b USING (v)); > > $ LANG=C sudo -u postgres pg_restore -c -d t1 dump > . . . > pg_restore: [archiver (db)] could not execute query: ERROR: column "v" > specified in USING clause does not exist in left table > Command was: CREATE VIEW v_using AS > SELECT a.o AS v, a.i, b.j FROM (a LEFT JOIN b USING (v)); The problem here seems to be that get_from_clause_item() is a bit naive about the JoinExpr representation. usingClause is stored as a text argument, but get_from_clause_item() is assuming that the schema definitions have not changed since the parse tree was created. isNatural has the same problem. I'm not an expert on this area of the code, but can we just ignore isNatural and usingClause when deparsing? The attached patch fixes the bug for me, although I couldn't swear as to whether it breaks anything else. ...Robert
Attachment
Robert Haas <robertmhaas@gmail.com> writes: > I'm not an expert on this area of the code, but can we just ignore > isNatural and usingClause when deparsing? No. These properties are *not* ignorable because doing so changes the set of returned columns --- you should get only one column out not two. My reading of the spec is that USING (and therefore NATURAL) is defined to join identically named columns. Therefore, renaming one of the input columns as the OP did *should* indeed *must* break the view. The problem is not how to make it work, it's how to give an error message that doesn't look like an internal failure. (This wasn't one of the SQL committee's better ideas IMO, but ours not to reason why.) regards, tom lane
Re: BUG #5234: ALTER TABLE ... RENAME COLUMN change view definition incorrectly
From
Robert Haas
Date:
On Thu, Dec 10, 2009 at 1:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I'm not an expert on this area of the code, but can we just ignore >> isNatural and usingClause when deparsing? > > No. =A0These properties are *not* ignorable because doing so changes the > set of returned columns --- you should get only one column out not two. > > My reading of the spec is that USING (and therefore NATURAL) is defined > to join identically named columns. =A0Therefore, renaming one of the input > columns as the OP did *should* indeed *must* break the view. =A0The probl= em > is not how to make it work, it's how to give an error message that > doesn't look like an internal failure. That seems ugly and unnecessary. I think we might be able to define ourselves out of this problem. We don't guarantee (and have never guaranteed) that selecting from a stored view will produce the same results as re-executing the original query. For example, * refers the list of columns at definition-time, not execution-time, and if a column is renamed, the view still refers to the same column; it doesn't start crashing, nor would we want it to. Similarly, here, the USING is internally converted to an equality join on the two columns, and the ambiguous output column is, I think, resolved in favor of one of them. I think we can just say that that conversion happens in toto at parse-time, just as the *-to-column-list conversion and the column-name-to-column-reference conversions do. This seems like a significantly more useful behavior and as a fringe benefit it simplifies the code. Moreover, it's basically what we're already doing for so long as the view remains safely stored in the database; it just becomes undumpable. If we adopt your solution, backpatching it might break working applications and not backpatching it will leave residual breakage when people try to upgrade. If we just make the dumping code work the same way the executor already does, we don't have that problem. ...Robert
Re: BUG #5234: ALTER TABLE ... RENAME COLUMN change view definition incorrectly
From
Andrew Gierth
Date:
>>>>> "Robert" =3D=3D Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Dec 10, 2009 at 1:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>=20 >> My reading of the spec is that USING (and therefore NATURAL) is >> defined to join identically named columns. =A0Therefore, renaming >> one of the input columns as the OP did *should* indeed *must* >> break the view. =A0The problem is not how to make it work, it's how >> to give an error message that doesn't look like an internal >> failure. Robert> That seems ugly and unnecessary. I think we might be able to Robert> define ourselves out of this problem. We don't guarantee Robert> (and have never guaranteed) that selecting from a stored view Robert> will produce the same results as re-executing the original Robert> query. For example, * refers the list of columns at Robert> definition-time, not execution-time, and if a column is Robert> renamed, the view still refers to the same column; it doesn't Robert> start crashing, nor would we want it to. Similarly, here, Robert> the USING is internally converted to an equality join on the Robert> two columns, and the ambiguous output column is, I think, Robert> resolved in favor of one of them. I think we can just say Robert> that that conversion happens in toto at parse-time, just as Robert> the *-to-column-list conversion and the Robert> column-name-to-column-reference conversions do. This seems Robert> like a significantly more useful behavior and as a fringe Robert> benefit it simplifies the code. There's another possible solution (albeit a somewhat nontrivial one) which came up when a bunch of us were talking about this one on IRC; which is to handle the problem in the view deparse: if a column used in a USING clause has been renamed, add an alias to the query that renames it back, e.g. select ... from table1 as table1(v,a) join ... using (v) This would have to affect all the other references to that same column in the query, so you'd need to do something like this: before deparsing, walk the query looking for offending USING clauses, and make a list of renamings to apply to column names. I haven't tried actually implementing this, but I believe it is possible. --=20 Andrew (irc:RhodiumToad)
Re: BUG #5234: ALTER TABLE ... RENAME COLUMN change view definition incorrectly
From
Robert Haas
Date:
On Thu, Dec 10, 2009 at 10:48 AM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote: >>>>>> "Robert" =3D=3D Robert Haas <robertmhaas@gmail.com> writes: > > =A0> On Thu, Dec 10, 2009 at 1:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > =A0>> > =A0>> My reading of the spec is that USING (and therefore NATURAL) is > =A0>> defined to join identically named columns. =A0Therefore, renaming > =A0>> one of the input columns as the OP did *should* indeed *must* > =A0>> break the view. =A0The problem is not how to make it work, it's how > =A0>> to give an error message that doesn't look like an internal > =A0>> failure. > > =A0Robert> That seems ugly and unnecessary. =A0I think we might be able to > =A0Robert> define ourselves out of this problem. =A0We don't guarantee > =A0Robert> (and have never guaranteed) that selecting from a stored view > =A0Robert> will produce the same results as re-executing the original > =A0Robert> query. =A0For example, * refers the list of columns at > =A0Robert> definition-time, not execution-time, and if a column is > =A0Robert> renamed, the view still refers to the same column; it doesn't > =A0Robert> start crashing, nor would we want it to. =A0Similarly, here, > =A0Robert> the USING is internally converted to an equality join on the > =A0Robert> two columns, and the ambiguous output column is, I think, > =A0Robert> resolved in favor of one of them. =A0I think we can just say > =A0Robert> that that conversion happens in toto at parse-time, just as > =A0Robert> the *-to-column-list conversion and the > =A0Robert> column-name-to-column-reference conversions do. =A0This seems > =A0Robert> like a significantly more useful behavior and as a fringe > =A0Robert> benefit it simplifies the code. > > There's another possible solution (albeit a somewhat nontrivial one) > which came up when a bunch of us were talking about this one on IRC; > which is to handle the problem in the view deparse: if a column used > in a USING clause has been renamed, add an alias to the query that > renames it back, e.g. > =A0select ... from table1 as table1(v,a) join ... using (v) > > This would have to affect all the other references to that same column > in the query, so you'd need to do something like this: before deparsing, > walk the query looking for offending USING clauses, and make a list of > renamings to apply to column names. > > I haven't tried actually implementing this, but I believe it is > possible. What advantage does this offer? ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Dec 10, 2009 at 1:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> My reading of the spec is that USING (and therefore NATURAL) is defined >> to join identically named columns. Therefore, renaming one of the input >> columns as the OP did *should* indeed *must* break the view. The problem >> is not how to make it work, it's how to give an error message that >> doesn't look like an internal failure. > That seems ugly and unnecessary. I think we might be able to define > ourselves out of this problem. We don't guarantee (and have never > guaranteed) that selecting from a stored view will produce the same > results as re-executing the original query. For example, * refers the > list of columns at definition-time, not execution-time, Um, aren't you contradicting yourself there? The problem with USING is that it is not merely a join condition but affects the set of columns emitted by the join. It can't be converted to a simple ON without changing the semantics, and I don't believe we should try. regards, tom lane
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > There's another possible solution (albeit a somewhat nontrivial one) > which came up when a bunch of us were talking about this one on IRC; > which is to handle the problem in the view deparse: if a column used > in a USING clause has been renamed, add an alias to the query that > renames it back, e.g. > select ... from table1 as table1(v,a) join ... using (v) Hmm. Cute, but I wonder why we shouldn't just be throwing an error. As I said last night, the only thing I see wrong with the current behavior is that the error message isn't phrased in a way that makes it obvious it's the user's fault. regards, tom lane
Re: BUG #5234: ALTER TABLE ... RENAME COLUMN change view definition incorrectly
From
Robert Haas
Date:
On Thu, Dec 10, 2009 at 11:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andrew Gierth <andrew@tao11.riddles.org.uk> writes: >> There's another possible solution (albeit a somewhat nontrivial one) >> which came up when a bunch of us were talking about this one on IRC; >> which is to handle the problem in the view deparse: if a column used >> in a USING clause has been renamed, add an alias to the query that >> renames it back, e.g. >> =A0 select ... from table1 as table1(v,a) join ... using (v) > > Hmm. =A0Cute, but I wonder why we shouldn't just be throwing an error. > As I said last night, the only thing I see wrong with the current > behavior is that the error message isn't phrased in a way that makes > it obvious it's the user's fault. I think the problem we're trying to solve is that the view works but it isn't dumpable. ...Robert
Re: BUG #5234: ALTER TABLE ... RENAME COLUMN change view definition incorrectly
From
Robert Haas
Date:
On Thu, Dec 10, 2009 at 11:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Thu, Dec 10, 2009 at 1:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> My reading of the spec is that USING (and therefore NATURAL) is defined >>> to join identically named columns. =A0Therefore, renaming one of the in= put >>> columns as the OP did *should* indeed *must* break the view. =A0The pro= blem >>> is not how to make it work, it's how to give an error message that >>> doesn't look like an internal failure. > >> That seems ugly and unnecessary. =A0I think we might be able to define >> ourselves out of this problem. =A0We don't guarantee (and have never >> guaranteed) that selecting from a stored view will produce the same >> results as re-executing the original query. =A0For example, * refers the >> list of columns at definition-time, not execution-time, > > Um, aren't you contradicting yourself there? Not that I can see. > The problem with USING is that it is not merely a join condition but > affects the set of columns emitted by the join. =A0It can't be converted > to a simple ON without changing the semantics, and I don't believe we > should try. The OP seemed to be pretty clear on what the semantics should be, and I'm not confused either. Why are you? :-) ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Dec 10, 2009 at 11:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The problem with USING is that it is not merely a join condition but >> affects the set of columns emitted by the join. It can't be converted >> to a simple ON without changing the semantics, and I don't believe we >> should try. > The OP seemed to be pretty clear on what the semantics should be, and > I'm not confused either. Why are you? :-) I'm not confused: it's an error condition. You have not explained why it isn't. What would actually be ideal is to throw an error on the attempted column rename, but I doubt the problem is worth the additions to system infrastructure that would be necessary to make that happen. regards, tom lane
Re: BUG #5234: ALTER TABLE ... RENAME COLUMN change view definition incorrectly
From
Alvaro Herrera
Date:
Robert Haas escribió: > On Thu, Dec 10, 2009 at 11:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Hmm. Cute, but I wonder why we shouldn't just be throwing an error. > > As I said last night, the only thing I see wrong with the current > > behavior is that the error message isn't phrased in a way that makes > > it obvious it's the user's fault. > > I think the problem we're trying to solve is that the view works but > it isn't dumpable. So it would also be a solution that the ALTER TABLE that renamed the column throwed an error if there's a view that requires the name. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: BUG #5234: ALTER TABLE ... RENAME COLUMN change view definition incorrectly
From
Robert Haas
Date:
On Thu, Dec 10, 2009 at 1:02 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Robert Haas escribi=F3: >> On Thu, Dec 10, 2009 at 11:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> > Hmm. =A0Cute, but I wonder why we shouldn't just be throwing an error. >> > As I said last night, the only thing I see wrong with the current >> > behavior is that the error message isn't phrased in a way that makes >> > it obvious it's the user's fault. >> >> I think the problem we're trying to solve is that the view works but >> it isn't dumpable. > > So it would also be a solution that the ALTER TABLE that renamed the > column throwed an error if there's a view that requires the name. Yes. I don't think it's as good of a solution, but it is a solution. ...Robert
Re: BUG #5234: ALTER TABLE ... RENAME COLUMN change view definition incorrectly
From
Robert Haas
Date:
On Thu, Dec 10, 2009 at 12:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Thu, Dec 10, 2009 at 11:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> The problem with USING is that it is not merely a join condition but >>> affects the set of columns emitted by the join. =A0It can't be converted >>> to a simple ON without changing the semantics, and I don't believe we >>> should try. > >> The OP seemed to be pretty clear on what the semantics should be, and >> I'm not confused either. =A0Why are you? =A0:-) > > I'm not confused: it's an error condition. =A0You have not explained why > it isn't. Well, it's obviously not an error condition at present, because the view works perfectly well. As far as running SELECT statements against the view, the QUERY structure in ev_rewrite is semantically equivalent to one without the usingClause. buildMergedJoinVar() builds a target list entry that refers to one or both of the underlying base tables, using COALESCE if necessary. Nothing in that join var depends IN ANY WAY on the usingClause. The JoinExpr similarly gets rewritten in terms of an equality construct - while it's true that the usingClause is still present, nothing outside the parser/deparser actually looks at it. For that reason, your claim that we CAN'T rewrite USING to a simple ON without changing the semantics seems to me to be false. For purposes of 99.44% of the code, we are already doing it. The only problem is, after we do it, and after using that rewritten version to execute the query, we maintain the fiction that we haven't done it when we dump the view back out, leading to inconsistent results. ...Robert