Thread: Re: [COMMITTERS] pgsql: Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.
Robert Haas <rhaas@postgresql.org> writes: > Remove arbitrary ALTER TABLE .. ADD COLUMN restriction. > The previous coding prevented ALTER TABLE .. ADD COLUMN from being used > with a non-NULL default in situations where the table's rowtype was being > used elsewhere. But this is a completely arbitrary restriction since > you could do the same operation in multiple steps (add the column, add > the default, update the table). This is not an "arbitrary restriction" because according to the SQL standard those operations mean different things. In the first case you get a column filled with the default value, in the second case you get a column filled with nulls. And the latter case is the only one that works properly with a rowtype. Kindly revert this patch. regards, tom lane
Re: [COMMITTERS] pgsql: Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.
From
Robert Haas
Date:
On Wed, Jan 26, 2011 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <rhaas@postgresql.org> writes: >> Remove arbitrary ALTER TABLE .. ADD COLUMN restriction. >> The previous coding prevented ALTER TABLE .. ADD COLUMN from being used >> with a non-NULL default in situations where the table's rowtype was being >> used elsewhere. But this is a completely arbitrary restriction since >> you could do the same operation in multiple steps (add the column, add >> the default, update the table). > > This is not an "arbitrary restriction" because according to the SQL > standard those operations mean different things. In the first case you > get a column filled with the default value, in the second case you get a > column filled with nulls. And the latter case is the only one that > works properly with a rowtype. That's an untenable interpretation. If you set a default on a table column, it has no effect on the row type. rhaas=# create table foo (a int default 1); CREATE TABLE rhaas=# create table bar (b foo); CREATE TABLE rhaas=# insert into bar values (default); INSERT 0 1 rhaas=# select * from bar;b --- (1 row) I can't see any reason why ALTER TABLE .. ADD COLUMN should have a different interpretation of what the rowtype means than the rest of the system. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Jan 26, 2011 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> This is not an "arbitrary restriction" because according to the SQL >> standard those operations mean different things. �In the first case you >> get a column filled with the default value, in the second case you get a >> column filled with nulls. �And the latter case is the only one that >> works properly with a rowtype. > That's an untenable interpretation. No, *your* interpretation is untenable. The sequence of operations that the previous coding allowed behaves the same for both the table and rowtype instances. The "shortcut" doesn't behave the same. This was, I believe, discussed at length when the previous coding was put in. The fact that you and Noah haven't read the spec carefully doesn't give you license to change it. regards, tom lane
Re: [COMMITTERS] pgsql: Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.
From
Robert Haas
Date:
On Wed, Jan 26, 2011 at 11:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Wed, Jan 26, 2011 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> This is not an "arbitrary restriction" because according to the SQL >>> standard those operations mean different things. In the first case you >>> get a column filled with the default value, in the second case you get a >>> column filled with nulls. And the latter case is the only one that >>> works properly with a rowtype. > >> That's an untenable interpretation. > > No, *your* interpretation is untenable. The sequence of operations that > the previous coding allowed behaves the same for both the table and > rowtype instances. The "shortcut" doesn't behave the same. By that logic, the following sequence out to be disallowed also: rhaas=# create table foo (a int default 1); CREATE TABLE rhaas=# create table bar (b foo); CREATE TABLE rhaas=# insert into bar values (default); INSERT 0 1 rhaas=# alter table foo alter column a set not null; ALTER TABLE > This was, I believe, discussed at length when the previous coding was > put in. The fact that you and Noah haven't read the spec carefully > doesn't give you license to change it. It's certainly not obvious from the archives from around 2004-06-06 that this was discussed. Perhaps you could be a bit more specific. As for the spec, if it requires composite types to have defaults (or constraints), then we're in violation of that all over the place anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > It's certainly not obvious from the archives from around 2004-06-06 > that this was discussed. Perhaps you could be a bit more specific. > As for the spec, if it requires composite types to have defaults (or > constraints), then we're in violation of that all over the place > anyway. Here's the point: the spec requires that ADD COLUMN with a default cause the column to spring into existence with the default value inserted in all existing rows, like this: regression=# create table foo (f1 int); CREATE TABLE regression=# insert into foo values (1),(2); INSERT 0 2 regression=# alter table foo add column f2 text default 'hello'; ALTER TABLE regression=# select * from foo;f1 | f2 ----+------- 1 | hello 2 | hello (2 rows) This is entirely different from what happens when you set the default afterwards: regression=# create table foo (f1 int); CREATE TABLE regression=# insert into foo values (1),(2); INSERT 0 2 regression=# alter table foo add column f2 text ; ALTER TABLE regression=# select * from foo;f1 | f2 ----+---- 1 | 2 | (2 rows) regression=# alter table foo alter column f2 set default 'hello'; ALTER TABLE regression=# select * from foo;f1 | f2 ----+---- 1 | 2 | (2 rows) In this case the column springs into existence as nulls, and the subsequent change of default doesn't change that. We are currently only capable of supporting the second behavior so far as instances of the rowtype outside the table itself are concerned. Eventually we should try to fix that --- and by "fix", I mean support the spec-required behavior, not implement whatever happens to be easy. The reason for rejecting the syntax with default is to avoid establishing a non-spec-compliant precedent that we'd then have to worry about being backward compatible with. What your patch does is accept the syntax with ensuing non-spec-compliant behavior. That's not a step forward. If it added any really useful functionality, then maybe there would be an excuse for violating the spec here --- but it doesn't. You can just add the column without default and change the default afterwards, and get to the same place without using any non-spec-compliant operations. And yes, I know that we're not doing all that well with honoring defaults (or constraints) for rowtypes. But that's something to be fixed. Enlarging our non-compliance with the spec to gain no useful functionality isn't an improvement. regards, tom lane
Re: [COMMITTERS] pgsql: Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.
From
Robert Haas
Date:
On Wed, Jan 26, 2011 at 11:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> It's certainly not obvious from the archives from around 2004-06-06 >> that this was discussed. Perhaps you could be a bit more specific. >> As for the spec, if it requires composite types to have defaults (or >> constraints), then we're in violation of that all over the place >> anyway. > > Here's the point: the spec requires that ADD COLUMN with a default cause > the column to spring into existence with the default value inserted in > all existing rows, like this: > > regression=# create table foo (f1 int); > CREATE TABLE > regression=# insert into foo values (1),(2); > INSERT 0 2 > regression=# alter table foo add column f2 text default 'hello'; > ALTER TABLE > regression=# select * from foo; > f1 | f2 > ----+------- > 1 | hello > 2 | hello > (2 rows) > > This is entirely different from what happens when you set the default > afterwards: > > regression=# create table foo (f1 int); > CREATE TABLE > regression=# insert into foo values (1),(2); > INSERT 0 2 > regression=# alter table foo add column f2 text ; > ALTER TABLE > regression=# select * from foo; > f1 | f2 > ----+---- > 1 | > 2 | > (2 rows) > > regression=# alter table foo alter column f2 set default 'hello'; > ALTER TABLE > regression=# select * from foo; > f1 | f2 > ----+---- > 1 | > 2 | > (2 rows) > > In this case the column springs into existence as nulls, and the > subsequent change of default doesn't change that. I completely understand all of that, and I'm not debating any of it. > What your patch does is accept the syntax with ensuing > non-spec-compliant behavior. That's not a step forward. If it added > any really useful functionality, then maybe there would be an excuse for > violating the spec here --- but it doesn't. You can just add the column > without default and change the default afterwards, and get to the same > place without using any non-spec-compliant operations. > > And yes, I know that we're not doing all that well with honoring > defaults (or constraints) for rowtypes. But that's something to be > fixed. Enlarging our non-compliance with the spec to gain no useful > functionality isn't an improvement. Saying "we're not doing all that well" with it is an understatement. We're not doing it at all, except apparently for this one place. Your claim that this is an intentional prohibition is without a shred of corroborating evidence. To believe this is really a problem, you'd have to believe that someone will someday want to make default values - but not constraints - apply to row types (because, of course, if no one ever makes default values apply to row types, then this isn't any more of a problem than it is today, and if they make constraints - even NOT NULL constraints - apply to row types, then they'll need logic to recurse through to tables that use the row type anyway). Considering that this code has been this way for almost seven years and that you've not so far offered so much as a shred of evidence in the code, the documentation, the commit logs, or the mailing list archives that anyone ever put this arbitrary wart in there for that reason, or any reason other than inadvertency, I find that altogether insufficient reason to revert the patch. I am also pretty dubious that the spec requires this behavior, even if we did support defaults on row types. Please produce the evidence. In essence, what you're arguing is that the spec requires us to treat every instance of a row-type as if it were part of the table that gave rise to that row-type. Is "UPDATE foo SET a = 1" required to update every copy of that rowtype everywhere in the system? I bet it isn't. I think you're conflating the table with its row type, and I'd like to see some prior writing indicating otherwise. For those following along at home who may wish to express an opinion, perhaps a brief review of the behavior change we're arguing about will be helpful. Prior to this patch, if foo was used as a type in some other table, this would work: ALTER TABLE foo ADD COLUMN bar integer; And this would work: ALTER TABLE foo ADD COLUMN bar integer DEFAULT null; But this would fail: ALTER TABLE foo ADD COLUMN bar integer DEFAULT 5; ...and specifically, it would give you this error message: cannot alter table "%s" because column "%s"."%s" uses its rowtype Now, at the very least, that error message sucks, because clearly you *could* alter that table; you could even add that specific column, and you could subsequently set a default on it. You just couldn't do both at the same time. With this patch, the operation succeeds: the rows in the table are updated with the new default, but instances of the row type in other tables are not updated, so they effectively have a NULL in that column. That doesn't seem like a problem to me, because INSERT statements also ignore the column defaults, so the behavior you're getting from ALTER TABLE / ADD COLUMN is consistent with what INSERT does, but Tom disagrees. Anyone else want to weigh in? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Saying "we're not doing all that well" with it is an understatement. > We're not doing it at all, except apparently for this one place. Yeah, it's mostly broken. That doesn't mean we should introduce more breakage in order to gain no meaningful functionality. > Your > claim that this is an intentional prohibition is without a shred of > corroborating evidence. I put that code in --- see commit bb3da43e3bd2a74c5caf8eca7f47ff4198f479ec --- and I know damn well that it was an intentional prohibition based on exactly this reasoning. Please do not argue Postgres development history with me. > To believe this is really a problem, you'd > have to believe that someone will someday want to make default values > - but not constraints - apply to row types (because, of course, if no > one ever makes default values apply to row types, then this isn't any > more of a problem than it is today, and if they make constraints - > even NOT NULL constraints - apply to row types, then they'll need > logic to recurse through to tables that use the row type anyway). I don't see how you come to the conclusion that constraints shouldn't apply too. But in any case, the reason for wanting this to happen is that the SQL spec says so. It looks like a lot of work, which is why nobody's tackled it yet, but that doesn't mean it never will happen. As for the logic to recurse through row types, we have half of it already (see above commit). What we don't have is logic to do something about it once we've found a column that would need changing. > I am also pretty dubious that the spec requires this behavior, even if > we did support defaults on row types. Please produce the evidence. Please produce the evidence that it doesn't. The spec is perfectly clear on what ALTER ADD COLUMN with DEFAULT does to tables, and it also supports default values attached to the columns of structured types, and it says (eg in SQL:2008 11.45 <add attribute definition>) that adding an attribute to a user-defined type results in the same state as if the attribute had been there all along. One would think that that means that the attribute appears with its default value, since you necessarily have not done anything to modify that value in any existing instance of the composite type. Note that essentially the same wording is used in ALTER ADD COLUMN for tables, SQL:2008 11.11 <add column definition> general rule 4. > In essence, what you're arguing is that the spec requires us to treat > every instance of a row-type as if it were part of the table that gave > rise to that row-type. Not at all. The point is that defaults *are* part of structured types according to the spec, so they should apply to instances of those structured types. We fail to handle that at the moment but the expectation has always been that we'd get it done someday. > I think you're conflating the table with its row type, and I'd like to > see some prior writing indicating otherwise. I will agree that a language lawyer could argue that a table rowtype doesn't have to act like a separately-declared composite type, but that surely doesn't meet the POLA. I think that an ALTER TABLE on a table ought to have the implied effects of a spec-compliant ALTER TYPE on the associated composite type. > cannot alter table "%s" because column "%s"."%s" uses its rowtype > Now, at the very least, that error message sucks, I'll agree with that, but the answer to that is to improve the error message, not to introduce non-spec-compliant behavior that we will someday regret. regards, tom lane
Re: [COMMITTERS] pgsql: Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.
From
Alvaro Herrera
Date:
Excerpts from Robert Haas's message of mié ene 26 14:43:08 -0300 2011: > For those following along at home who may wish to express an opinion, > perhaps a brief review of the behavior change we're arguing about will > be helpful. Prior to this patch, if foo was used as a type in some > other table, this would work: > > ALTER TABLE foo ADD COLUMN bar integer; > > And this would work: > > ALTER TABLE foo ADD COLUMN bar integer DEFAULT null; > > But this would fail: > > ALTER TABLE foo ADD COLUMN bar integer DEFAULT 5; > > ...and specifically, it would give you this error message: > > cannot alter table "%s" because column "%s"."%s" uses its rowtype > > Now, at the very least, that error message sucks, because clearly you > *could* alter that table; you could even add that specific column, and > you could subsequently set a default on it. You just couldn't do both > at the same time. With this patch, the operation succeeds: the rows > in the table are updated with the new default, but instances of the > row type in other tables are not updated, so they effectively have a > NULL in that column. If you really want to do what you seem to want (i.e. add a column with a default and not have it alter existing rows), you can already do it like this: ALTER TABLE foo ADD COLUMN bar INTEGER, ALTER COLUMN bar SET DEFAULT 5; If there's an intention to improve ALTER TABLE so that it propagates the new default to existing tuples in other tables, I have no problem with it throwing an error now. Perhaps suggest the above syntax in a hint or something. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [COMMITTERS] pgsql: Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.
From
"Kevin Grittner"
Date:
Alvaro Herrera <alvherre@commandprompt.com> wrote: > If you really want to do what you seem to want (i.e. add a column > with a default and not have it alter existing rows), you can > already do it like this: > > ALTER TABLE foo ADD COLUMN bar INTEGER, ALTER COLUMN bar SET > DEFAULT 5; > > If there's an intention to improve ALTER TABLE so that it > propagates the new default to existing tuples in other tables, I > have no problem with it throwing an error now. Perhaps suggest > the above syntax in a hint or something. +1 I haven't reviewed the standard in this regard (always a painful experience), but I believe Tom. The behavior suggested by Robert would surprise *me*, at least. -Kevin
Re: [COMMITTERS] pgsql: Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.
From
Robert Haas
Date:
On Wed, Jan 26, 2011 at 1:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think you're conflating the table with its row type, and I'd like to >> see some prior writing indicating otherwise. > > I will agree that a language lawyer could argue that a table rowtype > doesn't have to act like a separately-declared composite type, but > that surely doesn't meet the POLA. Well, actually, what I thought was that the rowtype *should* act exactly like a separately-declared composite rowtype. Which is to say, it shouldn't have a default, because a separately-declared composite rowtype *can't have a default*. If that's not the consensus position, so be it, but it made sense to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Jan 26, 2011 at 1:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I will agree that a language lawyer could argue that a table rowtype >> doesn't have to act like a separately-declared composite type, but >> that surely doesn't meet the POLA. > Well, actually, what I thought was that the rowtype *should* act > exactly like a separately-declared composite rowtype. Which is to > say, it shouldn't have a default, because a separately-declared > composite rowtype *can't have a default*. If that's not the consensus > position, so be it, but it made sense to me. The fact that a separately-declared composite type can't have defaults is an implementation restriction. It is a feature required by SQL spec, so we ought to plan on doing it someday, IMO. It's conceivable that once we have that implemented, we will decide that table rowtypes should act differently from separately-declared composite types to the extent of not honoring defaults inherited from their table. That would be a terrible violation of POLA in my opinion, but we might have to do it for backwards compatibility's sake. What I *don't* want to do is introduce another not-per-spec behavior that we will have to make such hard choices about when the time comes, when we aren't getting any meaningful functionality increment out of allowing the not-per-spec behavior. And that's exactly the situation with this ALTER case. regards, tom lane
Re: [COMMITTERS] pgsql: Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.
From
Robert Haas
Date:
On Wed, Jan 26, 2011 at 3:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Well, actually, what I thought was that the rowtype *should* act >> exactly like a separately-declared composite rowtype. Which is to >> say, it shouldn't have a default, because a separately-declared >> composite rowtype *can't have a default*. If that's not the consensus >> position, so be it, but it made sense to me. > > The fact that a separately-declared composite type can't have defaults > is an implementation restriction. It is a feature required by SQL spec, > so we ought to plan on doing it someday, IMO. OK. Well, I think we need to document that somewhere, then, at least in a comment; maybe in the documentation. > It's conceivable that once we have that implemented, we will decide that > table rowtypes should act differently from separately-declared composite > types to the extent of not honoring defaults inherited from their table. > That would be a terrible violation of POLA in my opinion, but we might > have to do it for backwards compatibility's sake. No, I wouldn't support that at all. I was simply assuming that there was no intention for composite types to ever support defaults or constraints, and like I say, we don't seem to worry about it anywhere else (INSERT, SET NOT NULL, etc). I maintain it's pretty inconsistent to do it only in this case, regardless of whether it's technically a standards-compliance regression. The patch may make us less consistent with the SQL spec, and it sounds like there are a couple other votes for not doing it on that basis, but it makes us more consistent with ourselves. If we ever support defaults and constraints on composite types generally, then the behavior you're imagining would seems like it would be the right thing to do. Considering the number of OTHER places we'd have to break backward compatibility, one more wouldn't bother me any, but apparently that's just me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Considering the number of OTHER places we'd have to break backward > compatibility, one more wouldn't bother me any, but apparently that's > just me. Well, again, it'd be all right with me if we were going to get any meaningful increment in functionality out of it, but we aren't. If you add the column and the default in separate steps, you get the same result and all the behavior is spec-compliant. There's some history here that you may not be familiar with --- we used to have exactly this bug in regards to the mainline ALTER TABLE case. Observe the results in PG 7.1: regression=# create table foo(f1 int); CREATE regression=# insert into foo values(1); INSERT 3151259 1 regression=# insert into foo values(2); INSERT 3151260 1 regression=# alter table foo add column f2 int default 2; ALTER regression=# select * from foo;f1 | f2 ----+---- 1 | 2 | (2 rows) In 7.2 through 7.4 the ALTER failed instead: regression=# alter table foo add column f2 int default 2; ERROR: Adding columns with defaults is not implemented. Add the column, then use ALTER TABLE SET DEFAULT. and by 8.0 we'd finally made it work per spec. But we got lots of flak meanwhile from people who'd gotten used to the old behavior. So those of us who were around then aren't eager to repeat that. The code you're complaining about now was put in only a month after we got that problem fixed, so the issue was plenty fresh in mind at the time. regards, tom lane
Re: [COMMITTERS] pgsql: Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.
From
Robert Haas
Date:
On Wed, Jan 26, 2011 at 4:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Considering the number of OTHER places we'd have to break backward >> compatibility, one more wouldn't bother me any, but apparently that's >> just me. > > Well, again, it'd be all right with me if we were going to get any > meaningful increment in functionality out of it, but we aren't. If you > add the column and the default in separate steps, you get the same > result and all the behavior is spec-compliant. > > There's some history here that you may not be familiar with --- we used > to have exactly this bug in regards to the mainline ALTER TABLE case. > Observe the results in PG 7.1: > > regression=# create table foo(f1 int); > CREATE > regression=# insert into foo values(1); > INSERT 3151259 1 > regression=# insert into foo values(2); > INSERT 3151260 1 > regression=# alter table foo add column f2 int default 2; > ALTER > regression=# select * from foo; > f1 | f2 > ----+---- > 1 | > 2 | > (2 rows) > > In 7.2 through 7.4 the ALTER failed instead: > regression=# alter table foo add column f2 int default 2; > ERROR: Adding columns with defaults is not implemented. > Add the column, then use ALTER TABLE SET DEFAULT. > > and by 8.0 we'd finally made it work per spec. But we got lots of flak > meanwhile from people who'd gotten used to the old behavior. So those > of us who were around then aren't eager to repeat that. The code you're > complaining about now was put in only a month after we got that problem > fixed, so the issue was plenty fresh in mind at the time. Yeah, I wasn't aware of that. I'll go revert, but I think I'll also add a big fat comment, because this is entirely non-obvious, especially because we don't disallow the cases SET NOT NULL and ADD table_constraint, which have the same darn problem. Aren't people who are used to those cases going to be pretty surprised too? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Yeah, I wasn't aware of that. I'll go revert, but I think I'll also > add a big fat comment, because this is entirely non-obvious, What I think would actually be helpful would be to improve the error message. I'm not sure if it's practical to pass down the specific reason(s) why we need to throw error, but if not, we could consider a wishy-washy HINT along the lines of "This could be because of X, Y, or Z". regards, tom lane