Thread: Re: [COMMITTERS] pgsql: Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.

Robert Haas <rhaas@postgresql.org> writes:
> Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.
> The previous coding prevented ALTER TABLE .. ADD COLUMN from being used
> with a non-NULL default in situations where the table's rowtype was being
> used elsewhere.  But this is a completely arbitrary restriction since
> you could do the same operation in multiple steps (add the column, add
> the default, update the table).

This is not an "arbitrary restriction" because according to the SQL
standard those operations mean different things.  In the first case you
get a column filled with the default value, in the second case you get a
column filled with nulls.  And the latter case is the only one that
works properly with a rowtype.

Kindly revert this patch.
        regards, tom lane


On Wed, Jan 26, 2011 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <rhaas@postgresql.org> writes:
>> Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.
>> The previous coding prevented ALTER TABLE .. ADD COLUMN from being used
>> with a non-NULL default in situations where the table's rowtype was being
>> used elsewhere.  But this is a completely arbitrary restriction since
>> you could do the same operation in multiple steps (add the column, add
>> the default, update the table).
>
> This is not an "arbitrary restriction" because according to the SQL
> standard those operations mean different things.  In the first case you
> get a column filled with the default value, in the second case you get a
> column filled with nulls.  And the latter case is the only one that
> works properly with a rowtype.

That's an untenable interpretation.  If you set a default on a table
column, it has no effect on the row type.

rhaas=# create table foo (a int default 1);
CREATE TABLE
rhaas=# create table bar (b foo);
CREATE TABLE
rhaas=# insert into bar values (default);
INSERT 0 1
rhaas=# select * from bar;b
---

(1 row)

I can't see any reason why ALTER TABLE .. ADD COLUMN should have a
different interpretation of what the rowtype means than the rest of
the system.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jan 26, 2011 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This is not an "arbitrary restriction" because according to the SQL
>> standard those operations mean different things. �In the first case you
>> get a column filled with the default value, in the second case you get a
>> column filled with nulls. �And the latter case is the only one that
>> works properly with a rowtype.

> That's an untenable interpretation.

No, *your* interpretation is untenable.  The sequence of operations that
the previous coding allowed behaves the same for both the table and
rowtype instances.  The "shortcut" doesn't behave the same.

This was, I believe, discussed at length when the previous coding was
put in.  The fact that you and Noah haven't read the spec carefully
doesn't give you license to change it.
        regards, tom lane


On Wed, Jan 26, 2011 at 11:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Jan 26, 2011 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> This is not an "arbitrary restriction" because according to the SQL
>>> standard those operations mean different things.  In the first case you
>>> get a column filled with the default value, in the second case you get a
>>> column filled with nulls.  And the latter case is the only one that
>>> works properly with a rowtype.
>
>> That's an untenable interpretation.
>
> No, *your* interpretation is untenable.  The sequence of operations that
> the previous coding allowed behaves the same for both the table and
> rowtype instances.  The "shortcut" doesn't behave the same.

By that logic, the following sequence out to be disallowed also:

rhaas=# create table foo (a int default 1);
CREATE TABLE
rhaas=# create table bar (b foo);
CREATE TABLE
rhaas=# insert into bar values (default);
INSERT 0 1
rhaas=# alter table foo alter column a set not null;
ALTER TABLE

> This was, I believe, discussed at length when the previous coding was
> put in.  The fact that you and Noah haven't read the spec carefully
> doesn't give you license to change it.

It's certainly not obvious from the archives from around 2004-06-06
that this was discussed.  Perhaps you could be a bit more specific.
As for the spec, if it requires composite types to have defaults (or
constraints), then we're in violation of that all over the place
anyway.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Robert Haas <robertmhaas@gmail.com> writes:
> It's certainly not obvious from the archives from around 2004-06-06
> that this was discussed.  Perhaps you could be a bit more specific.
> As for the spec, if it requires composite types to have defaults (or
> constraints), then we're in violation of that all over the place
> anyway.

Here's the point: the spec requires that ADD COLUMN with a default cause
the column to spring into existence with the default value inserted in
all existing rows, like this:

regression=# create table foo (f1 int);
CREATE TABLE
regression=# insert into foo values (1),(2);
INSERT 0 2
regression=# alter table foo add column f2 text default 'hello';
ALTER TABLE
regression=# select * from foo;f1 |  f2   
----+------- 1 | hello 2 | hello
(2 rows)

This is entirely different from what happens when you set the default
afterwards:

regression=# create table foo (f1 int);
CREATE TABLE
regression=# insert into foo values (1),(2);
INSERT 0 2
regression=# alter table foo add column f2 text ;
ALTER TABLE
regression=# select * from foo;f1 | f2 
----+---- 1 |  2 | 
(2 rows)

regression=# alter table foo alter column f2 set default 'hello';
ALTER TABLE
regression=# select * from foo;f1 | f2 
----+---- 1 |  2 | 
(2 rows)

In this case the column springs into existence as nulls, and the
subsequent change of default doesn't change that.

We are currently only capable of supporting the second behavior so far
as instances of the rowtype outside the table itself are concerned.
Eventually we should try to fix that --- and by "fix", I mean support
the spec-required behavior, not implement whatever happens to be easy.
The reason for rejecting the syntax with default is to avoid
establishing a non-spec-compliant precedent that we'd then have to
worry about being backward compatible with.

What your patch does is accept the syntax with ensuing
non-spec-compliant behavior.  That's not a step forward.  If it added
any really useful functionality, then maybe there would be an excuse for
violating the spec here --- but it doesn't.  You can just add the column
without default and change the default afterwards, and get to the same
place without using any non-spec-compliant operations.

And yes, I know that we're not doing all that well with honoring
defaults (or constraints) for rowtypes.  But that's something to be
fixed.  Enlarging our non-compliance with the spec to gain no useful
functionality isn't an improvement.
        regards, tom lane


On Wed, Jan 26, 2011 at 11:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> It's certainly not obvious from the archives from around 2004-06-06
>> that this was discussed.  Perhaps you could be a bit more specific.
>> As for the spec, if it requires composite types to have defaults (or
>> constraints), then we're in violation of that all over the place
>> anyway.
>
> Here's the point: the spec requires that ADD COLUMN with a default cause
> the column to spring into existence with the default value inserted in
> all existing rows, like this:
>
> regression=# create table foo (f1 int);
> CREATE TABLE
> regression=# insert into foo values (1),(2);
> INSERT 0 2
> regression=# alter table foo add column f2 text default 'hello';
> ALTER TABLE
> regression=# select * from foo;
>  f1 |  f2
> ----+-------
>  1 | hello
>  2 | hello
> (2 rows)
>
> This is entirely different from what happens when you set the default
> afterwards:
>
> regression=# create table foo (f1 int);
> CREATE TABLE
> regression=# insert into foo values (1),(2);
> INSERT 0 2
> regression=# alter table foo add column f2 text ;
> ALTER TABLE
> regression=# select * from foo;
>  f1 | f2
> ----+----
>  1 |
>  2 |
> (2 rows)
>
> regression=# alter table foo alter column f2 set default 'hello';
> ALTER TABLE
> regression=# select * from foo;
>  f1 | f2
> ----+----
>  1 |
>  2 |
> (2 rows)
>
> In this case the column springs into existence as nulls, and the
> subsequent change of default doesn't change that.

I completely understand all of that, and I'm not debating any of it.

> What your patch does is accept the syntax with ensuing
> non-spec-compliant behavior.  That's not a step forward.  If it added
> any really useful functionality, then maybe there would be an excuse for
> violating the spec here --- but it doesn't.  You can just add the column
> without default and change the default afterwards, and get to the same
> place without using any non-spec-compliant operations.
>
> And yes, I know that we're not doing all that well with honoring
> defaults (or constraints) for rowtypes.  But that's something to be
> fixed.  Enlarging our non-compliance with the spec to gain no useful
> functionality isn't an improvement.

Saying "we're not doing all that well" with it is an understatement.
We're not doing it at all, except apparently for this one place.  Your
claim that this is an intentional prohibition is without a shred of
corroborating evidence.  To believe this is really a problem, you'd
have to believe that someone will someday want to make default values
- but not constraints - apply to row types (because, of course, if no
one ever makes default values apply to row types, then this isn't any
more of a problem than it is today, and if they make constraints -
even NOT NULL constraints - apply to row types, then they'll need
logic to recurse through to tables that use the row type anyway).
Considering that this code has been this way for almost seven years
and that you've not so far offered so much as a shred of evidence in
the code, the documentation, the commit logs, or the mailing list
archives that anyone ever put this arbitrary wart in there for that
reason, or any reason other than inadvertency, I find that altogether
insufficient reason to revert the patch.

I am also pretty dubious that the spec requires this behavior, even if
we did support defaults on row types.  Please produce the evidence.
In essence, what you're arguing is that the spec requires us to treat
every instance of a row-type as if it were part of the table that gave
rise to that row-type.  Is "UPDATE foo SET a = 1" required to update
every copy of that rowtype everywhere in the system?  I bet it isn't.
I think you're conflating the table with its row type, and I'd like to
see some prior writing indicating otherwise.

For those following along at home who may wish to express an opinion,
perhaps a brief review of the behavior change we're arguing about will
be helpful.  Prior to this patch, if foo was used as a type in some
other table, this would work:

ALTER TABLE foo ADD COLUMN bar integer;

And this would work:

ALTER TABLE foo ADD COLUMN bar integer DEFAULT null;

But this would fail:

ALTER TABLE foo ADD COLUMN bar integer DEFAULT 5;

...and specifically, it would give you this error message:

cannot alter table "%s" because column "%s"."%s" uses its rowtype

Now, at the very least, that error message sucks, because clearly you
*could* alter that table; you could even add that specific column, and
you could subsequently set a default on it.  You just couldn't do both
at the same time.  With this patch, the operation succeeds: the rows
in the table are updated with the new default, but instances of the
row type in other tables are not updated, so they effectively have a
NULL in that column.  That doesn't seem like a problem to me, because
INSERT statements also ignore the column defaults, so the behavior
you're getting from ALTER TABLE / ADD COLUMN is consistent with what
INSERT does, but Tom disagrees.  Anyone else want to weigh in?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Robert Haas <robertmhaas@gmail.com> writes:
> Saying "we're not doing all that well" with it is an understatement.
> We're not doing it at all, except apparently for this one place.

Yeah, it's mostly broken.  That doesn't mean we should introduce more
breakage in order to gain no meaningful functionality.

> Your
> claim that this is an intentional prohibition is without a shred of
> corroborating evidence.

I put that code in --- see commit bb3da43e3bd2a74c5caf8eca7f47ff4198f479ec
--- and I know damn well that it was an intentional prohibition based on
exactly this reasoning.  Please do not argue Postgres development
history with me.

> To believe this is really a problem, you'd
> have to believe that someone will someday want to make default values
> - but not constraints - apply to row types (because, of course, if no
> one ever makes default values apply to row types, then this isn't any
> more of a problem than it is today, and if they make constraints -
> even NOT NULL constraints - apply to row types, then they'll need
> logic to recurse through to tables that use the row type anyway).

I don't see how you come to the conclusion that constraints shouldn't
apply too.  But in any case, the reason for wanting this to happen is
that the SQL spec says so.  It looks like a lot of work, which is why
nobody's tackled it yet, but that doesn't mean it never will happen.

As for the logic to recurse through row types, we have half of it
already (see above commit).  What we don't have is logic to do something
about it once we've found a column that would need changing.

> I am also pretty dubious that the spec requires this behavior, even if
> we did support defaults on row types.  Please produce the evidence.

Please produce the evidence that it doesn't.  The spec is perfectly
clear on what ALTER ADD COLUMN with DEFAULT does to tables, and it also
supports default values attached to the columns of structured types,
and it says (eg in SQL:2008 11.45 <add attribute definition>) that
adding an attribute to a user-defined type results in the same state as
if the attribute had been there all along.  One would think that that
means that the attribute appears with its default value, since you
necessarily have not done anything to modify that value in any
existing instance of the composite type.  Note that essentially the same
wording is used in ALTER ADD COLUMN for tables, SQL:2008 11.11 <add
column definition> general rule 4.

> In essence, what you're arguing is that the spec requires us to treat
> every instance of a row-type as if it were part of the table that gave
> rise to that row-type.

Not at all.  The point is that defaults *are* part of structured types
according to the spec, so they should apply to instances of those
structured types.  We fail to handle that at the moment but the
expectation has always been that we'd get it done someday.

> I think you're conflating the table with its row type, and I'd like to
> see some prior writing indicating otherwise.

I will agree that a language lawyer could argue that a table rowtype
doesn't have to act like a separately-declared composite type, but
that surely doesn't meet the POLA.  I think that an ALTER TABLE on a
table ought to have the implied effects of a spec-compliant ALTER TYPE
on the associated composite type.

> cannot alter table "%s" because column "%s"."%s" uses its rowtype
> Now, at the very least, that error message sucks,

I'll agree with that, but the answer to that is to improve the error
message, not to introduce non-spec-compliant behavior that we will
someday regret.
        regards, tom lane


Re: [COMMITTERS] pgsql: Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.

From
Alvaro Herrera
Date:
Excerpts from Robert Haas's message of mié ene 26 14:43:08 -0300 2011:

> For those following along at home who may wish to express an opinion,
> perhaps a brief review of the behavior change we're arguing about will
> be helpful.  Prior to this patch, if foo was used as a type in some
> other table, this would work:
> 
> ALTER TABLE foo ADD COLUMN bar integer;
> 
> And this would work:
> 
> ALTER TABLE foo ADD COLUMN bar integer DEFAULT null;
> 
> But this would fail:
> 
> ALTER TABLE foo ADD COLUMN bar integer DEFAULT 5;
> 
> ...and specifically, it would give you this error message:
> 
> cannot alter table "%s" because column "%s"."%s" uses its rowtype
> 
> Now, at the very least, that error message sucks, because clearly you
> *could* alter that table; you could even add that specific column, and
> you could subsequently set a default on it.  You just couldn't do both
> at the same time.  With this patch, the operation succeeds: the rows
> in the table are updated with the new default, but instances of the
> row type in other tables are not updated, so they effectively have a
> NULL in that column.

If you really want to do what you seem to want (i.e. add a column with a
default and not have it alter existing rows), you can already do it like
this:

ALTER TABLE foo ADD COLUMN bar INTEGER, ALTER COLUMN bar SET DEFAULT 5;

If there's an intention to improve ALTER TABLE so that it propagates the
new default to existing tuples in other tables, I have no problem with
it throwing an error now.  Perhaps suggest the above syntax in a hint or
something.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [COMMITTERS] pgsql: Remove arbitrary ALTER TABLE .. ADD COLUMN restriction.

From
"Kevin Grittner"
Date:
Alvaro Herrera <alvherre@commandprompt.com> wrote:
> If you really want to do what you seem to want (i.e. add a column
> with a default and not have it alter existing rows), you can
> already do it like this:
> 
> ALTER TABLE foo ADD COLUMN bar INTEGER, ALTER COLUMN bar SET
> DEFAULT 5;
> 
> If there's an intention to improve ALTER TABLE so that it
> propagates the new default to existing tuples in other tables, I
> have no problem with it throwing an error now.  Perhaps suggest
> the above syntax in a hint or something.
+1
I haven't reviewed the standard in this regard (always a painful
experience), but I believe Tom.  The behavior suggested by Robert
would surprise *me*, at least.
-Kevin


On Wed, Jan 26, 2011 at 1:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think you're conflating the table with its row type, and I'd like to
>> see some prior writing indicating otherwise.
>
> I will agree that a language lawyer could argue that a table rowtype
> doesn't have to act like a separately-declared composite type, but
> that surely doesn't meet the POLA.

Well, actually, what I thought was that the rowtype *should* act
exactly like a separately-declared composite rowtype.  Which is to
say, it shouldn't have a default, because a separately-declared
composite rowtype *can't have a default*.  If that's not the consensus
position, so be it, but it made sense to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jan 26, 2011 at 1:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I will agree that a language lawyer could argue that a table rowtype
>> doesn't have to act like a separately-declared composite type, but
>> that surely doesn't meet the POLA.

> Well, actually, what I thought was that the rowtype *should* act
> exactly like a separately-declared composite rowtype.  Which is to
> say, it shouldn't have a default, because a separately-declared
> composite rowtype *can't have a default*.  If that's not the consensus
> position, so be it, but it made sense to me.

The fact that a separately-declared composite type can't have defaults
is an implementation restriction.  It is a feature required by SQL spec,
so we ought to plan on doing it someday, IMO.

It's conceivable that once we have that implemented, we will decide that
table rowtypes should act differently from separately-declared composite
types to the extent of not honoring defaults inherited from their table.
That would be a terrible violation of POLA in my opinion, but we might
have to do it for backwards compatibility's sake.

What I *don't* want to do is introduce another not-per-spec behavior
that we will have to make such hard choices about when the time comes,
when we aren't getting any meaningful functionality increment out of
allowing the not-per-spec behavior.  And that's exactly the situation
with this ALTER case.
        regards, tom lane


On Wed, Jan 26, 2011 at 3:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Well, actually, what I thought was that the rowtype *should* act
>> exactly like a separately-declared composite rowtype.  Which is to
>> say, it shouldn't have a default, because a separately-declared
>> composite rowtype *can't have a default*.  If that's not the consensus
>> position, so be it, but it made sense to me.
>
> The fact that a separately-declared composite type can't have defaults
> is an implementation restriction.  It is a feature required by SQL spec,
> so we ought to plan on doing it someday, IMO.

OK.  Well, I think we need to document that somewhere, then, at least
in a comment; maybe in the documentation.

> It's conceivable that once we have that implemented, we will decide that
> table rowtypes should act differently from separately-declared composite
> types to the extent of not honoring defaults inherited from their table.
> That would be a terrible violation of POLA in my opinion, but we might
> have to do it for backwards compatibility's sake.

No, I wouldn't support that at all.  I was simply assuming that there
was no intention for composite types to ever support defaults or
constraints, and like I say, we don't seem to worry about it anywhere
else (INSERT, SET NOT NULL, etc).  I maintain it's pretty inconsistent
to do it only in this case, regardless of whether it's technically a
standards-compliance regression.  The patch may make us less
consistent with the SQL spec, and it sounds like there are a couple
other votes for not doing it on that basis, but it makes us more
consistent with ourselves.  If we ever support defaults and
constraints on composite types generally, then the behavior you're
imagining would seems like it would be the right thing to do.
Considering the number of OTHER places we'd have to break backward
compatibility, one more wouldn't bother me any, but apparently that's
just me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Robert Haas <robertmhaas@gmail.com> writes:
> Considering the number of OTHER places we'd have to break backward
> compatibility, one more wouldn't bother me any, but apparently that's
> just me.

Well, again, it'd be all right with me if we were going to get any
meaningful increment in functionality out of it, but we aren't.  If you
add the column and the default in separate steps, you get the same
result and all the behavior is spec-compliant.

There's some history here that you may not be familiar with --- we used
to have exactly this bug in regards to the mainline ALTER TABLE case.
Observe the results in PG 7.1:

regression=# create table foo(f1 int);
CREATE
regression=# insert into foo values(1);
INSERT 3151259 1
regression=# insert into foo values(2);
INSERT 3151260 1
regression=# alter table foo add column f2 int default 2;
ALTER
regression=# select * from foo;f1 | f2 
----+---- 1 |    2 |   
(2 rows)

In 7.2 through 7.4 the ALTER failed instead:
regression=# alter table foo add column f2 int default 2;
ERROR:  Adding columns with defaults is not implemented.       Add the column, then use ALTER TABLE SET DEFAULT.

and by 8.0 we'd finally made it work per spec.  But we got lots of flak
meanwhile from people who'd gotten used to the old behavior.  So those
of us who were around then aren't eager to repeat that.  The code you're
complaining about now was put in only a month after we got that problem
fixed, so the issue was plenty fresh in mind at the time.
        regards, tom lane


On Wed, Jan 26, 2011 at 4:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Considering the number of OTHER places we'd have to break backward
>> compatibility, one more wouldn't bother me any, but apparently that's
>> just me.
>
> Well, again, it'd be all right with me if we were going to get any
> meaningful increment in functionality out of it, but we aren't.  If you
> add the column and the default in separate steps, you get the same
> result and all the behavior is spec-compliant.
>
> There's some history here that you may not be familiar with --- we used
> to have exactly this bug in regards to the mainline ALTER TABLE case.
> Observe the results in PG 7.1:
>
> regression=# create table foo(f1 int);
> CREATE
> regression=# insert into foo values(1);
> INSERT 3151259 1
> regression=# insert into foo values(2);
> INSERT 3151260 1
> regression=# alter table foo add column f2 int default 2;
> ALTER
> regression=# select * from foo;
>  f1 | f2
> ----+----
>  1 |
>  2 |
> (2 rows)
>
> In 7.2 through 7.4 the ALTER failed instead:
> regression=# alter table foo add column f2 int default 2;
> ERROR:  Adding columns with defaults is not implemented.
>        Add the column, then use ALTER TABLE SET DEFAULT.
>
> and by 8.0 we'd finally made it work per spec.  But we got lots of flak
> meanwhile from people who'd gotten used to the old behavior.  So those
> of us who were around then aren't eager to repeat that.  The code you're
> complaining about now was put in only a month after we got that problem
> fixed, so the issue was plenty fresh in mind at the time.

Yeah, I wasn't aware of that.  I'll go revert, but I think I'll also
add a big fat comment, because this is entirely non-obvious,
especially because we don't disallow the cases SET NOT NULL and ADD
table_constraint, which have the same darn problem.  Aren't people who
are used to those cases going to be pretty surprised too?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Robert Haas <robertmhaas@gmail.com> writes:
> Yeah, I wasn't aware of that.  I'll go revert, but I think I'll also
> add a big fat comment, because this is entirely non-obvious,

What I think would actually be helpful would be to improve the error
message.  I'm not sure if it's practical to pass down the specific
reason(s) why we need to throw error, but if not, we could consider a
wishy-washy HINT along the lines of "This could be because of X, Y,
or Z".
        regards, tom lane