Home > mailing lists

Re: Rearranging ALTER TABLE to avoid multi-operations bugs - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Rearranging ALTER TABLE to avoid multi-operations bugs
Date	January 21, 2020 00:46:52
Msg-id	6758.1579567612@sss.pgh.pa.us Whole thread Raw
In response to	Re: Rearranging ALTER TABLE to avoid multi-operations bugs (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

Tree view

I wrote:
> The squirrely-ness around identity is that while this now works:

> regression=# CREATE TABLE itest8 (f1 int);
> CREATE TABLE
> regression=# ALTER TABLE itest8
> regression-#   ADD COLUMN f2 int NOT NULL,
> regression-#   ALTER COLUMN f2 ADD GENERATED ALWAYS AS IDENTITY;
> ALTER TABLE

> it doesn't work if there's rows in the table:

> regression=# CREATE TABLE itest8 (f1 int);
> CREATE TABLE
> regression=# insert into itest8 default values;
> INSERT 0 1
> regression=# ALTER TABLE itest8
>   ADD COLUMN f2 int NOT NULL,
>   ALTER COLUMN f2 ADD GENERATED ALWAYS AS IDENTITY;
> ERROR:  column "f2" contains null values

> The same would be true if you tried to do the ALTER as two separate
> operations (because the ADD ... NOT NULL, without a default, will
> naturally fail on a nonempty table).  So I don't feel *too* awful
> about that.  But it'd be better if this worked.

After further poking at that, I've concluded that maybe this is not
a bug but operating as designed.  Adding the GENERATED property in a
separate step is arguably equivalent to setting a plain default in
a separate step, and look at how we handle that:

regression=# create table t1(x int);
CREATE TABLE
regression=# insert into t1 values(1);
INSERT 0 1
regression=# alter table t1 add column y int default 11,
  alter column y set default 12;
ALTER TABLE
regression=# table t1;
 x | y  
---+----
 1 | 11
(1 row)

This is documented, rather opaquely perhaps, for the SET DEFAULT
case:

SET/DROP DEFAULT

    These forms set or remove the default value for a column. Default
    values only apply in subsequent INSERT or UPDATE commands; they do not
    cause rows already in the table to change.

So the design principle here seems to be that we fill the column
using whatever is specified *in the ADD COLUMN subcommand*, and
any screwing-about in other subcommands just affects what the
behavior will be in subsequent INSERT commands.  That's a little
weird but it has potential use-cases.  If we attempt to apply the
"new" default immediately then this syntax devolves to having the
same effects as a simple ADD-COLUMN-with-default.  There's not
a lot of reason to write the longer form if that's what you wanted.

So I'm now inclined to think that the code is all right.  We could
improve the documentation, perhaps, with an explicit example.
Also, the man page's entry for SET GENERATED says nothing of this,
but it likely ought to say the same thing as SET DEFAULT.

Also, we don't really have any test cases proving it works that way.
Simple tests, such as the one above, are not too trustworthy because
the attmissingval optimization tends to hide what's really happening.
(I found this out the hard way while messing with a patch to change
the behavior --- which I now think we shouldn't do, anyhow.)

            regards, tom lane

pgsql-hackers by date:

From: Thomas Munro
Date: 21 January 2020, 00:45:12
Subject: Reducing WaitEventSet syscall churn

From: Tom Lane
Date: 21 January 2020, 00:51:16
Subject: Re: libxml2 is dropping xml2-config

Re: Rearranging ALTER TABLE to avoid multi-operations bugs - Mailing list pgsql-hackers

Previous

Next