Thread: IDENTITY/GENERATED columns
Hi, my last patch didn't make it to the -hackers list, here's a newer one. Let me list what this patch does now: - CREATE TABLE/ALTER TABLE ADD syntax support for GENERATED { ALWAYS | BY DEFAULT } AS { ( expr ) | IDENTITY ( seq_opts ) } - catalog indicators of the above properties - INSERT|COPY syntax support for OVERRIDING { SYSTEM | USER } VALUE - INSERT|COPY fails for non-owner when using OVERRIDING SYSTEM VALUE on tables with GENERATED ALWAYS columns - UPDATE fails when using other than DEFAULT literal for GENERATED ALWAYS columns - GRANT {INSERT|UPDATE|ALL} ON table automatically gives UPDATE permission for the supporting sequence (missing: ALTER TABLE ADD should give permission for the new sequence) - ALTER TABLE tab ALTER col { SET seq_opts | RESTART [WITH] N } syntax support to alter the supporting sequence - ALTER TABLE tab RENAME also renames the supporting sequence on both table and column renaming - ALTER TABLE SET/DROP default is disallowed - pg_dump support for exporting the above properties including sequence parameters. Data dump uses OVERRIDING SYSTEM VALUE when the table has GENERATED ALWAYS columns - test cases for some operations - documented With this version, I mostly covered these TODO entries: * Add support for SQL-standard GENERATED/IDENTITY columns (almost done :-) ) * %Disallow changing DEFAULT expression of a SERIAL column? (done) * %Disallow ALTER SEQUENCE changes for SERIAL sequences because pg_dump does not dump the changes (pg_dump dumps the sequence options) * Add DEFAULT .. AS OWNER so permission checks are done as the table owner This would be useful for SERIAL nextval() calls and CHECK constraints. (GRANT TABLE grants UPDATE on the supporting sequence, ALTER TABLE ADD COLUMN will be implemented) * %Have ALTER TABLE RENAME rename SERIAL sequence names (done) * Allow SERIAL sequences to inherit permissions from the base table? (not needed because INSERT is a fastpath, permissions added with GRANT TABLE, and [will be] extended on ALTER TABLE ADD COLUMN) I am considering using ALTER TABLE tab ALTER col TYPE to add (and drop) GENERATED ALWAYS and GENERATED AS IDENTITY properties to existing columns. As it stands now, ALTER TYPE doesn't change these column attributes. Please, review. Best regards, Zoltán Böszörményi
Attachment
Hi, here's the next, changes are at the end. Zoltan Boszormenyi írta: > Hi, > > my last patch didn't make it to the -hackers list, > here's a newer one. Let me list what this patch does now: > > - CREATE TABLE/ALTER TABLE ADD syntax support for > GENERATED { ALWAYS | BY DEFAULT } AS > { ( expr ) | IDENTITY ( seq_opts ) } > - catalog indicators of the above properties > - INSERT|COPY syntax support for OVERRIDING { SYSTEM | USER } VALUE > - INSERT|COPY fails for non-owner when using OVERRIDING SYSTEM VALUE > on tables with GENERATED ALWAYS columns > - UPDATE fails when using other than DEFAULT literal for > GENERATED ALWAYS columns > - GRANT {INSERT|UPDATE|ALL} ON table automatically > gives UPDATE permission for the supporting sequence > (missing: ALTER TABLE ADD should give permission > for the new sequence) > - ALTER TABLE tab ALTER col { SET seq_opts | RESTART [WITH] N } > syntax support to alter the supporting sequence > - ALTER TABLE tab RENAME also renames the supporting sequence > on both table and column renaming > - ALTER TABLE SET/DROP default is disallowed > - pg_dump support for exporting the above properties including > sequence parameters. Data dump uses OVERRIDING SYSTEM VALUE > when the table has GENERATED ALWAYS columns > - test cases for some operations > - documented > > With this version, I mostly covered these TODO entries: > > > * Add support for SQL-standard GENERATED/IDENTITY columns (almost > done :-) ) > * %Disallow changing DEFAULT expression of a SERIAL column? (done) > * %Disallow ALTER SEQUENCE changes for SERIAL sequences because > pg_dump does not dump the changes > (pg_dump dumps the sequence options) > * Add DEFAULT .. AS OWNER so permission checks are done as the table > owner > This would be useful for SERIAL nextval() calls and CHECK > constraints. > (GRANT TABLE grants UPDATE on the supporting sequence, > ALTER TABLE ADD COLUMN will be implemented) > * %Have ALTER TABLE RENAME rename SERIAL sequence names (done) > * Allow SERIAL sequences to inherit permissions from the base table? > (not needed because INSERT is a fastpath, permissions added > with GRANT TABLE, and [will be] extended on ALTER TABLE ADD > COLUMN) > > I am considering using ALTER TABLE tab ALTER col TYPE > to add (and drop) GENERATED ALWAYS and GENERATED AS IDENTITY > properties to existing columns. As it stands now, ALTER TYPE > doesn't change these column attributes. > > Please, review. > > Best regards, > Zoltán Böszörményi Changes from previous version: - ALTER TABLE ADD col type GENERATED AS IDENTITY adds permission over the newly created sequence to those who already have INSERT or UPDATE on the table. - ALTER TABLE SET/DROP DEFAULT also disallowed on GENERATED ALWAYS columns - some codepaths were consolidated to avoid source bloat A remaining larger problem is that GENERATED ALWAYS AS (expr) should allow column references from the same table, etc. just like CHECK. For the sake of generality, DEFAULT must have the same features, too. E.g. evaluating DEFAULT value should be done as a second phase, after evaluating other columns with given constants or from DEFAULT that doesn't depend on other columns. Circular dependencies must be avoided, etc. I talked about implementin setting and dropping GENERATED / IDENTITY properties. I cooked up this syntax: ALTER TABLE tab ALTER col SET GENERATED ALWAYS AS ( expr ) Behaves the same as "SET DEFAULT expr" but sets attforceddef. ALTER TABLE tab ALTER col SET GENERATED ALWAYS AS IDENTITY [( seq_opts )] It should create supporting sequence iff the column isn't already an IDENTITY column. ALTER TABLE tab ALTER col DROP IDENTITY It should be spelled loud instead of DROP DEFAULT. That leaves ALTER TABLE ALTER TYPE alone. Then a short check showed that changing such attributes on a column isn't defined by SQL2003. The nice thing about not implementing the above is that it avoids the debate whether SET GENERATED ALWAYS AS [IDENTITY] should also rewrite the records. My answer in the debate would be that GENERATED ALWAYS AS (expr) should naturally do that but it isn't clear at all for GENERATED ALWAYS AS IDENTITY Closing words: I think it is now ready for acceptance, it does all what I wanted it to do and it conforms to SQL2003. Maybe I introduced some memory leaks but I tried to avoid it by carefully inspecting what other functions do. And some documentation details can also be improved. But I think the code is ready. Please, review. Thanks and best regards, Zoltán Böszörményi
Attachment
Hi, here's the next version. Changes: - Extended documentation - Extending permissions to new sequences ALTER TABLE tab ADD col type GENERATED AS IDENTITY didn't work as advertised, now it seems to. - Test case was also extended. - Previously introduced memory leaks were plugged. Really. Now the only feature missing is the previously discussed GENERATED ALWAYS AS ( expr ) so it can be used like this: CREATE TABLE tab ( c1 double, c2 integer, c3 double GENERATED ALWAYS AS ( col1 + col2), c4 SMALLINT GENERATED ALWAYS AS (CASE WHEN c1 > c2 THEN 1 ELSE NULL END) ); What should the following code produce as a result? INSERT INTO tab (c1, c2, c3, c4) VALUES (1.1, 2, 0, 0); This should insert (1.1, 2, 3.1, NULL) UPDATE tab SET c2 = 1; Only c2 changes, so: (1.1, 1, 3.1, NULL) Or should it change to (1.1, 1, 2.1, 1), e.g. recompute all columns that depend on changed columns? UPDATE tab SET c4 = DEFAULT, c3 = DEFAULT, c2 = 2, c1 = 3.5; Now what? It should be (3.5, 2, 5.5, 1) But based on current UPDATE behaviour, e.g. values gets computed based on previous values, it becomes (3.5, 2, 2.1, 1) That would really need changing the behaviour of UPDATE. Currently, if I do an UPDATE tab SET c1 = 3.5, c2 = 2, c3 = c1 + c2; then c3 gets its value based on the previous content of the record. For the above GENERATED ALWAYS AS (expr) construct to work, UPDATE have to compute the column values in multipass, something like this: constant values are computed; while (is there any non-computed columns) { newly_computed = 0; foreach (column, non-computed-columns) { if (column value depends only on computed columns) { compute it; newly_computed++; } } if (newly_computed == 0) elog(ERROR, "circular dependency"); } This behaviour change would enable something like this: CREATE tab2 (c1 integer, c2 integer, c3 integer); INSERT INTO tab2 (c1,c2,c3) VALUES (1, 2, c1 + c2); Does this described behaviour have any precedent or standard compliance? Best regards, Zoltán Böszörményi
Attachment
This is being done for 8.3, right? --------------------------------------------------------------------------- Zoltan Boszormenyi wrote: > Hi, > > here's the next version. Changes: > - Extended documentation > - Extending permissions to new sequences > ALTER TABLE tab ADD col type GENERATED AS IDENTITY > didn't work as advertised, now it seems to. > - Test case was also extended. > - Previously introduced memory leaks were plugged. Really. > > Now the only feature missing is the previously discussed > GENERATED ALWAYS AS ( expr ) so it can be used like this: > > CREATE TABLE tab ( > c1 double, > c2 integer, > c3 double GENERATED ALWAYS AS ( col1 + col2), > c4 SMALLINT GENERATED ALWAYS AS > (CASE WHEN c1 > c2 THEN 1 ELSE NULL END) > ); > > What should the following code produce as a result? > > INSERT INTO tab (c1, c2, c3, c4) VALUES (1.1, 2, 0, 0); > > This should insert (1.1, 2, 3.1, NULL) > > UPDATE tab SET c2 = 1; > > Only c2 changes, so: (1.1, 1, 3.1, NULL) > Or should it change to (1.1, 1, 2.1, 1), > e.g. recompute all columns that depend on > changed columns? > > UPDATE tab SET c4 = DEFAULT, c3 = DEFAULT, c2 = 2, c1 = 3.5; > > Now what? It should be (3.5, 2, 5.5, 1) > But based on current UPDATE behaviour, > e.g. values gets computed based on previous > values, it becomes (3.5, 2, 2.1, 1) > > That would really need changing the behaviour of UPDATE. > Currently, if I do an > > UPDATE tab SET c1 = 3.5, c2 = 2, c3 = c1 + c2; > > then c3 gets its value based on the previous content > of the record. For the above GENERATED ALWAYS > AS (expr) construct to work, UPDATE have to compute > the column values in multipass, something like this: > > constant values are computed; > while (is there any non-computed columns) > { > newly_computed = 0; > foreach (column, non-computed-columns) > { > if (column value depends only on computed columns) > { > compute it; > newly_computed++; > } > } > if (newly_computed == 0) > elog(ERROR, "circular dependency"); > } > > This behaviour change would enable something like this: > CREATE tab2 (c1 integer, c2 integer, c3 integer); > INSERT INTO tab2 (c1,c2,c3) VALUES (1, 2, c1 + c2); > > Does this described behaviour have any precedent or > standard compliance? > > Best regards, > Zolt?n B?sz?rm?nyi > [ application/x-tar is not supported, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Yes, I am not ready with it. Bruce Momjian írta: > This is being done for 8.3, right? > > --------------------------------------------------------------------------- > > Zoltan Boszormenyi wrote: > >> Hi, >> >> here's the next version. Changes: >> - Extended documentation >> - Extending permissions to new sequences >> ALTER TABLE tab ADD col type GENERATED AS IDENTITY >> didn't work as advertised, now it seems to. >> - Test case was also extended. >> - Previously introduced memory leaks were plugged. Really. >> >> Now the only feature missing is the previously discussed >> GENERATED ALWAYS AS ( expr ) so it can be used like this: >> >> CREATE TABLE tab ( >> c1 double, >> c2 integer, >> c3 double GENERATED ALWAYS AS ( col1 + col2), >> c4 SMALLINT GENERATED ALWAYS AS >> (CASE WHEN c1 > c2 THEN 1 ELSE NULL END) >> ); >> >> What should the following code produce as a result? >> >> INSERT INTO tab (c1, c2, c3, c4) VALUES (1.1, 2, 0, 0); >> >> This should insert (1.1, 2, 3.1, NULL) >> >> UPDATE tab SET c2 = 1; >> >> Only c2 changes, so: (1.1, 1, 3.1, NULL) >> Or should it change to (1.1, 1, 2.1, 1), >> e.g. recompute all columns that depend on >> changed columns? >> >> UPDATE tab SET c4 = DEFAULT, c3 = DEFAULT, c2 = 2, c1 = 3.5; >> >> Now what? It should be (3.5, 2, 5.5, 1) >> But based on current UPDATE behaviour, >> e.g. values gets computed based on previous >> values, it becomes (3.5, 2, 2.1, 1) >> >> That would really need changing the behaviour of UPDATE. >> Currently, if I do an >> >> UPDATE tab SET c1 = 3.5, c2 = 2, c3 = c1 + c2; >> >> then c3 gets its value based on the previous content >> of the record. For the above GENERATED ALWAYS >> AS (expr) construct to work, UPDATE have to compute >> the column values in multipass, something like this: >> >> constant values are computed; >> while (is there any non-computed columns) >> { >> newly_computed = 0; >> foreach (column, non-computed-columns) >> { >> if (column value depends only on computed columns) >> { >> compute it; >> newly_computed++; >> } >> } >> if (newly_computed == 0) >> elog(ERROR, "circular dependency"); >> } >> >> This behaviour change would enable something like this: >> CREATE tab2 (c1 integer, c2 integer, c3 integer); >> INSERT INTO tab2 (c1,c2,c3) VALUES (1, 2, c1 + c2); >> >> Does this described behaviour have any precedent or >> standard compliance? >> >> Best regards, >> Zolt?n B?sz?rm?nyi >> >> > > [ application/x-tar is not supported, skipping... ] > > >> ---------------------------(end of broadcast)--------------------------- >> TIP 5: don't forget to increase your free space map settings >> > >