Re: GENERATED ... AS IDENTITY, Was: Re: Feature Freeze - Mailing list pgsql-hackers
From | Rod Taylor |
---|---|
Subject | Re: GENERATED ... AS IDENTITY, Was: Re: Feature Freeze |
Date | |
Msg-id | 1154483175.29712.160.camel@home Whole thread Raw |
In response to | Re: GENERATED ... AS IDENTITY, Was: Re: Feature Freeze (Zoltan Boszormenyi <zboszor@dunaweb.hu>) |
Responses |
Re: GENERATED ... AS IDENTITY, Was: Re: Feature Freeze
|
List | pgsql-hackers |
> > For db restoration (pg_dump), how do you restore to the same values as > > previously if it is always regenerated? By making ALWAYS a suggestion > > for some users instead of always enforced and providing an override > > mechanism for it. I assume it only works for relation owners but I've > > not figured out how the spec does permissions. > > > > <override clause> ::= > > OVERRIDING USER VALUE > > | OVERRIDING SYSTEM VALUE > > > > In short, pg_dump should append OVERRIDING SYSTEM VALUE to any insert or > > copy for relations with an GENERATED ALWAYS identity column and the > > backend will need to respect that. > > > > Aren't INSERT and COPY distinguished in code paths? Yes, they are separate but they also use the same permission set. Any user can copy into a structure at any time and virtually every restriction will be applied normally (CHECK, DEFAULT, etc.). Copy bypasses Rules, significant parsing overhead since there is no need to look for subselects, and possibly some triggers are bypassed. I'm fairly sure that foreign key triggers fire. In short, COPY doesn't bypass enforcement. GENERATED ALWAYS is part of that enforcement and should be included in that. If it is not included, we cannot recommend GENERATED ALWAYS for uses like recording CURRENT_USER in an audit log since the data could be fudged. > > ALWAYS is really only enforced for anyone who doesn't have permission to > > specify otherwise. > > > > > > Another one that got me is what do you do if you do this: > > > > CREATE TABLE tab (col integer); > > INSERT INTO tab VALUES (10); > > ALTER TABLE tab ALTER col GENERATED ALWAYS AS IDENTITY; > > > > What is the value for "tab"."col"? It would seem that the table should > > be rewritten with all values for "col" recalculated -- thus it would be > > '1'. But wait! Can we add the <override clause> here too to keep the old > > values and change the enforcement for new tuples only? > > > > I don't think we should rewrite existing rows because > when it was inserted, the stored value was valid > according to the rules at that time. What if you > have more than one rows in that table? SERIAL has, until recently, been described as a macro. A tool for setting things up quickly but many parts of which can be changed by hand after-ward. It's not exactly a good source for information on how this structure should work. For one, you can easily override the suggested default a serial gives at any time as any user. The intention of ALWAYS is to prevent exactly that behaviour. I don't have an opinion on ALTER TABLE changes for this one way or the other. It was my intention to advise that a group decision is required and some research into what other databases do in this case. I believe MSSQL and DB2 both implement this functionality. Oh, and one more item. These expressions have the same abilities as a CHECK constraint for referencing other columns. This example comes from an IBM Guide: CREATE TABLE T1(c1 INT, c2 DOUBLE, c3 DOUBLE GENERATED ALWAYS AS (c1 + c2), c4 SMALLINT GENERATED ALWAYS AS (CASE WHEN c1 > c2 THEN 1 ELSE NULL END) ); Here is what IBM has to say about ALTER TABLE and GENERATED ALWAYS: GENERATED Specifies that DB2 generates values for the column. ALWAYS Specifies that DB2 will always generate a value for thecolumn when a row is inserted into the table, or whenever the result value of the generation-expression might change. The result of the expression is stored in the table. GENERATED ALWAYS is the recommended option unless data propagation or unload and reload operations are being performed. GENERATED ALWAYS is the required option for generated columns. BY DEFAULT Specifies that DB2 will generate a value for the column when a rowis inserted into the table, or updated, specifying DEFAULTfor the column, unless an explicit value is specified. BY DEFAULT is the recommended option when using data propagation or performing unload and reload operations. identity-options This clause cannotbe specified when adding a column to an existing table. AS (generation-expression) Specifies that the definition of the column is based on an expression.Requires that the table be put in check pending state, using the SET INTEGRITY statement. After the ALTER TABLE statement, the SET INTEGRITY statement with FORCE GENERATED must be usedto update and check all the values in that column against the new expression. For detailson specifying a column with a generation-expression, see "CREATE TABLE". I'm not sure what they're describing, actually. It appears they don't require it to be updated but that the option is available to do so when the equation changes. > Which record gets which value? You cannot know Note that when you add a new column with a default, including nextval of a sequence, it is expected that those values will be assigned by tuple order in the table. CREATE SEQUENCE foo; ALTER TABLE tab ADD col integer DEFAULT nextval('foo'); > which record was inserted first because subsequent > updates may ruin that order before the ALTER TABLE. > And recalculating the max value of col isn't too reliable > if another session is also inserting records. > > And what about non-unique columns? I'm not sure what unique versus non-unique has to do with this. The question I have is what is the meaning of ALWAYS. There is never a guarantee that alter table will succeed in all cases. ALTER TABLE tab ALTER col TYPE bigint USING 2; > Plain SERIALs aren't declared unique automatically, either. > Consider the following: > > CREATE TABLE tab (col integer); > INSERT INTO tab VALUES (10); > ALTER TABLE tab ALTER col GENERATED ALWAYS AS IDENTITY( MINVALUE 1 MAXVALUE 12 CYCLE ); --
pgsql-hackers by date: