Re: Disallow UPDATE/DELETE on table with unpublished generated column as REPLICA IDENTITY - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Disallow UPDATE/DELETE on table with unpublished generated column as REPLICA IDENTITY |
Date | |
Msg-id | CALDaNm3VpnWiGcd4KHg2Mnbg9DWpFQp4ZAzVkxN5ho00M1o=mw@mail.gmail.com |
In response to | Re: Disallow UPDATE/DELETE on table with unpublished generated column as REPLICA IDENTITY (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On Fri, 15 Nov 2024 at 16:45, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> Thanks for providing the comments
>
> On Fri, 15 Nov 2024 at 10:59, vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, 14 Nov 2024 at 15:51, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > >
> > > Thanks for providing the comments.
> > >
> > > On Thu, 14 Nov 2024 at 12:22, vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Wed, 13 Nov 2024 at 11:15, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > > > >
> > > > > Thanks for providing the comments.
> > > > >
> > > > > On Tue, 12 Nov 2024 at 12:52, Zhijie Hou (Fujitsu)
> > > > > <houzj.fnst@fujitsu.com> wrote:
> > > > > >
> > > > > > On Friday, November 8, 2024 7:06 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi Amit,
> > > > > > >
> > > > > > > On Thu, 7 Nov 2024 at 11:37, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Nov 5, 2024 at 12:53 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > To avoid the issue, we can disallow UPDATE/DELETE on table with
> > > > > > > > > unpublished generated column as REPLICA IDENTITY. I have attached a
> > > > > > > > > patch for the same.
> > > > > > > > >
> > > > > > > >
> > > > > > > > +CREATE PUBLICATION pub_gencol FOR TABLE testpub_gencol;
> > > > > > > > +UPDATE testpub_gencol SET a = 100 WHERE a = 1;
> > > > > > > > +ERROR: cannot update table "testpub_gencol"
> > > > > > > > +DETAIL: Column list used by the publication does not cover the replica identity.
> > > > > > > >
> > > > > > > > This is not a correct ERROR message as the publication doesn't have
> > > > > > > > any column list associated with it. You have added the code to detect
> > > > > > > > this in the column list code path which I think is not required. BTW,
> > > > > > > > you also need to consider the latest commit 7054186c4e for this. I
> > > > > > > > guess you need to keep another flag in PublicationDesc to detect this
> > > > > > > > and then give an appropriate ERROR.
> > > > > > >
> > > > > > > I have addressed the comments and provided an updated patch. Also, I am
> > > > > > > currently working to fix this issue in back branches.
> > > > > >
> > > > > > Thanks for the patch. I am reviewing it and have some initial comments:
> > > > > >
> > > > > >
> > > > > > 1.
> > > > > > + char attgenerated = get_attgenerated(relid, attnum);
> > > > > > +
> > > > > >
> > > > > > I think it's unnecessary to initialize attgenerated here because the value will
> > > > > > be overwritten if pubviaroot is true anyway. Also, the get_attgenerated()
> > > > > > is not cheap.
> > > > > >
> > > > > Fixed
> > > > >
> > > > > > 2.
> > > > > >
> > > > > > I think the patch missed to check the case when table is marked REPLICA
> > > > > > IDENTITY FULL, and generated column is not published:
> > > > > >
> > > > > > CREATE TABLE testpub_gencol (a INT, b INT GENERATED ALWAYS AS (a + 1) STORED NOT NULL);
> > > > > > ALTER TABLE testpub_gencol REPLICA IDENTITY FULL;
> > > > > > CREATE PUBLICATION pub_gencol FOR TABLE testpub_gencol;
> > > > > > UPDATE testpub_gencol SET a = 2;
> > > > > >
> > > > > > I expected the UPDATE to fail in above case, but it can still pass after applying the patch.
> > > > > >
> > > > > Fixed
> > > > >
> > > > > > 3.
> > > > > >
> > > > > > + * If the publication is FOR ALL TABLES we can skip the validation.
> > > > > > + */
> > > > > >
> > > > > > This comment seems not clear to me, could you elaborate a bit more on this ?
> > > > > >
> > > > > I missed to handle the case FOR ALL TABLES. Have removed the comment.
> > > > >
> > > > > > 4.
> > > > > >
> > > > > > Also, I think the patch does not handle the FOR ALL TABLE case correctly:
> > > > > >
> > > > > > CREATE TABLE testpub_gencol (a INT, b INT GENERATED ALWAYS AS (a + 1) STORED NOT NULL);
> > > > > > CREATE UNIQUE INDEX testpub_gencol_idx ON testpub_gencol (b);
> > > > > > ALTER TABLE testpub_gencol REPLICA IDENTITY USING index testpub_gencol_idx;
> > > > > > CREATE PUBLICATION pub_gencol FOR ALL TABLEs;
> > > > > > UPDATE testpub_gencol SET a = 2;
> > > > > >
> > > > > > I expected the UPDATE to fail in above case as well.
> > > > > >
> > > > > Fixed
> > > > >
> > > > > > 5.
> > > > > >
> > > > > > + else if (cmd == CMD_UPDATE && !pubdesc.replident_has_valid_gen_cols)
> > > > > > +     ereport(ERROR,
> > > > > > +             (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> > > > > > +              errmsg("cannot update table \"%s\"",
> > > > > > +                     RelationGetRelationName(rel)),
> > > > > > +              errdetail("REPLICA IDENTITY consists of an unpublished generated column.")));
> > > > > >
> > > > > > I think it would be better to use lower case "replica identity" to consistent
> > > > > > with other existing messages.
> > > > > >
> > > > > Fixed
> > > > >
> > > > > I have attached the updated patch here.
> > > > Few comments:
> > > > 1) In the first check relation->rd_rel->relispartition also is checked
> > > > whereas in the below it is not checked, shouldn't the same check be
> > > > there below to avoid few of the function calls which are not required:
> > > > + if (pubviaroot && relation->rd_rel->relispartition)
> > > > + {
> > > > +     publish_as_relid = GetTopMostAncestorInPublication(pubid, ancestors, NULL);
> > > > +
> > > > +     if (!OidIsValid(publish_as_relid))
> > > > +         publish_as_relid = relid;
> > > > + }
> > > > +
> > > >
> > > > + if (pubviaroot)
> > > > + {
> > > > +     /* attribute name in the child table */
> > > > +     char *colname = get_attname(relid, attnum, false);
> > > > +
> > > > +     /*
> > > > +      * Determine the attnum for the attribute name in parent (we
> > > > +      * are using the column list defined on the parent).
> > > > +      */
> > > > +     attnum = get_attnum(publish_as_relid, colname);
> > > > +     attgenerated = get_attgenerated(publish_as_relid, attnum);
> > > > + }
> > > > + else
> > > > +     attgenerated = get_attgenerated(relid, attnum);
> > > I have updated the if condititon
> > >
> > > > 2) I think we could use check_and_fetch_column_list to see that it is
> > > > not a column list publication instead of below code:
> > > > + if (!puballtables)
> > > > + {
> > > > +     tuple = SearchSysCache2(PUBLICATIONRELMAP,
> > > > +                             ObjectIdGetDatum(publish_as_relid),
> > > > +                             ObjectIdGetDatum(pubid));
> > > > +
> > > > +     if (!HeapTupleIsValid(tuple))
> > > > +         return false;
> > > > +
> > > > +     (void) SysCacheGetAttr(PUBLICATIONRELMAP, tuple,
> > > > +                            Anum_pg_publication_rel_prattrs,
> > > > +                            &isnull);
> > > > +
> > > > +     ReleaseSysCache(tuple);
> > > > + }
> > > > +
> > > > + if(puballtables || isnull)
> > > >
> > > Yes we can use it. I have updated the patch.
> > >
> > > > 3) Since there is only a single statement, remove the enclosing parenthisis:
> > > > + if (!pubform->pubgencols &&
> > > > +     (pubform->pubupdate || pubform->pubdelete) &&
> > > > +     replident_has_unpublished_gen_col(pubid, relation, ancestors,
> > > > +                                       pubform->pubviaroot, pubform->puballtables))
> > > > + {
> > > > +     pubdesc->replident_has_valid_gen_cols = false;
> > > > + }
> > > >
> > > Fixed
> > >
> > > > 4) Pgindent should be run there are few issues:
> > > > 4.a)
> > > > +extern bool replident_has_unpublished_gen_col(Oid pubid, Relation relation,
> > > > +        List *ancestors, bool pubviaroot, bool puballtables);
> > > > 4.b)
> > > > + }
> > > > +
> > > > + if(puballtables || isnull)
> > > > + {
> > > > +     int x;
> > > > +     Bitmapset *idattrs = NULL;
> > > > 4.c)
> > > > +  * generated column we should error out.
> > > > +  */
> > > > + if(relation->rd_rel->relreplident == REPLICA_IDENTITY_FULL &&
> > > > +    relation->rd_att->constr && relation->rd_att->constr->has_generated_stored)
> > > > +     result = true;
> > > > 4.d)
> > > > + while ((x = bms_next_member(idattrs, x)) >= 0)
> > > > + {
> > > > +     AttrNumber attnum = (x + FirstLowInvalidHeapAttributeNumber);
> > > > +     char attgenerated;
> > > >
> > > Fixed
> > >
> > > > 5) You could do this in a single line comment:
> > > > + /*
> > > > +  * Check if any REPLICA IDENTITY column is an generated column.
> > > > +  */
> > > > + while ((x = bms_next_member(idattrs, x)) >= 0)
> > > >
> > >
> > > Fixed
> > >
> > > > 6) I felt one of update or delete is enough in this case as the code
> > > > path is same:
> > > > +UPDATE testpub_gencol SET a = 100 WHERE a = 1;
> > > > +DELETE FROM testpub_gencol WHERE a = 100;
> > > > +
> > > > +-- error - generated column "b" is not published and REPLICA IDENTITY is set FULL
> > > > +ALTER TABLE testpub_gencol REPLICA IDENTITY FULL;
> > > > +UPDATE testpub_gencol SET a = 100 WHERE a = 1;
> > > > +DELETE FROM testpub_gencol WHERE a = 100;
> > > > +DROP PUBLICATION pub_gencol;
> > > > +
> > > > +-- ok - generated column "b" is published and is part of REPLICA IDENTITY
> > > > +CREATE PUBLICATION pub_gencol FOR TABLE testpub_gencol with (publish_generated_columns = true);
> > > > +UPDATE testpub_gencol SET a = 100 WHERE a = 1;
> > > > +DELETE FROM testpub_gencol WHERE a = 100;
> > >
> > > Removed the 'DELETE' case.
> > >
> > > I have addressed the comments and updated the patch.
> >
> > Few comments:
> > 1) Current patch will not handle this scenario where subset of columns
> > are specified in the replica identity index:
> > CREATE TABLE t1 (a INT not null, a1 int not null, a2 int not null, b
> > INT GENERATED ALWAYS AS (a + 1) STORED NOT NULL);
> > create unique index idx1_t1 on t1(a, a1);
> >
> > -- Replica identity will have subset of table columns
> > alter table t1 replica identity using index idx1_t1 ;
> > insert into t1 values(1,1,1);
> > create publication pub1 for table t1;
> >
> > postgres=# update t1 set a = 2;
> > UPDATE 1
> >
> > I felt we should throw an error in this case too.
> >
> I feel the above behaviour is expected. I think we can specify a
> subset of columns in the replica identity index as per documentation
> [1]. Thoughts?
>
> > 2) Instead of checking if replica identity has a generated column, can
> > we check if the columns that will be published and the columns in the
> > replica identity matches:
> > + if (pubviaroot && relation->rd_rel->relispartition)
> > + {
> > +     /* attribute name in the child table */
> > +     char *colname = get_attname(relid, attnum, false);
> > +
> > +     /*
> > +      * Determine the attnum for the attribute name in parent (we
> > +      * are using the column list defined on the parent).
> > +      */
> > +     attnum = get_attnum(publish_as_relid, colname);
> > +     attgenerated = get_attgenerated(publish_as_relid, attnum);
> > + }
> > + else
> > +     attgenerated = get_attgenerated(relid, attnum);
> >
> Fixed
>
> > 3) publish_as_relid will be set accordingly based on pubviaroot, so it
> > need not be initialized:
> > + Oid relid = RelationGetRelid(relation);
> > + Oid publish_as_relid = RelationGetRelid(relation);
> > + bool result = false;
> >
> Fixed
>
> I have addressed the comments and attached the updated patch.

Few comments:
1) I felt we can return from here after identifying it is replica
identity full instead of processing further:
+ /*
+  * REPLICA IDENTITY can be FULL only if there is no column list for
+  * publication. If REPLICA IDENTITY is set as FULL and relation has a
+  * generated column we should error out.
+  */
+ if (relation->rd_rel->relreplident == REPLICA_IDENTITY_FULL &&
+     relation->rd_att->constr &&
+     relation->rd_att->constr->has_generated_stored)
+     result = true;

2) columns bms also should be freed here:
  /* replica identity column, not covered by the column list */
+ if (!bms_is_member(attnum, columns))
+ {
+     result = true;
+     break;
+ }
+ }
+
+ bms_free(idattrs);

3) Error detail message should begin with upper case:
3.a)
@@ -809,6 +809,12 @@ CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
                 errmsg("cannot update table \"%s\"",
                        RelationGetRelationName(rel)),
                 errdetail("Column list used by the publication does not cover the replica identity.")));
+ else if (cmd == CMD_UPDATE && !pubdesc.replident_has_valid_gen_cols)
+     ereport(ERROR,
+             (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+              errmsg("cannot update table \"%s\"",
+                     RelationGetRelationName(rel)),
+              errdetail("replica identity consists of an unpublished generated column.")));
  else if (cmd == CMD_DELETE && !pubdesc.rf_valid_for_delete)
      ereport(ERROR,
              (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),

3.b) Similarly here too:
                 errdetail("Column list used by the publication does not cover the replica identity.")));
+ else if (cmd == CMD_DELETE && !pubdesc.replident_has_valid_gen_cols)
+     ereport(ERROR,
+             (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+              errmsg("cannot delete from table \"%s\"",
+                     RelationGetRelationName(rel)),
+              errdetail("replica identity consists of an unpublished generated column.")));

Regards,
Vignesh