Re: Skipping schema changes in publication - Mailing list pgsql-hackers
From | Shlok Kyal |
---|---|
Subject | Re: Skipping schema changes in publication |
Date | |
Msg-id | CANhcyEU+aPu6iAH2cTA0cDtn3pd6c_njBONCt3FubYZoEEnm8Q@mail.gmail.com Whole thread Raw |
In response to | Re: Skipping schema changes in publication (Peter Smith <smithpb2250@gmail.com>) |
List | pgsql-hackers |
On Thu, 26 Jun 2025 at 09:06, Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Shlok. > > Below are some review comments for v14-0003 > > ====== > 1. GENERAL > > Since the new syntax uses EXCEPT, then, in my opinion, you should try > to use that same term where possible when describing things. I > understand it is hard to do this in text and I agree often it makes > more sense to say "exclude" columns etc, but OTOH in the code there > are lots of places where you could have named vars/params differently: > e.g. 'except_collist' instead of 'exclude_collist' might have been > better. > Fixed the variable names. > ====== > Commit message > > 2. > Column list specifed with EXCEPT is stored in column "prattrs" in table > "pg_publication_rel" and also column "prexcept" is set to "true", to maintain > the column list that user wants to exclude from the publication. > > ~ > > That paragraph could do with some rewording. For example, AFAIK, > "prattrs" is for all column lists -- not just except col-lists, but > the way it is described here sounds different. > > Also, /specifed/specified/ > Reworded the paragraph > ====== > doc/src/sgml/catalogs.sgml > > 3. (52.42. pg_publication_rel) > > <para> > - True if the relation must be excluded > + True if the relation or column list must be excluded. If publication is > + created <literal>FOR ALL TABLES</literal> and it is specified as true, > + the relation should be excluded. Else if it is true the columns in > + <literal>prattrs</literal> should be excluded from being published. > </para></entry> > > I felt this could be expressed more simply without mentioning anything > about FOR ALL TABLES. > > SUGGESTION > True if the column list or relation must be excluded from publication. > If a column list is specified in <literal>prattrs</literal>, then > exclude only those columns. If <literal>prattrs</literal> is NULL, > then exclude the entire relation. > Fixed > ====== > doc/src/sgml/logical-replication.sgml > > 4. (29.5. Column Lists) > > <para> > - Each publication can optionally specify which columns of each table are > - replicated to subscribers. The table on the subscriber side must have at > - least all the columns that are published. If no column list is specified, > - then all columns on the publisher are replicated. > + Each publication can optionally specify which columns of each > table should be > + replicated or excluded from replication. On the subscriber side, the table > + must include at least all the columns that are published. If no column list > + is provided, all columns from the publisher are replicated by default. > See <xref linkend="sql-createpublication"/> for details on the syntax. > </para> > > I felt this patch may have changed too much text. IMO, you only needed > to say "... are replicated or excluded from replication.". The other > changes did not seem necessary. > > ~~~ Fixed > 5. > <para> > - If no column list is specified, any columns added to the table later are > - automatically replicated. This means that having a column list which names > - all columns is not the same as having no column list at all. > + If no column list or a column list with EXCEPT is specified, any columns > + added to the table later are automatically replicated. This means > that having > + a column list which names all columns is not the same as having no > + column list at all. If an column list is specified, any columns added to the > + table later are automatically replicated. > </para> > > 5a. > "This means that having a column list which names all columns is not > the same as having no column list at all." -- That note does not make > sense when you say EXCEPT. I think some rewording is needed here. > Fixed > ~ > > 5b. > "If an column list is specified, any columns added to the table later > are automatically replicated.". > > This made no sense -- some words missing? > This change was done by mistake. Removed it. > ~~~ > > 6. > Generated columns can also be specified in a column list. This allows > generated columns to be published, regardless of the publication parameter > <link linkend="sql-createpublication-params-with-publish-generated-columns"> > - <literal>publish_generated_columns</literal></link>. See > - <xref linkend="logical-replication-gencols"/> for details. > + <literal>publish_generated_columns</literal></link>. Generated columns can > + be included in column list specified with EXCEPT clause if publication > + parameter > + <link linkend="sql-createpublication-params-with-publish-generated-columns"> > + <literal>publish_generated_columns</literal></link> is not set to > + <literal>none</literal>. Specified generated columns will not be published. > + See <xref linkend="logical-replication-gencols"/> for details. > </para> > > I am not so sure about this. It seemed overly strict to me. > > Why can't it simply say: > "Generated columns can also be specified in a column list. This allows > generated columns to be published or excluded, regardless of the > publication parameter..." > > Specifically, I don't know why you need to say: > Generated columns can be included in column list specified with EXCEPT > clause if publication parameter publish_generated_columns is not set > to none. Specified generated columns will not be published. > > IIUC, then EXCEPT (gencol1, gencol2) is saying to exclude the named > cols. So if param is "stored", then the named cols will be excluded. > OTOH, if param is "none" then all generated cols will be excluded > anyway, so why not just allow the EXCEPT (gencol,gencol2) here as > well, because the result will be the same. > > I have removed this change. And allowed specifying generated columns in EXCEPT column list as well irrespective of value of ‘publish_generated_columns’. > ~~~ > > 7. (29.5.1. Examples) > > <para> > - Create a table <literal>t1</literal> to be used in the following example. > + Create tables <literal>t1</literal>, <literal>t2</literal> to be > used in the > + following example. > > /Create tables t1, t2/Create tables t1 and t2/ > Fixed > ~~~ > > 8. > <para> > Create a publication <literal>p1</literal>. A column list is defined for > - table <literal>t1</literal> to reduce the number of columns that will be > - replicated. Notice that the order of column names in the column list does > - not matter. > + table <literal>t1</literal> and a column list is defined for table > + <literal>t2</literal> with EXCEPT clause to reduce the number of > columns that will be > + replicated. Notice that the order of column names in the column > lists does not matter. > > BEFORE > A column list is defined for table t1 and a column list is defined for > table t2... > > SUGGESTION (added comma, etc.) > A column list is defined for table t1, and another column list is > defined for table t2... > Fixed > ~~~ > > 9. > The final example still says: > "Only data from the column list of publication p1 is replicated." > > That doesn't seem quite appropriate now that you also have an EXCEPT > column list. > > SUGGESTION: > Only data specified by the column lists of publication p1 is replicated. > Fixed > ====== > doc/src/sgml/ref/create_publication.sgml > > 10. > + <para> > + When a column list is specified with EXCEPT, the named columns are not > + replicated. Specifying a column list has no effect on > + <literal>TRUNCATE</literal> commands. > + </para> > > I felt that to be clearer the preceding paragraph should be changed as follows: > > /When a column list is specified, only the named columns are > replicated./When a column list without EXCEPT is specified, only the > named columns are replicated./ > Fixed > ~~~ > > 11. CREATE PUBLICATION (NOTES section) > > 11a. > The NOTES talk about replica identity columns -- should you mention EXCEPT here? > Added notes for EXCEPT > ~ > > 11b. > The NOTES talk about generated columns -- should you mention EXCEPT here? > I felt it is not needed. > ====== > src/backend/catalog/pg_publication.c > > 12. check_and_fetch_column_list > > + if (!isnull) > + except = DatumGetBool(cfdatum); > + > + *except_columns = except && !pub->alltables; > > AFAICT, you can Assert(!pub->alltables) because you already checked > that earlier up front. > So you don't need 'except' var either. Just assign *except_cols up > front and then overwrite it later if true. > > SUGGESTION: > > *except_cols = false; > > if (pub->alltables) > return false; > ... > if (!isnull) > *except_cols = DatumGetBool(cfdatum); > Fixed > ~~~ > > 13. publication_add_relation > > /* Validate and translate column names into a Bitmapset of attnums. */ > - attnums = pub_collist_validate(pri->relation, pri->columns); > + attnums = pub_collist_validate(pri->relation, pri->columns, > + pri->except && !pub->alltables, > + pub->pubgencols_type); > > > I am wondering why we are even calling a function to validate column > lists if pub->alltables was true. AFAIK, that combination of > column-lists and FOR ALL TABLES is not even possible, so the code > seems strange. > Fixed > ~~~ > > 14. pub_exclude_collist_validate > . > + /* > + * Check if column list specified with EXCEPT have any stored > + * generated column and 'publish_generated_columns' is not set to > + * 'stored'. > + */ > + if (except_columns && > + TupleDescAttr(tupdesc, attnum - 1)->attgenerated == > ATTRIBUTE_GENERATED_STORED && > + pubgencols_type != PUBLISH_GENCOLS_STORED) > + ereport(ERROR, > + errcode(ERRCODE_INVALID_COLUMN_REFERENCE), > + errmsg("cannot use stored generated column \"%s\" in publication > column list specified with EXCEPT when \"%s\" set to \"%s\"", > + colname, "publish_generated_columns", "stored")); > > As mentioned in the above DOCS comments, I was having doubts about why > we have this error. > > If the parameter says "none", then generated columns will not be > replicated, so why should we care if the user also says > EXCEPT(gencol1,gencol2). Either way, the result will be the same; the > generated column will not be published. > Removed this restriction. > ~~~ > > 15. GetRelationPublications > > { > HeapTuple tup = &pubrellist->members[i]->tuple; > Oid pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid; > + HeapTuple pubtup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid)); > + bool is_table_excluded = ((Form_pg_publication) > GETSTRUCT(pubtup))->puballtables && > + ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept; > > - if (except_flag == ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept) > + if (except_flag == is_table_excluded) > result = lappend_oid(result, pubid); > + > + ReleaseS > > > I'm not 100% sure you need the additional 'pubtup'... Can't you just > look at the "prattrs" field to see if a column-list was specified? If > "prattrs" is null and "prexcept" is true, isn't that the same > combination as what you are looking for here? > Yes, we can use this combination as well. Fixed it in latest patch. > ~~~ > > 16. pg_get_publication_tables > > + columnsDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple, > + Anum_pg_publication_rel_prattrs, > + &(nulls[2])); > + > + /* if column list is specified with EXCEPT */ > + if (!pub->alltables && except) > + columns = pub_collist_to_bitmapset(NULL, columnsDatum, NULL); > + else > + values[2] = columnsDatum; > > 16a. > Something seems fishy here. Isn't there a pathway where you missed > assigning value[2] to anything? > Modified this change. > ~ > > 16b. > Also, I feel there should be some other boolean variable used here > instead of checking bot (!pub->alltables && except) in multiple > places. > Fixed > > ====== > src/backend/replication/pgoutput/pgoutput.c > > 17. RelationSyncEntry > + > + /* Indicate if no column is included in the publication */ > + bool no_cols_published; > > Maybe this can have a more explanatory comment to explain why it is needed? > Fixed > ~~~ > > 18. check_and_init_gencol > > + bool found = false; > + bool except_columns = false; > + > + found = check_and_fetch_column_list(pub, entry->publish_as_relid, NULL, > + NULL, &except_columns); > + > /* > * The column list takes precedence over the > * 'publish_generated_columns' parameter. Those will be checked later, > - * see pgoutput_column_list_init. > + * see pgoutput_column_list_init. But when a column list is specified > + * with EXCEPT, it should be checked. > */ > - if (check_and_fetch_column_list(pub, entry->publish_as_relid, NULL, NULL)) > + if (found && !except_columns) > continue; > > The variable 'found' seems a poor name; how about 'has_column_list' or similar? > Fixed > ~~~ > > 19. pgoutput_change > > + /* > + * If all columns of a table is present in column list specified with > + * EXCEPT, skip publishing the changes. > + */ > + if (relentry->no_cols_published) > + return; > > /is present/are present/ > fixed > ====== > src/bin/pg_dump/pg_dump.c > > 20. getPublicationTables > > + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs)) > pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL; > + else > + pubrinfo[j].dobj.objType = DO_PUBLICATION_REL; > > pubrinfo[j].dobj.catId.tableoid = > atooid(PQgetvalue(res, i, i_tableoid)); > @@ -4797,6 +4797,7 @@ getPublicationTables(Archive *fout, TableInfo > tblinfo[], int numTables) > pubrinfo[j].pubrelqual = NULL; > else > pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual)); > + pubrinfo[j].pubexcept = (strcmp(prexcept, "t") == 0); > > > Why not assign pubrinfo[j].pubexcept earlier so you don't have to > repeat the strcmp? > Fixed > ~~~ > > 21. > - if (strcmp(prexcept, "t") == 0) > + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs)) > simple_ptr_list_append(&exceptinfo, &pubrinfo[j]); > > Why not assign pubrinfo[j].pubexcept earlier so you don't have to > repeat the strcmp? Same also for the PQgetisnull(res, i, > i_prattrs))... > Fixed > ~~~ > > 22. dumpPublicationTable > > if (pubrinfo->pubrattrs) > - appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs); > + { > + if (pubrinfo->pubexcept) > + appendPQExpBuffer(query, " EXCEPT (%s)", pubrinfo->pubrattrs); > + else > + appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs); > + } > > SUGGESTION > { > if (pubrinfo->pubexcept) > appendPQExpBuffer(query, " EXCEPT"); > > appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs); > } Fixed I have addressed the comments shared by you and shared the updated v15 patch set here. Thanks and Regards, Shlok Kyal
Attachment
pgsql-hackers by date: