Re: Skipping schema changes in publication - Mailing list pgsql-hackers
From | Shlok Kyal |
---|---|
Subject | Re: Skipping schema changes in publication |
Date | |
Msg-id | CANhcyEUtYV-9ujtxLasnxN_peT+3LuZjcRx1xUECh1CCmANB8w@mail.gmail.com Whole thread Raw |
In response to | Re: Skipping schema changes in publication (Peter Smith <smithpb2250@gmail.com>) |
Responses |
Re: Skipping schema changes in publication
|
List | pgsql-hackers |
On Mon, 21 Jul 2025 at 12:17, Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Shlok. > > Some review comments for patch v16-0003. > > ====== > Commit message > > 1. > The column "prexcept" of system catalog "pg_publication_rel" is set to > "true" when publication is created with EXCEPT table or EXCEPT column > list. If column "prattrs" of system catalog "pg_publication_rel" is also > set or column "puballtables" of system catalog "pg_publication" is > "false", it indicates the column list is specified with EXCEPT clause > and columns in "prattrs" are excluded from being published. > > ~ > > Somehow, this seems to contain too much information, making it a bit > confusing. Can't you chop this down to something like below? > > SUGESTION > When column "prexcept" of system catalog "pg_publication_rel" is set > to "true", and column "prattrs" of system catalog "pg_publication_rel" > is not NULL, that means the publication was created with "EXCEPT > (column-list)", and the columns in "prattrs" will be excluded from > being published. > Modified the commit message as per suggestion. > ====== > doc/src/sgml/logical-replication.sgml > > 2. > Generated columns can also be specified in a column list. This allows > generated columns to be published, regardless of the publication parameter > <link linkend="sql-createpublication-params-with-publish-generated-columns"> > + <literal>publish_generated_columns</literal></link>. Generated > columns can be > + specified in a column list using the <literal>EXCEPT</literal> clause. This > + excludes the specified generated columns from being published, regardless of > + the <link linkend="sql-createpublication-params-with-publish-generated-columns"> > + <literal>publish_generated_columns</literal></link> setting. However, for > + generated columns that are not listed in the <literal>EXCEPT</literal> > + clause, whether they are published or not still depends on the value of > + <link linkend="sql-createpublication-params-with-publish-generated-columns"> > <literal>publish_generated_columns</literal></link>. See > <xref linkend="logical-replication-gencols"/> for details. > </para> > > ~~ > > For this part: > > "Generated columns can be specified in a column list using the > <literal>EXCEPT</literal> clause. This excludes the specified > generated columns from being published, regardless of..." > > I think the whole paragraph already said "Generated columns can also > be specified in a column list", so you don't need to repeat it. > Instead, maybe say something like below. > > SUGGESTION > Specifying generated columns in a column list using the > <literal>EXCEPT</literal> clause excludes those columns from being > published, regardless of... > > ~~~ > Modified > 3. > - Publication p1 > - Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root > -----------+------------+---------+---------+---------+-----------+---------- > - postgres | f | t | t | t | t | f > + Publication p1 > + Owner | All tables | Inserts | Updates | Deletes | Truncates | > Generated columns | Via root > +--------+------------+---------+---------+---------+-----------+-------------------+---------- > + ubuntu | f | t | t | t | t | none > | f > Tables: > "public.t1" (id, a, b, d) > + "public.t2" EXCEPT (a, d) > </programlisting></para> > > > I noticed the Owner changed from "postgres" to "ubuntu". Do you think > it is better to keep this as "postgres" for the example? I agree that it is better to keep "postgres". I have reverted back to the use "postgres".. > > ====== > doc/src/sgml/ref/create_publication.sgml > > 4. > The tables added to a publication that publishes UPDATE and/or DELETE > operations must have REPLICA IDENTITY defined. Otherwise those > operations will be disallowed on those tables. > > In order for UPDATE or DELETE operations to work, all the REPLICA > IDENTITY columns must be published. So, any column list must name all > REPLICA IDENTITY columns, and any EXCEPT column list must not name any > REPLICA IDENTITY columns. > > A row filter expression (i.e., the WHERE clause) must contain only > columns that are covered by the REPLICA IDENTITY, in order for UPDATE > and DELETE operations to be published. For publication of INSERT > operations, any column may be used in the WHERE expression. The row > filter allows simple expressions that don't have user-defined > functions, user-defined operators, user-defined types, user-defined > collations, non-immutable built-in functions, or references to system > columns. > > The generated columns that are part of the column list specified with > the EXCEPT clause are not published, regardless of the > publish_generated_columns option. However, generated columns that are > not part of the column list specified with the EXCEPT clause are > published according to the value of the publish_generated_columns > option. See Section 29.6 for details. > > The generated columns that are part of REPLICA IDENTITY must be > published explicitly either by listing them in the column list or by > enabling the publish_generated_columns option, in order for UPDATE and > DELETE operations to be published. > > ~~ > > Notice all those 5 paragraphs (above) are talking about REPLICA > IDENTITY, except the 4th paragraph. Maybe the 4th paragraph should be > moved to last, to keep all the REPLICA IDENTITY stuff together. > Fixed > ====== > src/backend/catalog/pg_publication.c > > 5. pub_form_cols_map > > * Returns a bitmap representing the columns of the specified table. > * > * Generated columns are included if include_gencols_type is > - * PUBLISH_GENCOLS_STORED. > + * PUBLISH_GENCOLS_STORED. Columns that are in the exceptcols are excluded from > + * the column list. > */ > Bitmapset * > -pub_form_cols_map(Relation relation, PublishGencolsType include_gencols_type) > +pub_form_cols_map(Relation relation, PublishGencolsType include_gencols_type, > + Bitmapset *except_cols) > > Forgot to add the underscore in the function comment. > > /exceptcols/except_cols/ > Fixed > ~~~ > > 6. pg_get_publication_tables > > + > + /* > + * We fetch pubtuple if publication is not FOR ALL TABLES and not > + * FOR TABLES IN SCHEMA. So if prexcept is true, it indicates that > + * prattrs contains columns to be excluded for replication. > + */ > + exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple, > + Anum_pg_publication_rel_prexcept, > + &isnull); > + > + if (!isnull && DatumGetBool(exceptDatum) && !nulls[2]) > + except_columns = pub_collist_to_bitmapset(NULL, values[2], NULL); > > But, you cannot have EXCEPT for null column list, so shouldn't the > !nulls[2] check be done to also guard the SysCacheGetAttr call? > Fixed > ====== > src/backend/parser/gram.y > > 7. > > Shlok wrote [1-reply #11] > The main reason I used a separate 'opt_except_column_list' is because > 'opt_column_list' can also be NULL. But the column list specified with > EXCEPT not be NULL. So, 'opt_except_column_list' is defined such that > it cannot be null. > > ~ > > Yeah, but IMO that leads to excessive duplicated code. I think the > code can perhaps be a lot simpler if the grammar is written more like > the synopsis: > > e.g. TABLE name opt_EXCEPT opt_column_list > > where - opt_EXCEPT is null, and opt_column_list is null... means no col list > where - opt_EXCEPT is null, and opt_column_list is not null... means > normal col list > where - opt_EXCEPT is not null, and opt_column_list not null... means > EXCEPT col list > where - opt_EXCEPT is not null, and opt_column_list null... SYNTAX ERROR > > So code it something like this (just adding opt_EXCEPT to the existing > productions) > > %type <boolean> opt_ordinality opt_without_overlaps opt_EXCEPT > ... > opt_EXCEPT: > EXCEPT { $$ = true; } > | /*EMPTY*/ { $$ = false; } > ; > ... > TABLE relation_expr opt_EXCEPT opt_column_list OptWhereClause > { > $$ = makeNode(PublicationObjSpec); > $$->pubobjtype = PUBLICATIONOBJ_TABLE; > $$->pubtable = makeNode(PublicationTable); > $$->pubtable->relation = $2; > $$->pubtable->except = $3; > $$->pubtable->columns = $4; > if ($3 && !$4) > ereport(ERROR, > (errcode(ERRCODE_SYNTAX_ERROR), > errmsg("EXCEPT without column list"), > parser_errposition(@3))); > $$->pubtable->whereClause = $5; > $$->location = @1; > } > > etc. > I have modified it. I have created a function 'check_except_collist' to throw error, to avoid duplication code for error message. > ====== > src/bin/psql/describe.c > > 8. > if (!PQgetisnull(res, i, 3)) > + { > + if (!PQgetisnull(res, i, 4) && strcmp(PQgetvalue(res, i, 4), "t") == 0) > + appendPQExpBuffer(buf, " EXCEPT"); > appendPQExpBuffer(buf, " (%s)", PQgetvalue(res, i, 3)); > + } > > This growing list of columns makes it hard to understand this function > without looking back at the caller all the time. Maybe you can add a > function comment that at least explains what those attributes 1,2,3,4 > represent? > Added a comment > ====== > src/bin/psql/tab-complete.in.c > > 9. > + else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|SET", > "TABLE", MatchAny)) > + COMPLETE_WITH("EXCEPT"); > > Since it is not allowed to have an EXCEPT with no column list, > shouldn't this say "EXCEPT ("? > Fixed > ~~~ > > 10. > else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", > MatchAny) && !ends_with(prev_wd, ',')) > - COMPLETE_WITH("WHERE (", "WITH ("); > + COMPLETE_WITH("EXCEPT", "WHERE (", "WITH ("); > > Ditto. Since it is not allowed to have an EXCEPT with no column list, > shouldn't this say "EXCEPT ("? > Fixed > > ====== > src/test/regress/expected/publication.out > > 11. > +-- Verify that EXCEPT col-list cannot contain RI cols (when using INDEX) > +CREATE UNIQUE INDEX pub_test_except1_ac_idx ON pub_test_except1 (a, c); > +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX > pub_test_except1_a_idx; > +ERROR: index "pub_test_except1_a_idx" for table "pub_test_except1" > does not exist > +UPDATE pub_test_except1 SET a = 3 WHERE a = 1; > +ERROR: cannot update table "pub_test_except1" > +DETAIL: Column list used by the publication does not cover the > replica identity. > +DROP INDEX pub_test_except1_ac_idx; > > > What's happening here? I'm not sure these are the kind of errors you > were trying to cause. > Yes, it is not the error I was trying to cause. I have modified it. > ====== > src/test/regress/sql/publication.sql > > 12. > +-- Verify that EXCEPT col-list cannot contain RI cols (when using RI FULL) > +ALTER TABLE pub_test_except1 REPLICA IDENTITY FULL; > +UPDATE pub_test_except1 SET a = 3 WHERE a = 1; > > > SUGGESTION. Change that comment to: > Verify fails - EXCEPT col-list cannot... > Fixed > ~~~ > > 13. > +-- Verify that EXCEPT col-list cannot contain RI cols (when using INDEX) > +CREATE UNIQUE INDEX pub_test_except1_ac_idx ON pub_test_except1 (a, c); > +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX > pub_test_except1_a_idx; > +UPDATE pub_test_except1 SET a = 3 WHERE a = 1; > +DROP INDEX pub_test_except1_ac_idx; > > SUGGESTION. Change that comment to: > Verify fails - EXCEPT col-list cannot... > Fixed > ~~~ > > 14. > +-- Verify that so long as no clash between RI cols and the EXCEPT > +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a); > +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX > pub_test_except1_a_idx; > +UPDATE pub_test_except1 SET a = 3 WHERE a = 1; > + > > That comment doesn't make sense. Missing words? > Fixed > ====== > .../t/036_rep_changes_except_table.pl > > 15. > (I haven't reviewed this file in detail yet, but here is a general comment) > > I know this patch currently lives in the same thread as all the EXCEPT > TABLE stuff, but that seems just happenstance to me. IMO, this is a > separate enhancement that just shares the keyword EXCEPT. So, I felt > it should have quite separate tests too. > > e.g. How about: 037_rep_changes_except_collist.pl > Modified > ====== > [1] https://www.postgresql.org/message-id/CANhcyEW2LK4diNeCG862DE40yQoV3VAgf59kXUq2TuR8fnw5vQ%40mail.gmail.com Thanks, Shlok Kyal
Attachment
pgsql-hackers by date: