Re: Skipping schema changes in publication - Mailing list pgsql-hackers

From Shlok Kyal
Subject Re: Skipping schema changes in publication
Date
Msg-id CANhcyEUtYV-9ujtxLasnxN_peT+3LuZjcRx1xUECh1CCmANB8w@mail.gmail.com
Whole thread Raw
In response to Re: Skipping schema changes in publication  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: Skipping schema changes in publication
List pgsql-hackers
On Mon, 21 Jul 2025 at 12:17, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> Some review comments for patch v16-0003.
>
> ======
> Commit message
>
> 1.
> The column "prexcept" of system catalog "pg_publication_rel" is set to
> "true" when publication is created with EXCEPT table or EXCEPT column
> list. If column "prattrs" of system catalog "pg_publication_rel" is also
> set or column "puballtables" of system catalog "pg_publication" is
> "false", it indicates the column list is specified with EXCEPT clause
> and columns in "prattrs" are excluded from being published.
>
> ~
>
> Somehow, this seems to contain too much information, making it a bit
> confusing. Can't you chop this down to something like below?
>
> SUGESTION
> When column "prexcept" of system catalog "pg_publication_rel" is set
> to "true", and column "prattrs" of system catalog "pg_publication_rel"
> is not NULL, that means the publication was created with "EXCEPT
> (column-list)", and the columns in "prattrs" will be excluded from
> being published.
>
Modified the commit message as per suggestion.

> ======
> doc/src/sgml/logical-replication.sgml
>
> 2.
>     Generated columns can also be specified in a column list. This allows
>     generated columns to be published, regardless of the publication parameter
>     <link linkend="sql-createpublication-params-with-publish-generated-columns">
> +   <literal>publish_generated_columns</literal></link>. Generated
> columns can be
> +   specified in a column list using the <literal>EXCEPT</literal> clause. This
> +   excludes the specified generated columns from being published, regardless of
> +   the <link linkend="sql-createpublication-params-with-publish-generated-columns">
> +   <literal>publish_generated_columns</literal></link> setting. However, for
> +   generated columns that are not listed in the <literal>EXCEPT</literal>
> +   clause, whether they are published or not still depends on the value of
> +   <link linkend="sql-createpublication-params-with-publish-generated-columns">
>     <literal>publish_generated_columns</literal></link>. See
>     <xref linkend="logical-replication-gencols"/> for details.
>    </para>
>
> ~~
>
> For this part:
>
> "Generated columns can be specified in a column list using the
> <literal>EXCEPT</literal> clause. This excludes the specified
> generated columns from being published, regardless of..."
>
> I think the whole paragraph already said "Generated columns can also
> be specified in a column list", so you don't need to repeat it.
> Instead, maybe say something like below.
>
> SUGGESTION
> Specifying generated columns in a column list using the
> <literal>EXCEPT</literal> clause excludes those columns from being
> published, regardless of...
>
> ~~~
>
Modified

> 3.
> -                               Publication p1
> -  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -----------+------------+---------+---------+---------+-----------+----------
> - postgres | f          | t       | t       | t       | t         | f
> +                                        Publication p1
> + Owner  | All tables | Inserts | Updates | Deletes | Truncates |
> Generated columns | Via root
> +--------+------------+---------+---------+---------+-----------+-------------------+----------
> + ubuntu | f          | t       | t       | t       | t         | none
>              | f
>  Tables:
>      "public.t1" (id, a, b, d)
> +    "public.t2" EXCEPT (a, d)
>  </programlisting></para>
>
>
> I noticed the Owner changed from "postgres" to "ubuntu". Do you think
> it is better to keep this as "postgres" for the example?
I agree that it is better to keep "postgres". I have reverted back to
the use "postgres"..

>
> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 4.
> The tables added to a publication that publishes UPDATE and/or DELETE
> operations must have REPLICA IDENTITY defined. Otherwise those
> operations will be disallowed on those tables.
>
> In order for UPDATE or DELETE operations to work, all the REPLICA
> IDENTITY columns must be published. So, any column list must name all
> REPLICA IDENTITY columns, and any EXCEPT column list must not name any
> REPLICA IDENTITY columns.
>
> A row filter expression (i.e., the WHERE clause) must contain only
> columns that are covered by the REPLICA IDENTITY, in order for UPDATE
> and DELETE operations to be published. For publication of INSERT
> operations, any column may be used in the WHERE expression. The row
> filter allows simple expressions that don't have user-defined
> functions, user-defined operators, user-defined types, user-defined
> collations, non-immutable built-in functions, or references to system
> columns.
>
> The generated columns that are part of the column list specified with
> the EXCEPT clause are not published, regardless of the
> publish_generated_columns option. However, generated columns that are
> not part of the column list specified with the EXCEPT clause are
> published according to the value of the publish_generated_columns
> option. See Section 29.6 for details.
>
> The generated columns that are part of REPLICA IDENTITY must be
> published explicitly either by listing them in the column list or by
> enabling the publish_generated_columns option, in order for UPDATE and
> DELETE operations to be published.
>
> ~~
>
> Notice all those 5 paragraphs (above) are talking about REPLICA
> IDENTITY, except the 4th paragraph. Maybe the 4th paragraph should be
> moved to last, to keep all the REPLICA IDENTITY stuff together.
>
Fixed

> ======
> src/backend/catalog/pg_publication.c
>
> 5. pub_form_cols_map
>
>   * Returns a bitmap representing the columns of the specified table.
>   *
>   * Generated columns are included if include_gencols_type is
> - * PUBLISH_GENCOLS_STORED.
> + * PUBLISH_GENCOLS_STORED. Columns that are in the exceptcols are excluded from
> + * the column list.
>   */
>  Bitmapset *
> -pub_form_cols_map(Relation relation, PublishGencolsType include_gencols_type)
> +pub_form_cols_map(Relation relation, PublishGencolsType include_gencols_type,
> +   Bitmapset *except_cols)
>
> Forgot to add the underscore in the function comment.
>
> /exceptcols/except_cols/
>
Fixed

> ~~~
>
> 6. pg_get_publication_tables
>
> +
> + /*
> + * We fetch pubtuple if publication is not FOR ALL TABLES and not
> + * FOR TABLES IN SCHEMA. So if prexcept is true, it indicates that
> + * prattrs contains columns to be excluded for replication.
> + */
> + exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
> +   Anum_pg_publication_rel_prexcept,
> +   &isnull);
> +
> + if (!isnull && DatumGetBool(exceptDatum) && !nulls[2])
> + except_columns = pub_collist_to_bitmapset(NULL, values[2], NULL);
>
> But, you cannot have EXCEPT for null column list, so shouldn't the
> !nulls[2] check be done to also guard the SysCacheGetAttr call?
>
Fixed

> ======
> src/backend/parser/gram.y
>
> 7.
>
> Shlok wrote [1-reply #11]
> The main reason I used a separate 'opt_except_column_list' is because
> 'opt_column_list' can also be NULL. But the column list specified with
> EXCEPT not be NULL. So, 'opt_except_column_list' is defined such that
> it cannot be null.
>
> ~
>
> Yeah, but IMO that leads to excessive duplicated code. I think the
> code can perhaps be a lot simpler if the grammar is written more like
> the synopsis:
>
> e.g. TABLE name opt_EXCEPT opt_column_list
>
> where - opt_EXCEPT is null, and opt_column_list is null... means no col list
> where - opt_EXCEPT is null, and opt_column_list is not null... means
> normal col list
> where - opt_EXCEPT is not null, and opt_column_list not null... means
> EXCEPT col list
> where - opt_EXCEPT is not null, and opt_column_list null... SYNTAX ERROR
>
> So code it something like this (just adding opt_EXCEPT to the existing
> productions)
>
> %type <boolean> opt_ordinality opt_without_overlaps opt_EXCEPT
> ...
> opt_EXCEPT:
> EXCEPT { $$ = true; }
> | /*EMPTY*/ { $$ = false; }
> ;
> ...
> TABLE relation_expr opt_EXCEPT opt_column_list OptWhereClause
> {
>   $$ = makeNode(PublicationObjSpec);
>   $$->pubobjtype = PUBLICATIONOBJ_TABLE;
>   $$->pubtable = makeNode(PublicationTable);
>   $$->pubtable->relation = $2;
>   $$->pubtable->except = $3;
>   $$->pubtable->columns = $4;
>   if ($3 && !$4)
>     ereport(ERROR,
>       (errcode(ERRCODE_SYNTAX_ERROR),
>       errmsg("EXCEPT without column list"),
>       parser_errposition(@3)));
>   $$->pubtable->whereClause = $5;
>   $$->location = @1;
> }
>
> etc.
>
I have modified it. I have created a function 'check_except_collist'
to throw error, to avoid duplication code for error message.

> ======
> src/bin/psql/describe.c
>
> 8.
>   if (!PQgetisnull(res, i, 3))
> + {
> + if (!PQgetisnull(res, i, 4) && strcmp(PQgetvalue(res, i, 4), "t") == 0)
> + appendPQExpBuffer(buf, " EXCEPT");
>   appendPQExpBuffer(buf, " (%s)", PQgetvalue(res, i, 3));
> + }
>
> This growing list of columns makes it hard to understand this function
> without looking back at the caller all the time. Maybe you can add a
> function comment that at least explains what those attributes 1,2,3,4
> represent?
>
Added a comment

> ======
> src/bin/psql/tab-complete.in.c
>
> 9.
> + else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|SET",
> "TABLE", MatchAny))
> + COMPLETE_WITH("EXCEPT");
>
> Since it is not allowed to have an EXCEPT with no column list,
> shouldn't this say "EXCEPT ("?
>
Fixed

> ~~~
>
> 10.
>   else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE",
> MatchAny) && !ends_with(prev_wd, ','))
> - COMPLETE_WITH("WHERE (", "WITH (");
> + COMPLETE_WITH("EXCEPT", "WHERE (", "WITH (");
>
> Ditto. Since it is not allowed to have an EXCEPT with no column list,
> shouldn't this say "EXCEPT ("?
>
Fixed

>
> ======
> src/test/regress/expected/publication.out
>
> 11.
> +-- Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
> +CREATE UNIQUE INDEX pub_test_except1_ac_idx ON pub_test_except1 (a, c);
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
> pub_test_except1_a_idx;
> +ERROR:  index "pub_test_except1_a_idx" for table "pub_test_except1"
> does not exist
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
> +ERROR:  cannot update table "pub_test_except1"
> +DETAIL:  Column list used by the publication does not cover the
> replica identity.
> +DROP INDEX pub_test_except1_ac_idx;
>
>
> What's happening here? I'm not sure these are the kind of errors you
> were trying to cause.
>
Yes, it is not the error I was trying to cause. I have modified it.

> ======
> src/test/regress/sql/publication.sql
>
> 12.
> +-- Verify that EXCEPT col-list cannot contain RI cols (when using RI FULL)
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY FULL;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
>
>
> SUGGESTION. Change that comment to:
> Verify fails - EXCEPT col-list cannot...
>
Fixed

> ~~~
>
> 13.
> +-- Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
> +CREATE UNIQUE INDEX pub_test_except1_ac_idx ON pub_test_except1 (a, c);
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
> pub_test_except1_a_idx;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
> +DROP INDEX pub_test_except1_ac_idx;
>
> SUGGESTION. Change that comment to:
> Verify fails - EXCEPT col-list cannot...
>
Fixed

> ~~~
>
> 14.
> +-- Verify that so long as no clash between RI cols and the EXCEPT
> +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a);
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
> pub_test_except1_a_idx;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
> +
>
> That comment doesn't make sense. Missing words?
>
Fixed

> ======
> .../t/036_rep_changes_except_table.pl
>
> 15.
> (I haven't reviewed this file in detail yet, but here is a general comment)
>
> I know this patch currently lives in the same thread as all the EXCEPT
> TABLE stuff, but that seems just happenstance to me. IMO, this is a
> separate enhancement that just shares the keyword EXCEPT. So, I felt
> it should have quite separate tests too.
>
> e.g. How about: 037_rep_changes_except_collist.pl
>
Modified

> ======
> [1] https://www.postgresql.org/message-id/CANhcyEW2LK4diNeCG862DE40yQoV3VAgf59kXUq2TuR8fnw5vQ%40mail.gmail.com

Thanks,
Shlok Kyal

Attachment

pgsql-hackers by date:

Previous
From: Greg Sabino Mullane
Date:
Subject: Re: display hot standby state in psql prompt
Next
From: Shlok Kyal
Date:
Subject: Re: Skipping schema changes in publication