Re: Skipping schema changes in publication - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Skipping schema changes in publication
Date
Msg-id CAHut+Ps0hSNqrjv_jT1AuXxO-CrZue3ixE0jKsxVhtArMrkujQ@mail.gmail.com
Whole thread Raw
In response to Re: Skipping schema changes in publication  (Shlok Kyal <shlok.kyal.oss@gmail.com>)
List pgsql-hackers
On Thu, Jun 19, 2025 at 4:42 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
...
> > 3.
> > TBH, I was wondering why a new catalog attribute was necessary...
> >
> > Can't you simply re-use the existing attribute "prattrs" attribute.
> > e.g. let's just define negative means exclude.
> >
> > e.g. a value of 1 3 means only the 1st and 3rd columns are published
> > e.g. a value of -1 -3 means all columns except 1st and 3rd columns are published
> > e.g. a value of null mean all columns are published
> >
> > (mixes of negative and positive will not be possible)
> >
>
> Currently I have added a new attribute 'prexcludeattrs' in
> pg_publication_rel table. I used this approach because it will be
> easier for user to get the exclude column list, in code no extra
> processing is required to get the exclude column list.
>
> For an approach to use negative numbers for exclude columns. I see an
> advantage that we do not need to introduce a new column for
> pg_publication_rel. But in code, each time we want to get a column
> list or exclude column list we need an extra processing of 'prattrs'
> columns. Also I don't see any existing catalog table using a negative
> attribute for column list.
>
> Based on above observations, I feel that the current is better.
>
> Please correct me if I missed an advantage for the approach you suggested.
>

OK. Maybe using negative numbers was a bridge too far...

But IMO it is not good to have 2 separate attributes for the lists.
Doing so implies they can coexist, but that is not true. I felt there
are not really 2 "kinds" of columns list anyway -- there is just a
"column list" which defines columns that are either included or
excluded from the publication determined by EXCEPT.

Having  dual lists gets weird/confusing to describe them -- you end up
continually having to refer to the other one to clarify behaviour.

e.g. Does 'prattrs' value NULL mean publish everything? Well, no...
that depends if there is a non null 'prexcludeattrs'
e.g. Does 'prexcludeattrs' value NULL mean publish everything? Well,
no... that depends if there is a non null 'prattrs'

Furthermore, all the code is doubling up referring to "column list"
and "exclude column list"  -- code / docs / comments / error messages.
There are quite a lot of places the patch touches that I thought were
not really needed if you don't have 2 different kinds of column-lists.

To summarise, I felt it would be better to just keep the existing
'prattrs' as the one-and-only column list, but add another BOOLEAN
attribute to flag whether 'prattrs' columns should be included or
excluded.

prattrs;   prattrs_exclude;  Means
--------------------------------------------
1 2 3     f                          only cols 1,2,3 will be published
4 5 6     t                          only cols 4,5,6 will NOT be published
null       f                          all cols are published (flag is ignored)
null       t                          all cols are published (flag is ignored)

> > 5.
> > +  <para>
> > +   Alter publication <structname>mypublication</structname> to add table
> > +   <structname>users</structname> except column
> > +   <structname>security_pin</structname>:
> > +<programlisting>
> > +ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin);
> >
> > Those tags don't seem correct. e.g. "users" and "security_pin" are not
> > <structname> (???).
> >
> > Perhaps, every other example here is wrong too and you just copied
> > them? Anyway, something here looks wrong to me.
> >
> I saw different documents and usage of tags seems not well defined.
> For example for table we are using tags in document
> create_publication.sgml, update.sgml <structname> is used, in document
> table.sgml, advanced.sgml <classname> is used, and in
> logical-replication.sgml <literal>  is used. Similarly for column
> names <structname>, <structfield> or <literal> are used in different
> parts of the document.
>
> I kept the changed tag to <structfield> for the column for this patch.
> Do you have any suggestions?

No, for this patch I think it is best that you just follow nearby code
(as you are already doing). I plan to raise another thread to ask what
are the guidelines for this  sort of markup which is currently used
inconsistently in different places.

//////////

Below are a few more review comments for v13-0003

======
Commit message

1.
Typo /THe/The/

~~~

2.
The new syntax allows specifying excluded column list when creating or
altering a publication. For example:
CREATE PUBLICATION pubname FOR TABLE tabname EXCEPT (exclude_column_list)
or
ALTER PUBLICATION pubname ADD TABLE tabname EXCEPT (exclude_column_list)

~

I felt since you say these "For example:" it would be better to give
real examples.
e.g. say "(col1,col2,col3)" instead of "(exclude_column_list)".

~~~

3.
Typo /family of command/family of commands/

======
doc/src/sgml/logical-replication.sgml

4.
I am not sure that it was a good idea to be making a new term called
an "exclude column list"... because in introduces a new concept of
something that sounds like it is a different kind of list, and now you
have to keep referring everywhere to both to "column list" versus
"exclude column list". All the doubling up add more complication I
think.

IMO really there is just a "column list". Whether that list is for
exclusion or not just depends on the presence of EXCEPT. So I felt
maybe all places mentioning "exclude column list" could be rephrased.

======
src/backend/catalog/pg_publication.c

5.
+/*
+ * Returns true if the relation has exluded column list associated with the
+ * publication, false otherwise.
+ *
+ * If a exclude column list is found, the corresponding bitmap is returned
+ * through the cols parameter, if provided. The bitmap is constructed
within the
+ * given memory context (mcxt).
+ */
+

Typo /exluded column/an excluded column/
Typo /exclude column list/excluded column list/

~~~

6.
+/*
+ * pub_exclude_collist_validate
+ * Process and validate the 'excluded columns' list and ensure the columns
+ * are all valid to exclude from publication.  Checks for and raises an
+ * ERROR for any unknown columns, system columns, duplicate columns, or
+ * generated columns.
+ *

Why can't you exclude generated columns?

e.g. Maybe PUBLICATION says publish_generated_columns=stored and there
are 100s of such columns, but the user just wants to exclude one of
them. Why say they cannot do that? Hmm. Perhaps this is being already
handled elsewhere, in which case this comment still seems misleading.

======
src/backend/commands/publicationcmds.c

7.
+ * With REPLICA IDENTITY FULL, no column list and no excluded column
+ * list is allowed.

Really, just "no column list is allowed." same as it said before.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Robins Tharakan
Date:
Subject: leafhopper / snakefly failing to build HEAD - GCC bug
Next
From: shveta malik
Date:
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart