Re: Skipping schema changes in publication - Mailing list pgsql-hackers

From Shlok Kyal
Subject Re: Skipping schema changes in publication
Date
Msg-id CANhcyEU+aPu6iAH2cTA0cDtn3pd6c_njBONCt3FubYZoEEnm8Q@mail.gmail.com
Whole thread Raw
In response to Re: Skipping schema changes in publication  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Thu, 26 Jun 2025 at 09:06, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> Below are some review comments for v14-0003
>
> ======
> 1. GENERAL
>
> Since the new syntax uses EXCEPT, then, in my opinion, you should try
> to use that same term where possible when describing things. I
> understand it is hard to do this in text and I agree often it makes
> more sense to say "exclude" columns etc, but OTOH in the code there
> are lots of places where you could have named vars/params differently:
> e.g. 'except_collist' instead of 'exclude_collist' might have been
> better.
>
Fixed the variable names.

> ======
> Commit message
>
> 2.
> Column list specifed with EXCEPT is stored in column "prattrs" in table
> "pg_publication_rel" and also column "prexcept" is set to "true", to maintain
> the column list that user wants to exclude from the publication.
>
> ~
>
> That paragraph could do with some rewording. For example, AFAIK,
> "prattrs" is for all column lists -- not just except col-lists, but
> the way it is described here sounds different.
>
> Also, /specifed/specified/
>
Reworded the paragraph

> ======
> doc/src/sgml/catalogs.sgml
>
> 3. (52.42. pg_publication_rel)
>
>        <para>
> -       True if the relation must be excluded
> +       True if the relation or column list must be excluded. If publication is
> +       created <literal>FOR ALL TABLES</literal> and it is specified as true,
> +       the relation should be excluded. Else if it is true the columns in
> +       <literal>prattrs</literal> should be excluded from being published.
>        </para></entry>
>
> I felt this could be expressed more simply without mentioning anything
> about FOR ALL TABLES.
>
> SUGGESTION
> True if the column list or relation must be excluded from publication.
> If a column list is specified in <literal>prattrs</literal>, then
> exclude only those columns. If <literal>prattrs</literal> is NULL,
> then exclude the entire relation.
>
Fixed

> ======
> doc/src/sgml/logical-replication.sgml
>
> 4. (29.5. Column Lists)
>
>    <para>
> -   Each publication can optionally specify which columns of each table are
> -   replicated to subscribers. The table on the subscriber side must have at
> -   least all the columns that are published. If no column list is specified,
> -   then all columns on the publisher are replicated.
> +   Each publication can optionally specify which columns of each
> table should be
> +   replicated or excluded from replication. On the subscriber side, the table
> +   must include at least all the columns that are published. If no column list
> +   is provided, all columns from the publisher are replicated by default.
>     See <xref linkend="sql-createpublication"/> for details on the syntax.
>    </para>
>
> I felt this patch may have changed too much text. IMO, you only needed
> to say "... are replicated or excluded from replication.". The other
> changes did not seem necessary.
>
> ~~~
Fixed

> 5.
>    <para>
> -   If no column list is specified, any columns added to the table later are
> -   automatically replicated. This means that having a column list which names
> -   all columns is not the same as having no column list at all.
> +   If no column list or a column list with EXCEPT is specified, any columns
> +   added to the table later are automatically replicated. This means
> that having
> +   a column list which names all columns is not the same as having no
> +   column list at all. If an column list is specified, any columns added to the
> +   table later are automatically replicated.
>    </para>
>
> 5a.
> "This means that having a column list which names all columns is not
> the same as having no column list at all." -- That note does not make
> sense when you say EXCEPT. I think some rewording is needed here.
>
Fixed

> ~
>
> 5b.
> "If an column list is specified, any columns added to the table later
> are automatically replicated.".
>
> This made no sense -- some words missing?
>
This change was done by mistake. Removed it.

> ~~~
>
> 6.
>     Generated columns can also be specified in a column list. This allows
>     generated columns to be published, regardless of the publication parameter
>     <link linkend="sql-createpublication-params-with-publish-generated-columns">
> -   <literal>publish_generated_columns</literal></link>. See
> -   <xref linkend="logical-replication-gencols"/> for details.
> +   <literal>publish_generated_columns</literal></link>. Generated columns can
> +   be included in column list specified with EXCEPT clause if publication
> +   parameter
> +   <link linkend="sql-createpublication-params-with-publish-generated-columns">
> +   <literal>publish_generated_columns</literal></link> is not set to
> +   <literal>none</literal>. Specified generated columns will not be published.
> +   See <xref linkend="logical-replication-gencols"/> for details.
>    </para>
>
> I am not so sure about this. It seemed overly strict to me.
>
> Why can't it simply say:
> "Generated columns can also be specified in a column list. This allows
> generated columns to be published or excluded, regardless of the
> publication parameter..."
>
> Specifically, I don't know why you need to say:
> Generated columns can be included in column list specified with EXCEPT
> clause if publication parameter publish_generated_columns is not set
> to none. Specified generated columns will not be published.
>
> IIUC, then EXCEPT (gencol1, gencol2) is saying to exclude the named
> cols. So if param is "stored", then the named cols will be excluded.
> OTOH, if param is "none" then all generated cols will be excluded
> anyway, so why not just allow the EXCEPT (gencol,gencol2) here as
> well, because the result will be the same.
>
>
I have removed this change. And allowed specifying generated columns
in EXCEPT column list as well irrespective of value of
‘publish_generated_columns’.

> ~~~
>
> 7. (29.5.1. Examples)
>
>     <para>
> -    Create a table <literal>t1</literal> to be used in the following example.
> +    Create tables <literal>t1</literal>, <literal>t2</literal> to be
> used in the
> +    following example.
>
> /Create tables t1, t2/Create tables t1 and t2/
>
Fixed

> ~~~
>
> 8.
>     <para>
>      Create a publication <literal>p1</literal>. A column list is defined for
> -    table <literal>t1</literal> to reduce the number of columns that will be
> -    replicated. Notice that the order of column names in the column list does
> -    not matter.
> +    table <literal>t1</literal> and a column list is defined for table
> +    <literal>t2</literal> with EXCEPT clause to reduce the number of
> columns that will be
> +    replicated. Notice that the order of column names in the column
> lists does not matter.
>
> BEFORE
> A column list is defined for table t1 and a column list is defined for
> table t2...
>
> SUGGESTION (added comma, etc.)
> A column list is defined for table t1, and another column list is
> defined for table t2...
>
Fixed

> ~~~
>
> 9.
> The final example still says:
> "Only data from the column list of publication p1 is replicated."
>
> That doesn't seem quite appropriate now that you also have an EXCEPT
> column list.
>
> SUGGESTION:
> Only data specified by the column lists of publication p1 is replicated.
>
Fixed

> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 10.
> +     <para>
> +      When a column list is specified with EXCEPT, the named columns are not
> +      replicated. Specifying a column list has no effect on
> +      <literal>TRUNCATE</literal> commands.
> +     </para>
>
> I felt that to be clearer the preceding paragraph should be changed as follows:
>
> /When a column list is specified, only the named columns are
> replicated./When a column list without EXCEPT is specified, only the
> named columns are replicated./
>
Fixed

> ~~~
>
> 11. CREATE PUBLICATION (NOTES section)
>
> 11a.
> The NOTES talk about replica identity columns -- should you mention EXCEPT here?
>
Added notes for EXCEPT

> ~
>
> 11b.
> The NOTES talk about generated columns -- should you mention EXCEPT here?
>
I felt it is not needed.

> ======
> src/backend/catalog/pg_publication.c
>
> 12. check_and_fetch_column_list
>
> + if (!isnull)
> + except = DatumGetBool(cfdatum);
> +
> + *except_columns = except && !pub->alltables;
>
> AFAICT, you can Assert(!pub->alltables) because you already checked
> that earlier up front.
> So you don't need 'except' var either. Just assign *except_cols up
> front and then overwrite it later if true.
>
> SUGGESTION:
>
> *except_cols = false;
>
> if (pub->alltables)
>   return false;
> ...
> if (!isnull)
>  *except_cols = DatumGetBool(cfdatum);
>
Fixed

> ~~~
>
> 13. publication_add_relation
>
>   /* Validate and translate column names into a Bitmapset of attnums. */
> - attnums = pub_collist_validate(pri->relation, pri->columns);
> + attnums = pub_collist_validate(pri->relation, pri->columns,
> +    pri->except && !pub->alltables,
> +    pub->pubgencols_type);
>
>
> I am wondering why we are even calling a function to validate column
> lists if pub->alltables was true. AFAIK, that combination of
> column-lists and FOR ALL TABLES is not even possible, so the code
> seems strange.
>
Fixed

> ~~~
>
> 14. pub_exclude_collist_validate
> .
> + /*
> + * Check if column list specified with EXCEPT have any stored
> + * generated column and 'publish_generated_columns' is not set to
> + * 'stored'.
> + */
> + if (except_columns &&
> + TupleDescAttr(tupdesc, attnum - 1)->attgenerated ==
> ATTRIBUTE_GENERATED_STORED &&
> + pubgencols_type != PUBLISH_GENCOLS_STORED)
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> + errmsg("cannot use stored generated column \"%s\" in publication
> column list specified with EXCEPT when \"%s\" set to \"%s\"",
> +    colname, "publish_generated_columns", "stored"));
>
> As mentioned in the above DOCS comments, I was having doubts about why
> we have this error.
>
> If the parameter says "none", then generated columns will not be
> replicated, so why should we care if the user also says
> EXCEPT(gencol1,gencol2). Either way, the result will be the same; the
> generated column will not be published.
>
Removed this restriction.

> ~~~
>
> 15. GetRelationPublications
>
>   {
>   HeapTuple tup = &pubrellist->members[i]->tuple;
>   Oid pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid;
> + HeapTuple pubtup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid));
> + bool is_table_excluded = ((Form_pg_publication)
> GETSTRUCT(pubtup))->puballtables &&
> + ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept;
>
> - if (except_flag == ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept)
> + if (except_flag == is_table_excluded)
>   result = lappend_oid(result, pubid);
> +
> + ReleaseS
>
>
> I'm not 100% sure you need the additional 'pubtup'... Can't you just
> look at the "prattrs" field to see if a column-list was specified? If
> "prattrs" is null and "prexcept" is true, isn't that the same
> combination as what you are looking for here?
>
Yes, we can use this combination as well. Fixed it in latest patch.

> ~~~
>
> 16. pg_get_publication_tables
>
> + columnsDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
> +    Anum_pg_publication_rel_prattrs,
> +    &(nulls[2]));
> +
> + /* if column list is specified with EXCEPT */
> + if (!pub->alltables && except)
> + columns = pub_collist_to_bitmapset(NULL, columnsDatum, NULL);
> + else
> + values[2] = columnsDatum;
>
> 16a.
> Something seems fishy here. Isn't there a pathway where you missed
> assigning value[2] to anything?
>
Modified this change.

> ~
>
> 16b.
> Also, I feel there should be some other boolean variable used here
> instead of checking bot (!pub->alltables && except) in multiple
> places.
>
Fixed
>
> ======
> src/backend/replication/pgoutput/pgoutput.c
>
> 17. RelationSyncEntry
> +
> + /* Indicate if no column is included in the publication */
> + bool no_cols_published;
>
> Maybe this can have a more explanatory comment to explain why it is needed?
>
Fixed

> ~~~
>
> 18. check_and_init_gencol
>
> + bool found = false;
> + bool except_columns = false;
> +
> + found = check_and_fetch_column_list(pub, entry->publish_as_relid, NULL,
> + NULL, &except_columns);
> +
>   /*
>   * The column list takes precedence over the
>   * 'publish_generated_columns' parameter. Those will be checked later,
> - * see pgoutput_column_list_init.
> + * see pgoutput_column_list_init. But when a column list is specified
> + * with EXCEPT, it should be checked.
>   */
> - if (check_and_fetch_column_list(pub, entry->publish_as_relid, NULL, NULL))
> + if (found && !except_columns)
>   continue;
>
> The variable 'found' seems a poor name; how about 'has_column_list' or similar?
>
Fixed

> ~~~
>
> 19. pgoutput_change
>
> + /*
> + * If all columns of a table is present in column list specified with
> + * EXCEPT, skip publishing the changes.
> + */
> + if (relentry->no_cols_published)
> + return;
>
> /is present/are present/
>
fixed

> ======
> src/bin/pg_dump/pg_dump.c
>
> 20. getPublicationTables
>
> + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs))
>   pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL;
> + else
> + pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
>
>   pubrinfo[j].dobj.catId.tableoid =
>   atooid(PQgetvalue(res, i, i_tableoid));
> @@ -4797,6 +4797,7 @@ getPublicationTables(Archive *fout, TableInfo
> tblinfo[], int numTables)
>   pubrinfo[j].pubrelqual = NULL;
>   else
>   pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
> + pubrinfo[j].pubexcept = (strcmp(prexcept, "t") == 0);
>
>
> Why not assign pubrinfo[j].pubexcept earlier so you don't have to
> repeat the strcmp?
>
Fixed

> ~~~
>
> 21.
> - if (strcmp(prexcept, "t") == 0)
> + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs))
>   simple_ptr_list_append(&exceptinfo, &pubrinfo[j]);
>
> Why not assign pubrinfo[j].pubexcept earlier so you don't have to
> repeat the strcmp? Same also for the PQgetisnull(res, i,
> i_prattrs))...
>
Fixed

> ~~~
>
> 22. dumpPublicationTable
>
>   if (pubrinfo->pubrattrs)
> - appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs);
> + {
> + if (pubrinfo->pubexcept)
> + appendPQExpBuffer(query, " EXCEPT (%s)", pubrinfo->pubrattrs);
> + else
> + appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs);
> + }
>
> SUGGESTION
> {
>   if (pubrinfo->pubexcept)
>     appendPQExpBuffer(query, " EXCEPT");
>
>   appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs);
> }
Fixed

I have addressed the comments shared by you and shared the updated v15
patch set here.

Thanks and Regards,
Shlok Kyal

Attachment

pgsql-hackers by date:

Previous
From: Andrey Borodin
Date:
Subject: Re: Backpatching injection point core facilities to REL_17_STABLE
Next
From: Shlok Kyal
Date:
Subject: Re: Skipping schema changes in publication