Re: Skipping schema changes in publication - Mailing list pgsql-hackers

From Shlok Kyal
Subject Re: Skipping schema changes in publication
Date
Msg-id CANhcyEW+uJB_bvQLEaZCgoRTc1=i+QnrPPHxZ2=0SBSCyj9pkg@mail.gmail.com
Whole thread Raw
In response to Re: Skipping schema changes in publication  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: Skipping schema changes in publication
List pgsql-hackers
On Mon, 11 Aug 2025 at 13:55, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> On Wed, Aug 6, 2025 at 11:11 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> ...
> > > 5.
> > >   Bitmapset  *cols = NULL;
> > > + bool except_columns = false;
> > > + bool no_col_published = false;
> > >
> > > There are multiple places in this patch that say:
> > >
> > > 'no_col_published'
> > > or 'no_cols_published'
> > >
> > > I felt this var name can be misunderstood because it is easy to read
> > > "no" as meaning "no." (aka number), and then misinterpret as
> > > "number_of_cols_published".
> > >
> > > Maybe an unambiguous name can be found, like
> > > - 'zero_cols_published' or
> > > - 'nothing_published' or
> > > - really make it 'num_cols_published' and check for 0.
> > >
> > > (so this comment applies to multiple places in the patch)
> > >
> > How about 'all_cols_excluded'? Or 'has_published_cols'?
> > I have used 'all_cols_excluded' in this patch. Thoughts?
>
> The new name is good.
>
> > > ======
> > > src/bin/psql/describe.c
> > >
> > > describeOneTableDetails:
> > >
> > > 7.
> > >   /* column list (if any) */
> > >   if (!PQgetisnull(result, i, 2))
> > > - appendPQExpBuffer(&buf, " (%s)",
> > > -   PQgetvalue(result, i, 2));
> > > + {
> > > + if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
> > > + appendPQExpBuffer(&buf, " EXCEPT (%s)",
> > > +   PQgetvalue(result, i, 2));
> > > + else
> > > + appendPQExpBuffer(&buf, " (%s)",
> > > +   PQgetvalue(result, i, 2));
> > > + }
> > >
> > > Isn't this code fragment (and also surrounding code) using the same
> > > logic as what is already encapsulated in the function
> > > addFooterToPublicationDesc()?
> > > Superficially, it seems like a large chunk can all be replaced with a
> > > single call to the existing function.
> > >
> > 'addFooterToPublicationDesc' is called when we use \dRp+ and print in format:
> > "schema_name.table_name" EXCEPT (column-list)
> > Whereas code pasted above is executed when we use \d+ table_name and
> > the output is the format:
> > "publication_name" EXCEPT (column-list)
> >
> > These pieces of code are used to print different info. One is used to
> > print info related to tables and the other is used to print info
> > related to publication.
> > Should we use a common function for this?
>
> It still seems like quite a lot of overlap. e.g. I thought there were
> ~30 lines common. OTOH, perhaps you'll need to pass another boolean to
> the function to indicate it is a "Publication:" footer. I guess you'd
> have to try it out first to see if the changes required to save those
> 30 LOC are worthwhile or not.
>
I have added the code changes for the same in this patch.

> >
> > > ======
> > > src/test/regress/expected/publication.out
> > >
> > > 8.
> > > +-- Syntax error EXCEPT without a col-list
> > > +CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT;
> > > +ERROR:  EXCEPT clause not allowed for table without column list
> > > +LINE 1: CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except...
> > > +                                               ^
> > >
> > > Is that a bad syntax position marker (^)? e.g. Why is it pointed at
> > > the word "TABLE" instead of "EXCEPT"?
> > >
> > In function 'preprocess_pubobj_list' the position of position marker
> > (^) is decided by "pubobj->location". Function handles multiple errors
> > and setting "$$->location" only specific to EXCEPT qualifier would not
> > be appropriate. One solution I feel is to not show "position marker
> > (^)" in the case of EXCEPT. Or maybe we can add a new variable to
> > 'PublicationTable' for except_location but I think we should not do
> > that. Thoughts?
>
> In the review comments below, I suggest putting this location back,
> but changing the message.
>
> >
> > For this version of patch, I have removed the "position marker (^)" in
> > the case of EXCEPT.
> >
>
> //////
>
> Here are my review comments for the patch v19-0003.
>
> ======
> 1. General - SGML tags in docs for table/column names.
>
> There is nothing to change just yet, but keep an eye on the thread
> [1],  because if/when that gets pushed, then there will several tags
> in this patch for table/column names that will need to be updated for
> consistency.
>
Noted

> ======
> src/backend/catalog/pg_publication.c
>
> pg_get_publication_tables:
>
> 2.
> +
> + if (!nulls[2])
> + {
> + Datum exceptDatum;
> + bool isnull;
> +
> + /*
> + * We fetch pubtuple if publication is not FOR ALL TABLES and
> + * not FOR TABLES IN SCHEMA. So if prexcept is true, it
> + * indicates that prattrs contains columns to be excluded for
> + * replication.
> + */
> + exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
> +   Anum_pg_publication_rel_prexcept,
> +   &isnull);
> +
> + if (!isnull && DatumGetBool(exceptDatum))
> + except_columns = pub_collist_to_bitmapset(NULL, values[2], NULL);
> + }
>
> Maybe this should be done a few lines earlier, to keep all the
> values[2]/nulls[2] code together, ahead of the values[3]/nulls[3]
> code. Indeed, there is lots of other values[2]/nulls[2] logic that
> comes later in this function, so maybe it is better to do all of that
> first, instead of mingling it with values[3]/nulls[3].
>
> ======
> src/backend/commands/publicationcmds.c
>
> pub_contains_invalid_column:
>
> 3.
>   * 1. Ensures that all columns referenced in the REPLICA IDENTITY are covered
> - *    by the column list. If any column is missing, *invalid_column_list is set
> + *    by the column list and are not part of column list specified with EXCEPT.
> + *   If any column is missing, *invalid_column_list is set
>   *    to true.
>
> Whitespace problem here; there is some tab instead of space in this comment.
>
> Also /part of column list/part of the column list/
>
> ~~~
>
> AlterPublicationTables:
>
> 4.
>   bool isnull = true;
>   Datum whereClauseDatum;
>   Datum columnListDatum;
> + Datum exceptDatum;
>
> It's not necessary to have all these different Datum variables; they
> are only temporary storage. It might be simpler to use a single "Datum
> datum;" which is reused 3x.
>
> ~
>
> 5.
> + exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> +   Anum_pg_publication_rel_prexcept,
> +   &isnull);
> +
> + if (!isnull)
> + oldexcept = DatumGetBool(exceptDatum);
> +
>
> Isn't the 'prexcept' also used for EXCEPT TABLE as well as EXCEPT
> (column-list)? In other words, should the change to this function be
> done already in one of the earlier patches?
>
> ~
This code path is only executed when running ALTER PUBLICATION ... SET
TABLE and running this command on a  ALL TABLES publication throws an
error due to check by function 'CheckAlterPublication' . And EXCEPT
TABLE can only be used for ALL TABLES publications, I think it doesn’t
need to be moved to the 0002 patch.

>
> 6.
>   if (equal(oldrelwhereclause, newpubrel->whereClause) &&
> - bms_equal(oldcolumns, newcolumns))
> + bms_equal(oldcolumns, newcolumns) &&
> + oldexcept == newpubrel->except)
>
> The code comment about this code fragment should also mention EXCEPT.
>
> ======
> src/backend/parser/gram.y
>
> preprocess_pubobj_list:
>
> 7.
> + if (pubobj->pubtable && pubobj->pubtable->except &&
> + pubobj->pubtable->columns == NULL)
> + ereport(ERROR,
> + errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("EXCEPT clause not allowed for table without column list"));
> +
>
> Having the syntax error location (like before in v18) might be better,
> but since that location is associated with the TABLE, then the error
> message should also be reworded so the subject is the table.
>
> SUGGESTION
> errmsg("table without column list cannot use EXCEPT clause")
>
> ======
> src/bin/psql/describe.c
>
> describeOneTableDetails:
>
> 8.
> - if (pset.sversion >= 150000)
> + if (pset.sversion >= 190000)
>   {
>   printfPQExpBuffer(&buf,
>     "SELECT pubname\n"
>     "     , NULL\n"
>     "     , NULL\n"
> +   " , NULL\n"
>     "FROM pg_catalog.pg_publication p\n"
>     "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
>     "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
> @@ -3038,35 +3039,62 @@ describeOneTableDetails(const char *schemaname,
>     "                pg_catalog.pg_attribute\n"
>     "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
>     "        ELSE NULL END) "
> +   " , prexcept "
>     "FROM pg_catalog.pg_publication p\n"
>     " JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
>     " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
> -   "WHERE pr.prrelid = '%s'\n",
> -   oid, oid, oid);
> -
> - if (pset.sversion >= 190000)
> - appendPQExpBufferStr(&buf, " AND NOT pr.prexcept\n");
> +   "WHERE pr.prrelid = '%s' "
> +   "AND  c.relnamespace NOT IN (\n "
> +   " SELECT pnnspid FROM\n"
> +   " pg_catalog.pg_publication_namespace)\n"
>
> - appendPQExpBuffer(&buf,
>     "UNION\n"
>     "SELECT pubname\n"
>     " , NULL\n"
>     " , NULL\n"
> +   " , NULL\n"
>     "FROM pg_catalog.pg_publication p\n"
> -   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n",
> -   oid);
> -
> - if (pset.sversion >= 190000)
> - appendPQExpBuffer(&buf,
> -   "     AND NOT EXISTS (\n"
> -   " SELECT 1\n"
> -   " FROM pg_catalog.pg_publication_rel pr\n"
> -   " JOIN pg_catalog.pg_class pc\n"
> -   " ON pr.prrelid = pc.oid\n"
> -   " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n",
> -   oid);
> -
> - appendPQExpBufferStr(&buf, "ORDER BY 1;");
> +   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> +   "     AND NOT EXISTS (\n"
> +   " SELECT 1\n"
> +   " FROM pg_catalog.pg_publication_rel pr\n"
> +   " JOIN pg_catalog.pg_class pc\n"
> +   " ON pr.prrelid = pc.oid\n"
> +   " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n"
> +   "ORDER BY 1;",
> +   oid, oid, oid, oid, oid);
> + }
> + else if (pset.sversion >= 150000)
> + {
> + printfPQExpBuffer(&buf,
> +   "SELECT pubname\n"
> +   "     , NULL\n"
> +   "     , NULL\n"
> +   "FROM pg_catalog.pg_publication p\n"
> +   "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
> +   "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
> +   "WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
> +   "UNION\n"
> +   "SELECT pubname\n"
> +   "     , pg_get_expr(pr.prqual, c.oid)\n"
> +   "     , (CASE WHEN pr.prattrs IS NOT NULL THEN\n"
> +   "         (SELECT string_agg(attname, ', ')\n"
> +   "           FROM pg_catalog.generate_series(0,
> pg_catalog.array_upper(pr.prattrs::pg_catalog.int2[], 1)) s,\n"
> +   "                pg_catalog.pg_attribute\n"
> +   "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
> +   "        ELSE NULL END) "
> +   "FROM pg_catalog.pg_publication p\n"
> +   "     JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
> +   "     JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
> +   "WHERE pr.prrelid = '%s'\n"
> +   "UNION\n"
> +   "SELECT pubname\n"
> +   "     , NULL\n"
> +   "     , NULL\n"
> +   "FROM pg_catalog.pg_publication p\n"
> +   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> +   "ORDER BY 1;",
> +   oid, oid, oid, oid);
>
> I found these large SQL selects with 3x UNIONs are difficult to read.
> Maybe you can add more comments to describe the intention of each of
> the UNION SELECTs?
>
> ~~~
>
> 9.
>   /* column list (if any) */
>   if (!PQgetisnull(result, i, 2))
> - appendPQExpBuffer(&buf, " (%s)",
> -   PQgetvalue(result, i, 2));
> + {
> + if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
> + appendPQExpBuffer(&buf, " EXCEPT");
> + appendPQExpBuffer(&buf, " (%s)", PQgetvalue(result, i, 2));
> + }
>
> I did not find any regression test case where the "EXCEPT" col-list is
> getting output for a "Publications:" footer.
>
> ======
> [1] https://www.postgresql.org/message-id/aIELRMAviNiUL1ie%40momjian.us
>

I have addressed the comments and the changes in v20 patch.

Thanks,
Shlok Kyal

Attachment

pgsql-hackers by date:

Previous
From: Chao Li
Date:
Subject: Re: Improve hash join's handling of tuples with null join keys
Next
From: Andrey Borodin
Date:
Subject: Re: Fix for typo in UUIDv7 timestamp extraction