Re: Added schema level support for publication. - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Added schema level support for publication.
Date
Msg-id CAA4eK1+ZYanA51c9NzKM31AqJSw-j0-edGz91+Vh-nsoKdzKfQ@mail.gmail.com
Whole thread Raw
In response to Re: Added schema level support for publication.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Dec 14, 2024 at 5:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I suppose this exchange is what led to this bit in
> getPublicationNamespaces:
>
>         /*
>          * We always dump publication namespaces unless the corresponding
>          * namespace is excluded from the dump.
>          */
>         if (nspinfo->dobj.dump == DUMP_COMPONENT_NONE)
>             continue;
>
> I'd like to push back against this on three separate grounds:
>
>
> 1. The behavior this produces is extremely non-obvious and not
> adequately explained by the comment, which makes one wonder how
> much of it was intended.  For example:
>
> * The public schema will be included if listed in FOR ALL TABLES IN,
> even though it's not dumped explicitly in the dump, because its dump
> mask includes other bits besides DUMP_COMPONENT_DEFINITION.  OK, that
> was probably intentional, but you wouldn't know it from the comment.
>
> * Schemas created within extensions will be included if listed in FOR
> ALL TABLES IN, even though they're not dumped explicitly in the dump.
> This seems like a quite accidental by-product of the fact that
> checkExtensionMembership will set DUMP_COMPONENT_ACL on extension
> member objects, thus making their dump mask not NONE.  If this
> behavior was intentional, it needs a less-fragile implementation.
>
> * The information_schema will NOT be included, even if it was listed in
> FOR ALL TABLES IN.  Admittedly, information_schema doesn't normally
> contain any tables that'd be useful to publish.  But still, this seems
> like randomly ignoring the user's intent.
>
>
> 2. The complaint was that if a schema is excluded from the dump
> by --exclude-schema, then it should not get included in the
> publication either.  I think this is at best highly debatable:
> arguably it amounts to breaking the publication.  It seems
> analogous to deciding that if a function is excluded from the
> dump, while a view using the function is included, we should
> silently adjust the view by removing the output columns or
> WHERE clauses that use the function.  I'm pretty sure that
> nobody would think that was sane.  Perhaps there's a case for
> excluding the view as a whole, but we don't do that.  Besides, the
> corresponding behavior would be to exclude the whole publication,
> not silently modify its definition.
>
>
> 3. The corresponding test for individual tables listed in
> a publication is coded differently:
>
>         /*
>          * Ignore publication membership of tables whose definitions are not
>          * to be dumped.
>          */
>         if (!(tbinfo->dobj.dump & DUMP_COMPONENT_DEFINITION))
>             continue;
>
> This is considerably easier to understand the effects of than a test
> on the whole dump mask: it will list the table if we intend to emit
> CREATE TABLE, and not otherwise, regardless of side issues like ACLs.
> But why is it different from the code for schemas?
>
>
> So I think that this was just wrongly thought through.  My
> preference would be to either delete the above-quoted bit in
> getPublicationNamespaces entirely, or make it look like the
> test in getPublicationTables.  Or maybe we should delete
> both of these tests, on the grounds that redefining the
> contents of the publication is far outside pg_dump's charter.
>

I see a merit in your second suggestion which is to delete these tests
in getPublicationTables() and getPublicationNamespaces() because we
follow similar behavior in the somewhat related subscription case as
well. When a subscription points to a set of publications and we use
'--no-publications' option in pg_dump, it still dumps the
subscription. I tried it with the following test:

Publisher:
postgres=# create schema s1;
CREATE SCHEMA
postgres=# create table t1(c1 int);
CREATE TABLE
postgres=# create publication pub1 for table t1;
CREATE PUBLICATION
postgres=# create publication pub2 for tables in schema s1;
CREATE PUBLICATION

Subscriber:
postgres=# create table t1 (c1 int);
CREATE TABLE
postgres=# create publication pub_on_sub_1 for table t1;
CREATE PUBLICATION
postgres=# create subscription sub1 connection 'dbname = postgres'
publication pub1, pub2;
NOTICE:  created replication slot "sub1" on publisher
CREATE SUBSCRIPTION

Now when I performed the dump with '--no-publications' option on the
subscriber node, it didn't include publications which is expected but
did include a subscription definition pointing to the publications as
defined originally.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Regression tests fail on OpenBSD due to low semmns value
Next
From: John Naylor
Date:
Subject: Re: Proposal for Updating CRC32C with AVX-512 Algorithm.