Re: Added schema level support for publication. - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Added schema level support for publication. |
Date | |
Msg-id | CAA4eK1+ZYanA51c9NzKM31AqJSw-j0-edGz91+Vh-nsoKdzKfQ@mail.gmail.com Whole thread Raw |
In response to | Re: Added schema level support for publication. (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Sat, Dec 14, 2024 at 5:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I suppose this exchange is what led to this bit in > getPublicationNamespaces: > > /* > * We always dump publication namespaces unless the corresponding > * namespace is excluded from the dump. > */ > if (nspinfo->dobj.dump == DUMP_COMPONENT_NONE) > continue; > > I'd like to push back against this on three separate grounds: > > > 1. The behavior this produces is extremely non-obvious and not > adequately explained by the comment, which makes one wonder how > much of it was intended. For example: > > * The public schema will be included if listed in FOR ALL TABLES IN, > even though it's not dumped explicitly in the dump, because its dump > mask includes other bits besides DUMP_COMPONENT_DEFINITION. OK, that > was probably intentional, but you wouldn't know it from the comment. > > * Schemas created within extensions will be included if listed in FOR > ALL TABLES IN, even though they're not dumped explicitly in the dump. > This seems like a quite accidental by-product of the fact that > checkExtensionMembership will set DUMP_COMPONENT_ACL on extension > member objects, thus making their dump mask not NONE. If this > behavior was intentional, it needs a less-fragile implementation. > > * The information_schema will NOT be included, even if it was listed in > FOR ALL TABLES IN. Admittedly, information_schema doesn't normally > contain any tables that'd be useful to publish. But still, this seems > like randomly ignoring the user's intent. > > > 2. The complaint was that if a schema is excluded from the dump > by --exclude-schema, then it should not get included in the > publication either. I think this is at best highly debatable: > arguably it amounts to breaking the publication. It seems > analogous to deciding that if a function is excluded from the > dump, while a view using the function is included, we should > silently adjust the view by removing the output columns or > WHERE clauses that use the function. I'm pretty sure that > nobody would think that was sane. Perhaps there's a case for > excluding the view as a whole, but we don't do that. Besides, the > corresponding behavior would be to exclude the whole publication, > not silently modify its definition. > > > 3. The corresponding test for individual tables listed in > a publication is coded differently: > > /* > * Ignore publication membership of tables whose definitions are not > * to be dumped. > */ > if (!(tbinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)) > continue; > > This is considerably easier to understand the effects of than a test > on the whole dump mask: it will list the table if we intend to emit > CREATE TABLE, and not otherwise, regardless of side issues like ACLs. > But why is it different from the code for schemas? > > > So I think that this was just wrongly thought through. My > preference would be to either delete the above-quoted bit in > getPublicationNamespaces entirely, or make it look like the > test in getPublicationTables. Or maybe we should delete > both of these tests, on the grounds that redefining the > contents of the publication is far outside pg_dump's charter. > I see a merit in your second suggestion which is to delete these tests in getPublicationTables() and getPublicationNamespaces() because we follow similar behavior in the somewhat related subscription case as well. When a subscription points to a set of publications and we use '--no-publications' option in pg_dump, it still dumps the subscription. I tried it with the following test: Publisher: postgres=# create schema s1; CREATE SCHEMA postgres=# create table t1(c1 int); CREATE TABLE postgres=# create publication pub1 for table t1; CREATE PUBLICATION postgres=# create publication pub2 for tables in schema s1; CREATE PUBLICATION Subscriber: postgres=# create table t1 (c1 int); CREATE TABLE postgres=# create publication pub_on_sub_1 for table t1; CREATE PUBLICATION postgres=# create subscription sub1 connection 'dbname = postgres' publication pub1, pub2; NOTICE: created replication slot "sub1" on publisher CREATE SUBSCRIPTION Now when I performed the dump with '--no-publications' option on the subscriber node, it didn't include publications which is expected but did include a subscription definition pointing to the publications as defined originally. -- With Regards, Amit Kapila.
pgsql-hackers by date: