Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. - Mailing list pgsql-hackers
From | Shubham Khanna |
---|---|
Subject | Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. |
Date | |
Msg-id | CAHv8RjLgLCbg-e0_5Ly_-8AojraV1R1g4BuaL-an7+iTLMycLg@mail.gmail.com Whole thread Raw |
In response to | Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. (Peter Smith <smithpb2250@gmail.com>) |
Responses |
Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.
RE: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. |
List | pgsql-hackers |
On Mon, Feb 17, 2025 at 9:49 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Shubham > > Some review comments for v7-0001. > > (I am late to this thread. If some of my comments have already been > discussed and rejected please let me know). > > ====== > 1 GENERAL. Option Name? > > Wondering why the patch is introducing more terminology like > "cleanup"; if we are dropping publications then why not say "drop"? > Also I am not sure if "existing" means anything because you cannot > cleanup/drop something that is not "existing". > > IOW, why not call this the "--drop-publications" option? > I have retained the option name as '--cleanup-existing-publications' for now. However, I understand the concern regarding terminology and will revise it in the next patch version once there is a consensus on the final name. > ====== > Commit message > > 2. > These publications, replicated during streaming replication, become redundant > after converting to logical replication and serve no further purpose. > > ~ > > From this description it seems there is an assumption that the only > publications on the target server are those that were physically > replicated to the standby. Is that strictly true? Isn't it also > possible that a user might have created their own publication on the > target server prior to running the pg_createsubscriber. So even if > they want all the physically replicated ones to be removed, they would > NOT want their own new publication to also get removed at the same > time. > > E.g. The original motivation thread [1] for this patch only said "But > what good is having the SAME publications as primary also on logical > replica?" (my emphasis of "same") so it seems there should be some > sort of name matching before just dropping everything. > > Actually.... The code looks like it might be doing the correct thing > already and only fetching the publication names from the source server > and then deleting only those names from the target server. But this > comment message didn't describe this clearly. > > ====== Modified the commit message. > doc/src/sgml/ref/pg_createsubscriber.sgml > > 3. > + <para> > + Remove all existing publications from specified databases on tne target > + server. > + </para> > > typo "tne" > > ====== Fixed. > src/bin/pg_basebackup/pg_createsubscriber.c > > struct CreateSubscriberOptions: > > 4. > + bool cleanup_existing_publications; /* drop all publications */ > > The field name seems overkill. e.g. As mentioned for the option, it > could be called 'drop' instead of cleanup. And the 'existing' seems > superfluous because you can only drop something that exists. So why > not just 'drop_publications'. Won't that have the same meaning? > > ~~~ > Fixed. > 5. > static void > -setup_subscriber(struct LogicalRepInfo *dbinfo, const char *consistent_lsn) > +setup_subscriber(struct LogicalRepInfo *dbinfo, const char *consistent_lsn, > + bool cleanup_existing_publications) > > The setup_subscriber's function comment does not say anything about > this function potentially also dropping publications at the > subscriber. > > ~~~ > Fixed. > 6. > static void > -drop_publication(PGconn *conn, struct LogicalRepInfo *dbinfo) > +drop_publication(PGconn *conn, struct LogicalRepInfo *dbinfo, > + bool cleanup_existing_publications) > > 6a. > This arrangement seems complicated to me. > > IMO it would be more natural to have 2 separate functions, and just > call the appropriate one. > drop_publication() > drop_all_publications() > > instead of trying to make this function do everything. > > ~ > > 6b. > Furthermore, can't you just have a common helper function to DROP a > single PUBLICATION by name? > > Then the code that drops all publications can just loop to call this > common dropper for each iteration. Code should be much simpler. I > don't see the efficiency of this operation is really a factor, > pg_createsubscriber is rarely used, so IMO a better goal is code > simplicity/maintenance. > > e.g. drop_publication() --> _drop_one_publication() > e.g drop_all_publications() --> LOOP (pub list) { _drop_one_publication() } > > ====== Fixed. > .../t/040_pg_createsubscriber.pl > > 7. > + > +# Create publications to test it's removal > +$node_p->safe_psql($db1, "CREATE PUBLICATION test_pub FOR ALL TABLES;"); > +$node_p->safe_psql($db1, "CREATE PUBLICATION test_pub2 FOR ALL TABLES;"); > + > > /it's/their/ > > ~~~ > Fixed. > 8. > Maybe also do a CREATE PUBLICATION at node_s, prior to the > pg_createsubvscript, so then you can verify that the user-created one > is unaffected by the cleanup of all the others. > > ====== Since $node_s is a streaming standby, it does not allow object creation. As a result, publications cannot be created on $node_s. > [1] https://www.postgresql.org/message-id/CAExHW5t4ew7ZrgcDdTv7YmuG7LVQT1ZaEny_EvtngHtEBNyjcQ%40mail.gmail.com > The attached patch contains the suggested changes. Thanks and regards, Shubham Khanna.
Attachment
pgsql-hackers by date: