Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. - Mailing list pgsql-hackers
From | Shlok Kyal |
---|---|
Subject | Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. |
Date | |
Msg-id | CANhcyEUkQybOe=J5KypWQ80deULBLjjsdyP+DF57+BwSni57Vw@mail.gmail.com Whole thread Raw |
In response to | RE: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
List | pgsql-hackers |
On Wed, 29 Jan 2025 at 15:14, Shubham Khanna <khannashubham1197@gmail.com> wrote: > > On Wed, Jan 29, 2025 at 10:42 AM Hayato Kuroda (Fujitsu) > <kuroda.hayato@fujitsu.com> wrote: > > > > Dear Shubham, > > > > > I propose adding the --clean-publisher-objects option to the > > > pg_createsubscriber utility. As discussed in [1], this feature ensures > > > a clean and streamlined setup of logical replication by removing stale > > > or unnecessary publications from the subscriber node. These > > > publications, replicated during streaming replication, become > > > redundant after converting to logical replication and serve no further > > > purpose. This patch introduces the drop_all_publications() function, > > > which efficiently fetches and drops all publications on the subscriber > > > node within a single transaction. > > > > I think replication slot are also type of 'publisher-objects', but they are not > > removed for now: API-name may not be accurate. And... > > > > > Additionally, other related objects, such as subscriptions and > > > replication slots, may also require cleanup. I plan to analyze this > > > further and include them in subsequent patches. > > > > I'm not sure replication slots should be cleaned up. Apart from other items like > > publication/subscription, replication slots are not replicated when it is created > > on the primary instance. This means they are intentionally created by DBAs and there > > may not be no strong reasons to drop them after the conversion. > > > > Another question is the style of APIs. Do you plan to provide APIs like > > 'cleanup-subscriber-objects' and 'cleanup-publisher-objects', or just one > > 'cleanup-logical-replication-objects'? > > > > Thanks for the suggestions, I will keep them in mind while preparing > the 0002 patch for the same. > Currently, I have changed the API to '--cleanup-publisher-objects'. > > > Regarding the patch: > > > > 1. > > ``` > > + The <application>pg_createsubscriber</application> now supports the > > + <option>--clean-publisher-objects</option> to remove all publications on > > + the subscriber node before creating a new subscription. > > ``` > > > > This description is not suitable for the documentation. Something like: > > > > Remove all publications on the subscriber node. > > > > 2. > > ``` > > + /* Drop publications from the subscriber if requested */ > > + drop_all_publications(dbinfo); > > ``` > > > > This should be called when `opts.clean_publisher_objects` is true. > > > > 3. > > You said publications are dropped within a single transaction, but the > > patch does not do. Which is correct? > > > > 4. > > ``` > > +# Set up node A as primary > > +my $node_a = PostgreSQL::Test::Cluster->new('node_a'); > > +my $aconnstr = $node_a->connstr; > > +$node_a->init(allows_streaming => 'logical'); > > +$node_a->append_conf('postgresql.conf', 'autovacuum = off'); > > +$node_a->start; > > + > > +# Set up node B as standby linking to node A > > +$node_a->backup('backup_3'); > > +my $node_b = PostgreSQL::Test::Cluster->new('node_b'); > > +$node_b->init_from_backup($node_a, 'backup_3', has_streaming => 1); > > +$node_b->append_conf( > > + 'postgresql.conf', qq[ > > + primary_conninfo = '$aconnstr' > > + hot_standby_feedback = on > > + max_logical_replication_workers = 5 > > + ]); > > +$node_b->set_standby_mode(); > > +$node_b->start; > > ``` > > > > Fixed the given comments. The attached patch contains the suggested changes. > Hi Shubham, I have reviewed the v2 patch. Here are my comments: 1. Initial of the comment should in smallcase to make it similar to other comments: + bool cleanup_publisher_objects; /* Drop all publications */ 2. We should provide a comment for the function. +static void +drop_all_publications(const struct LogicalRepInfo *dbinfo) +{ 3. We should declare it as "const char*" + char *search_query = "SELECT pubname FROM pg_catalog.pg_publication;"; + 4. Instead of warning we should throw an error here: + if (PQresultStatus(res) != PGRES_TUPLES_OK) + { + pg_log_warning("could not obtain publication information: %s", + PQresultErrorMessage(res)); + 5. Should we throw an error in this case as well? + if (PQresultStatus(res_for_drop) != PGRES_COMMAND_OK) + { + pg_log_warning("could not drop publication \"%s\": %s", + pubname, PQresultErrorMessage(res)); + } 6. We should start the standby server before creating the publications, so that the publications are replicated to standby. +# Create publications to test it's removal +$node_p->safe_psql($db1, "CREATE PUBLICATION test_pub FOR ALL TABLES;"); +$node_p->safe_psql($db1, "CREATE PUBLICATION test_pub2 FOR ALL TABLES;"); + +# Verify the existing publications +my $pub_count_before = + $node_p->safe_psql($db1, "SELECT COUNT(*) FROM pg_publication;"); Also maybe we should check the publication count on the standby node instead of primary. Thanks and Regards, Shlok Kyal
pgsql-hackers by date: