Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. - Mailing list pgsql-hackers
From | Shubham Khanna |
---|---|
Subject | Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. |
Date | |
Msg-id | CAHv8Rj+zmkwunNubo+_Gp26S_qXbD=p+rMfEnPnjiEE1A6GTXQ@mail.gmail.com Whole thread Raw |
In response to | Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. (Peter Smith <smithpb2250@gmail.com>) |
Responses |
RE: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.
|
List | pgsql-hackers |
On Fri, Feb 21, 2025 at 8:31 AM Peter Smith <smithpb2250@gmail.com> wrote: > > On Thu, Feb 20, 2025 at 5:44 PM Shubham Khanna > <khannashubham1197@gmail.com> wrote: > > > > On Mon, Feb 17, 2025 at 9:49 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > ... > > > ====== > > > 1 GENERAL. Option Name? > > > > > > Wondering why the patch is introducing more terminology like > > > "cleanup"; if we are dropping publications then why not say "drop"? > > > Also I am not sure if "existing" means anything because you cannot > > > cleanup/drop something that is not "existing". > > > > > > IOW, why not call this the "--drop-publications" option? > > > > > > > I have retained the option name as '--cleanup-existing-publications' > > for now. However, I understand the concern regarding terminology and > > will revise it in the next patch version once there is a consensus on > > the final name. > > > > BTW, Is it even correct to assume that the user has to choose between > (a) clean everything or (b) clean nothing? That is why I felt > something that mentions only ‘publication’ gives more flexibility. > Later, when some DROP <other_stuff> feature might be added then we can > make additional --drop-XXX switches, or a --drop-all switch for > cleaning everything > > ////// > > Some more review comments for patch v8-0001 > > ====== > Commit message > > 1. > To prevent accidental removal of user-created publications on the subscriber, > this cleanup process only targets publications that existed on the source > server and were replicated to the subscriber. > Any manually created publications on the subscriber will remain intact. > > ~ > > Is this scenario (manually created publications on the subscriber) > possible to do? > If yes, then there should be a test case for it > If not, then maybe this whole paragraph needs rewording. > > ~~~ Fixed. > > 2. > This feature is optional and only takes effect when the > '--cleanup-existing-publications' option is explicitly specified. > Since publication cleanup is not required when upgrading streaming replication > clusters, this option provides users with the flexibility to enable or skip the > cleanup process as needed. > > ~ > > AFAICT, this is pretty much just saying that the switch is optional > and that the feature only does something when the switch is specified > by the user. But, doesn't all that just go without saying? > > Fixed. > ====== > src/bin/pg_basebackup/pg_createsubscriber.c > > 3. > + /* Drop all existing publications if requested */ > + if (drop_publications) > + { > + pg_log_info("Dropping all existing publications in database \"%s\"", > + dbinfo[i].dbname); > + drop_all_publications(conn, dbinfo[i].dbname); > + } > + > /* > * Since the publication was created before the consistent LSN, it is > * available on the subscriber when the physical replica is promoted. > * Remove publications from the subscriber because it has no use. > + * Additionally, drop publications existed before this command if > + * requested. > */ > drop_publication(conn, &dbinfo[i]); > ~ > > 3a. > The logic of this code seems strange - e.g. it is optionally dropping > all publications, and then dropping yet another one again. Won't that > be already dropped as part of the drop_all? > > ~ > Fixed. > 3b. > Anyway, perhaps the logic should be encapsulated in a new helper > check_and_drop_existing_publications() function to match the existing > 'check_and_drop_existing_subscriptions() > > ~ > Added a new function 'check_and_drop_exixsting_publications()' > 3c. > I felt logs like "Dropping all existing publications in database > \"%s\"" should be logged within the function drop_all_publications, > not at the caller. > > ~~~ Fixed. > > _drop_one_publication: > > 4. > -/* > - * Remove publication if it couldn't finish all steps. > - */ > +/* Helper function to drop a single publication by name. */ > static void > -drop_publication(PGconn *conn, struct LogicalRepInfo *dbinfo) > +_drop_one_publication(PGconn *conn, const char *pubname, const char *dbname) > > A better name for this helper might be drop_publication_by_name() > > ~~~ Removed this function. > > 5. > + appendPQExpBuffer(query, "DROP PUBLICATION %s", pubname); > + pg_log_debug("Executing: %s", query->data); > + pg_log_info("Dropping publication %s in database \"%s\"", pubname, > dbinfo->dbname); > > That debug seems more typically written as: > pg_log_debug("command is: %s", query->data); > > ~~~ Fixed. > > drop_all_publications: > > 6. > +/* Drop all publications in the database. */ > +static void > +drop_all_publications(PGconn *conn, const char *dbname) > +{ > + PGresult *res; > + int count = 0; > > The 'count' variable seems unused. > > ~~~ Removed the usage of 'count'. > > 7. > + for (int i = 0; i < PQntuples(res); i++) > + { > + char *pubname_esc = PQescapeIdentifier(conn, PQgetvalue(res, i, 0), > + strlen(PQgetvalue(res, i, 0))); > + > + _drop_one_publication(conn, pubname_esc, dbname); > + PQfreemem(pubname_esc); > + count++; > + } > > Instead of escaping the name here, and also escaping it in > drop_publication(), that could be done inside the common helper > function. > Fixed. > ====== > .../t/040_pg_createsubscriber.pl > > 8. > +# Create publications to test their removal > +$node_p->safe_psql($db1, "CREATE PUBLICATION test_pub FOR ALL TABLES;"); > +$node_p->safe_psql($db1, "CREATE PUBLICATION test_pub2 FOR ALL TABLES;"); > > 8a. > If you are going to number the publications then it would be better to > number all of them so there is a consistent naming. > > e.g. test_pub1,test_pub2; instead of test_pub,test_pub2 > > ~ Fixed. > > 8b. > Should share the same safe_psql for all DDL when it is possible. > > ~ Fixed. > > 8c. > Maybe the comment should give a bit more explanation of what is going > on here. e.g. this is preparation for the following > --cleanup-existing-publications test. > > Something that conveys the following (choose your own wording): > > Create some user-defined publications. After waiting for the streaming > replication to replicate these to the standby, we can use the > pg_createsubscriber to check that the --cleanup-existing-publications > switch causes them to be removed. > > ~~~ Fixed. > > 9. > I wonder if you need 2 test cases > i) use --cleanup-existing-publications and verify publication are removed > ii) DON'T use --cleanup-existing-publications and verify publication still exist Added a new test case where the option is not being used and the verification of the user created publications is done. I have created two new nodes for this test case. The attached patch contains the suggested changes. Thanks and regards, Shubham Khanna.
Attachment
pgsql-hackers by date: