Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.
Date
Msg-id CAHut+PvwoLheTMkUgu_syf6cUR+-E62ChOZw9Aohq0i5soeaBw@mail.gmail.com
Whole thread Raw
In response to Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.  (Shubham Khanna <khannashubham1197@gmail.com>)
List pgsql-hackers
On Thu, Feb 20, 2025 at 5:44 PM Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
> On Mon, Feb 17, 2025 at 9:49 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
...
> > ======
> > 1 GENERAL. Option Name?
> >
> > Wondering why the patch is introducing more terminology like
> > "cleanup"; if we are dropping publications then why not say "drop"?
> > Also I am not sure if "existing" means anything because you cannot
> > cleanup/drop something that is not "existing".
> >
> > IOW, why not call this the "--drop-publications" option?
> >
>
> I have retained the option name as '--cleanup-existing-publications'
> for now. However, I understand the concern regarding terminology and
> will revise it in the next patch version once there is a consensus on
> the final name.
>

BTW, Is it even correct to assume that the user has to choose between
(a) clean everything or (b) clean nothing? That is why I felt
something that mentions only ‘publication’ gives more flexibility.
Later, when some DROP <other_stuff> feature might be added then we can
make additional --drop-XXX switches, or a --drop-all switch for
cleaning everything

//////

Some more review comments for patch v8-0001

======
Commit message

1.
To prevent accidental removal of user-created publications on the subscriber,
this cleanup process only targets publications that existed on the source
server and were replicated to the subscriber.
Any manually created publications on the subscriber will remain intact.

~

Is this scenario (manually created publications on the subscriber)
possible to do?
If yes, then there should be a test case for it
If not, then maybe this whole paragraph needs rewording.

~~~

2.
This feature is optional and only takes effect when the
'--cleanup-existing-publications' option is explicitly specified.
Since publication cleanup is not required when upgrading streaming replication
clusters, this option provides users with the flexibility to enable or skip the
cleanup process as needed.

~

AFAICT, this is pretty much just saying that the switch is optional
and that the feature only does something when the switch is specified
by the user. But, doesn't all that just go without saying?


======
src/bin/pg_basebackup/pg_createsubscriber.c

3.
+ /* Drop all existing publications if requested */
+ if (drop_publications)
+ {
+ pg_log_info("Dropping all existing publications in database \"%s\"",
+ dbinfo[i].dbname);
+ drop_all_publications(conn, dbinfo[i].dbname);
+ }
+
  /*
  * Since the publication was created before the consistent LSN, it is
  * available on the subscriber when the physical replica is promoted.
  * Remove publications from the subscriber because it has no use.
+ * Additionally, drop publications existed before this command if
+ * requested.
  */
  drop_publication(conn, &dbinfo[i]);
~

3a.
The logic of this code seems strange - e.g. it is optionally dropping
all publications, and then dropping yet another one again. Won't that
be already dropped as part of the drop_all?

~

3b.
Anyway, perhaps the logic should be encapsulated in a new helper
check_and_drop_existing_publications() function to match the existing
'check_and_drop_existing_subscriptions()

~

3c.
I felt logs like "Dropping all existing publications in database
\"%s\"" should be logged within the function drop_all_publications,
not at the caller.

~~~

_drop_one_publication:

4.
-/*
- * Remove publication if it couldn't finish all steps.
- */
+/* Helper function to drop a single publication by name. */
 static void
-drop_publication(PGconn *conn, struct LogicalRepInfo *dbinfo)
+_drop_one_publication(PGconn *conn, const char *pubname, const char *dbname)

A better name for this helper might be drop_publication_by_name()

~~~

5.
+ appendPQExpBuffer(query, "DROP PUBLICATION %s", pubname);
+ pg_log_debug("Executing: %s", query->data);
+ pg_log_info("Dropping publication %s in database \"%s\"", pubname,
dbinfo->dbname);

That debug seems more typically written as:
pg_log_debug("command is: %s", query->data);

~~~

drop_all_publications:

6.
+/* Drop all publications in the database. */
+static void
+drop_all_publications(PGconn *conn, const char *dbname)
+{
+ PGresult   *res;
+ int count = 0;

The 'count' variable seems unused.

~~~

7.
+ for (int i = 0; i < PQntuples(res); i++)
+ {
+ char    *pubname_esc = PQescapeIdentifier(conn, PQgetvalue(res, i, 0),
+ strlen(PQgetvalue(res, i, 0)));
+
+ _drop_one_publication(conn, pubname_esc, dbname);
+ PQfreemem(pubname_esc);
+ count++;
+ }

Instead of escaping the name here, and also escaping it in
drop_publication(), that could be done inside the common helper
function.

======
.../t/040_pg_createsubscriber.pl

8.
+# Create publications to test their removal
+$node_p->safe_psql($db1, "CREATE PUBLICATION test_pub FOR ALL TABLES;");
+$node_p->safe_psql($db1, "CREATE PUBLICATION test_pub2 FOR ALL TABLES;");

8a.
If you are going to number the publications then it would be better to
number all of them so there is a consistent naming.

e.g. test_pub1,test_pub2; instead of test_pub,test_pub2

~

8b.
Should share the same safe_psql for all DDL when it is possible.

~

8c.
Maybe the comment should give a bit more explanation of what is going
on here. e.g. this is preparation for the following
--cleanup-existing-publications test.

Something that conveys the following (choose your own wording):

Create some user-defined publications. After waiting for the streaming
replication to replicate these to the standby, we can use the
pg_createsubscriber to check that the --cleanup-existing-publications
switch causes them to be removed.

~~~

9.
I wonder if you need 2 test cases
i)  use --cleanup-existing-publications and verify publication are removed
ii) DON'T use --cleanup-existing-publications and verify publication still exist

======
Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Sutou Kouhei
Date:
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Next
From: Amit Kapila
Date:
Subject: Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding