Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. - Mailing list pgsql-hackers
From | Euler Taveira |
---|---|
Subject | Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. |
Date | |
Msg-id | 6f266ce2-38d4-433f-afc9-b47c48c17509@app.fastmail.com Whole thread Raw |
In response to | Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.
|
List | pgsql-hackers |
On Wed, Mar 12, 2025, at 3:41 AM, Amit Kapila wrote:
This could be another option, but I feel an option with a tool is moremeaningful than allowing users to do afterward steps.
I wasn't paying much attention to this discussion. The thread title refers to a
general option to clean publisher objects which includes non specified objects.
I was expecting a general solution but it seems to include only publications.
Why? I envision in the future adding an option to publish only a set of tables.
Will this proposal remove tables that were not published and its dependent
objects (data types, functions, ...)? When we add the initial schema
synchronization for logical replication, will this proposal be aligned with it?
It is a mistake not to explore a general solution because you risk shooting
yourself in the foot. If we consider that (a) the start point is a standby
(physical copy) and (b) in most scenarios to setup a logical replica you use
pg_dump that dumps the publications by default, it seems these additional
objects will be expected by the user.
I have some questions, fixes and concerns about this patch.
+ <term><option>-c</option></term>
+ <term><option>--drop-all-publications</option></term>
-c has nothing to do with the long option. As Amit suggested in a previous
email, it is perfectly fine to not include a short option here.
static void setup_subscriber(struct LogicalRepInfo *dbinfo,
- const char *consistent_lsn);
+ const char *consistent_lsn,
+ bool drop_all_publications);
Isn't it better to add this extra option into LogicalRepInfo? It avoids
changing this function signature and the following one.
-static void drop_publication(PGconn *conn, struct LogicalRepInfo *dbinfo);
+static void drop_publication_by_name(PGconn *conn, const char *pubname,
+ struct LogicalRepInfo *dbinfo);
By the way, why did you rename the function? It still removes the publication
by name, although, the function name does not contain "_by_name". The proposed
signature also has redundant argument usage.
- drop_publication(conn, &dbinfos.dbinfo[i]);
+ drop_publication_by_name(conn, dbinfos.dbinfo[i].pubname,
+ &dbinfos.dbinfo[i]);
Even with this proposal, the following change is imprecise. It is still
removing publications (plural form). If you still want to include the fact that
pre existing publications will also be removed, say "Remove publications
(including pre existing ones)".
/*
- * Since the publication was created before the consistent LSN, it is
- * available on the subscriber when the physical replica is promoted.
- * Remove publications from the subscriber because it has no use.
+ * Since the publication was created before the consistent LSN, it
+ * remains on the subscriber even after the physical replica is
+ * promoted. Remove this publication from the subscriber because it
+ * has no use. Additionally, if requested, drop all pre-existing
+ * publications.
*/
-drop_publication(PGconn *conn, struct LogicalRepInfo *dbinfo)
+drop_publication_by_name(PGconn *conn, const char *pubname,
+ struct LogicalRepInfo *dbinfo)
Keep the current function name.
- char *pubname_esc;
-
- Assert(conn != NULL);
- pubname_esc = PQescapeIdentifier(conn, dbinfo->pubname, strlen(dbinfo->pubname));
+ char *pubname_esc = PQescapeIdentifier(conn, pubname, strlen(pubname));
Why did you remove the assert?
- PQfreemem(pubname_esc);
-
Why did you reallocate this function call?
+/*
+ * Check and drop existing publications on the subscriber.
+ */
+static void
+check_and_drop_existing_publications(PGconn *conn, struct LogicalRepInfo *dbinfo)
AFAICS there is no "check" in this function. I would go with
drop_all_publications().
-my $node_s = PostgreSQL::Test::Cluster->new('node_s');
-$node_s->init_from_backup($node_p, 'backup_1', has_streaming => 1);
-$node_s->append_conf(
+my $node_s1 = PostgreSQL::Test::Cluster->new('node_s1');
+$node_s1->init_from_backup($node_p, 'backup_1', has_streaming => 1);
+$node_s1->append_conf(
Why did you rename node_s? You just increased the number of hunks someone needs
to review. If you need a new standby, fine. Create another node: node_u.
+$node_s2->init_from_backup($node_p, 'backup_3', has_streaming => 1);
+$node_s2->append_conf(
+ 'postgresql.conf', qq[
+ primary_conninfo = '$pconnstr'
+ max_logical_replication_workers = 5
+ ]);
+$node_s2->set_standby_mode();
+$node_s2->start;
Do you really need to create another node just to confirm the current behavior?
I think you can inspect the logfile after a dry run mode to see if some message
pops up. This addition increases the pg_createsubscriber runtime from
Files=1, Tests=37, 10 wallclock secs ( 0.02 usr 0.00 sys + 1.41 cusr 1.67 csys = 3.10 CPU)
to
Files=1, Tests=43, 16 wallclock secs ( 0.03 usr 0.01 sys + 1.70 cusr 2.10 csys = 3.84 CPU)
That's unacceptable.
I'm still concerned about the fact that we are adding one option that is
specific and have to add one per object as soon as someone has another
proposal. We need to decide if we want multiple options to clean up objects in
the future or a general option that will be incrementally adding objects to
remove. Multiple options are more granular and can avoid breaking backward
compatibility (if you don't want) but can increase the list of options you need
to inform if you want to clean multiple object types. A single option is not
flexible and breaks backward compatibility but it does exactly what you want
with a few characters. In most scenarios, if you want to have a clean
subscriber, you will remove *all* possible objects, hence, a single option is
your choice.
pgsql-hackers by date: