Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided. - Mailing list pgsql-hackers

From Shubham Khanna
Subject Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.
Date
Msg-id CAHv8RjKxYJ7y6v8rFFFpfVJeQ4j0ZrFk1PCXM8mb=+9Rdm2hVA@mail.gmail.com
Whole thread Raw
In response to Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Thu, Feb 6, 2025 at 9:39 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shubham.
>
> Here are some review comments for v3-0001.
>
> FYI, I skipped the review of the test code because I saw Kuroda-san
> had already posted some comments about the tests
>
> ======
> Commit message
>
> 1.
> This patch enhances the 'pg_createsubscriber' utility to automatically
> fetch all non-template databases from the publisher and create subscriptions
> for them. This simplifies converting a physical standby to a logical subscriber
> for multiple databases, particularly during upgrades.
>
> ~
>
> AFAIK you are not creating the subscription at the publisher database,
> which is what this description sounds like it is saying. I think you
> are creating the subscriptions on the target server on the *same name*
> databases that you found on the source server (???). Anyway, this
> needs clarification. Also, in some of the comments below, I had the
> same confusion.
>
> ~~~
>
> 2.
> A new function 'fetch_all_databases' in 'pg_createsubscriber.c'
> that queries the publisher for all databases and adds them to the subscription
> options is added.
> Additionally, 'validate_databases' ensures that conflicting options are not
> used together, enforcing correct usage of '--all-databases'.
>
> ~
>
> IMO it is better here to describe the overall logic/algorithms rather
> than saying "This new function does this, and this other new function
> does that". (otherwise, your commit message will quickly become stale
> and need changing all the time).
>
> ~~~
>
> 3.
> The '--all-databases' option fetches all databases from the publisher.
> It auto-generates publication and replication slot names as 'pub_names' and
> 'slot_names' respectively.
>
> ~
>
> 'pub_names'?
> 'slot_names'?
>
> What are those strings? Those are not the names that you wrote in the document.
>
> ~~~
>
> 4.
> This option validates database name lengths to ensure generated names fit
> within system limits.
>
> ~
>
> No. Maybe the patch code ensures this (but I didn't find this
> validation anywhere), but certainly "This option" doesn't.
>
> ======
> doc/src/sgml/ref/pg_createsubscriber.sgml
>
> 5.
> The synopsis has no mention of --all-databases.
>
> I'm not sure of the best way to do it -- if it becomes too messy to
> combine everything then maybe having several is the way to go:
>
> e.g.
>
> pg_createsubscriber [option...] { -a | --all-databases } { -D |
> --pgdata }datadir { -P | --publisher-server }connstr
> pg_createsubscriber [option...] { -d | --database }dbname { -D |
> --pgdata }datadir { -P | --publisher-server }connstr
>
> ~~~
>
> 6.
> +      <para>
> +       Automatically fetch all non-template databases from the publisher and
> +       create subscriptions for them.
> +       When using <option>--all-databases</option>, publication and
> +       replication slot names will be automatically generated in the format
> +       <literal><replaceable>database</replaceable>_pub</literal> and
> +       <literal><replaceable>database</replaceable>_slot</literal>
> respectively.
> +       When using <option>-a</option>, ensure database names are shorter
> +       than 59 characters to allow for generated publication and slot names.
> +       replica.
> +      </para>
>
> 6a.
> "fetch all non-template databases from the publisher and create
> subscriptions for them" -- Hmm. It's too confusing. I guess you meant
> that you are enumerating all the databases on the source server but
> then creating subscriptions on all the databases at the target server
> that have the same name (???) Anyway, the current text is not clear. I
> also think it is better to use the "target server" and "source server"
> (aka publication-server) terminology.
>
> ~
>
> 6b.
> I don't think you need to say "When using
> <option>--all-databases</option>," because this entire description for
> '--all-databases'
>
> ~
>
> 6c.
> Talks about publication and slot names but no mention of subscription names?
>
> ~
>
> 6d.
> That limitation about the 59 char limit seems a strange thing to have
> documented here. I wonder why you need to say anything at all. It
> might be better if the code just validates all this up-front before
> processing and can raise an ERROR in the very rare times this might
> happen.
>
> ~
>
> 6e.
> Spurious text junk? "replica"
>
> ~
>
> 6f.
> Don't you need to also say something like "Cannot be used with
> '--publication', '--subscription', or '--replication-slot'."
>
> ~~~
>
> 7. --database
>
> -       switches.
> +       switches. If <option>-d</option> and <option>-a</option> are specified
> +       together, <application>pg_createsubscriber</application> will return
> +       an error.
>
> A nicer way to say that would be to spell it out fully. e.g. "Cannot
> be used with --all-databases'.".
>
> ~~~
>
> 8.
> For --publication don't you need to mention "Cannot be used with
> --all-databases'.".
>
> ~~~
>
> 9.
> For --subscription don't you need to mention "Cannot be used with
> --all-databases'.".
>
> ~~~
>
> 10.
> For --replication-slot don't you need to mention "Cannot be used with
> --all-databases'.".
>
>
> ======
> src/bin/pg_basebackup/pg_createsubscriber.c
>
> usage:
>
> 11.
>   printf(_("\nOptions:\n"));
> + printf(_("  -a, --all-databases             fetch and specify all
> databases\n"));
>
> This doesn't really seem a helpful usage text. Should it say something
> more like "create subscriptions for all databases"... (??) **
>
> ** I have difficulty imagining what is the expected behaviour in cases
> where there might be different databases at the source server and
> target server. I think your documentation needs to have some special
> notes about this to make it clear what server you are enumerating
> these databases on, and then what happens if there is some mismatch of
> what databases are available on the *other* server.
>
> ~~~
>
> fetch_all_databases:
>
> 12.
> +/* Placeholder function to fetch all databases from the publisher */
> +static void
> +fetch_all_databases(struct CreateSubscriberOptions *opt)
> +{
>
> 12a.
> What is a "placeholder function"?
>
> ~
>
> 12b.
> I would be nicers to use the same terminology that seems more common
> in the documentation -- e.g. "source server" instead of "publisher".
> Also, the function name should make it clear if you are getting these
> data from the source server or target server.
>
> ~~~
>
> 13.
> + /* Process the query result */
> + num_rows = PQntuples(res);
> + for (int i = 0; i < num_rows; i++)
> + {
> + const char *dbname = PQgetvalue(res, i, 0);
> +
> + simple_string_list_append(&opt->database_names, dbname);
> + num_dbs++;
> + }
> The increment num_dbs++ is a bit sneaky. AFAICT you are effectively
> pretending --all-databases is the equivalent of a whole lot of
> --database so the subsequent logic won't know the difference. I think
> that logic deserves a code comment.
>
> ~~~
>
> 14.
> + /* Check if any databases were added */
> + if (opt->database_names.head == NULL)
> + {
> + pg_log_error("no database names could be fetched or specified");
> + pg_log_error_hint("Ensure that databases exist on the publisher or
> specify a database using --database option.");
> + PQclear(res);
> + PQfinish(conn);
> + exit(1);
> + }
> +
> + PQclear(res);
> + PQfinish(conn);
> +}
>
>
> 14a.
> The "were added" seems odd. I think you mean. Just "Error if no
> databases were found".
>
> ~
>
> 14b.
> Isn't it simpler just to check if num_dbs == 0?
>
> ~
>
> 14c.
> "no database names could be fetched or specified" seems like a weird
> error text. e.g. "specified"?
>
> ~
>
> 14d.
> The hint seems strange to me. If there are no databases found using
> --all-database then how will "specify a database using --database
> option." help the user get a better result?
>
> ~~~
>
> validate_databases:
>
> 15.
> +static void
> +validate_databases(struct CreateSubscriberOptions *opt)
> +{
> + /* Check for conflicting options */
> + if (opt->all_databases && opt->database_names.head != NULL)
> + {
> + pg_log_error("cannot specify --dbname when using --all-databases");
> + exit(1);
> + }
> +
> + /*
> + * Ensure publication and slot names are not manually specified with
> + * --all-databases
> + */
> + if (opt->all_databases &&
> + (opt->pub_names.head != NULL || opt->replslot_names.head != NULL))
> + {
> + pg_log_error("cannot specify --publication or --replication-slot
> when using --all-databases");
> + exit(1);
> + }
> +
>
> I think all this switch-checking is misplaced. It should be happening
> up-front in the main function.
>
> ~~~
>
> 16.
> + /* Auto-generate publication and slot names for all databases */
> + if (opt->all_databases)
> + {
> + SimpleStringListCell *cell;
> +
> + fetch_all_databases(opt);
> +
> + cell = opt->database_names.head;
> +
> + while (cell != NULL)
> + {
> + char slot_name[NAMEDATALEN];
> +
> + snprintf(slot_name, sizeof(slot_name), "%s_slot", cell->val);
> + simple_string_list_append(&opt->replslot_names, slot_name);
> +
> + snprintf(slot_name, sizeof(slot_name), "%s_pub", cell->val);
> + simple_string_list_append(&opt->pub_names, slot_name);
> +
> + cell = cell->next;
> + }
> + }
>
>
> 16a.
> Why is this code checking 'opt->all_databases'? Isn't it only possible
> to get to this function via --all-databases. You can just Assert this.
>
> ~
>
> 16b.
> Why are you reusing 'slot_name' variable like this?
>
> ~
>
> 16c.
> Was there some ERROR checking missing here to ensure the length of the
> dbname is not going to cause pub/slot to exceed NAMEDATALEN?
>
> ~
>
> 16d.
> In hindsight, most of this function seems unnecessary to me. Probably
> you could've done all this pub/slot name assignment inside the
> fetch_all_databases() at the same time as you were doing
> simple_string_list_append(&opt->database_names, dbname); for each
> database.
>
> ~~~
>
> 17.
> - while ((c = getopt_long(argc, argv, "d:D:np:P:s:t:U:v",
> + while ((c = getopt_long(argc, argv, "ad:D:np:P:s:t:U:v",
>   long_options, &option_index)) != -1)
>
>
> Isn't the code within this loop missing all the early switch
> validation for checking you are not combining incompatible switches?
>
> e.g.
> --all-databases - "Cannot specify --all-databases more than once"
> --all-databases - "Cannot combine with --database, --publication,
> --subscription, --replication-slot"
> --database - "Cannot combine with --all-databases"
> --publication - "Cannot combine with --all-databases"
> --subscription - "Cannot combine with --all-databases"
> --replication-slot - "Cannot combine with --all-databases"
>
> ~~~
>
> 18.
> +
> +
>   /* Number of object names must match number of databases */
>
> Remove spurious whitespace.
>
> ======

Fixed the given comments. The attached patch at [1] contains the
suggested changes.

[1] - https://www.postgresql.org/message-id/CAHv8Rj%2BDOQ9vvkKCsmDeFKyM073yRfs47Tn_qywQOdx15pmOMA%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



pgsql-hackers by date:

Previous
From: Shubham Khanna
Date:
Subject: Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.
Next
From: Tomas Vondra
Date:
Subject: Re: Should we allow ALTER OPERATOR CLASS to ADD/DROP operators and procedures?