Thread: Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.

On Fri, Jan 24, 2025 at 7:28 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Fri, Jan 24, 2025 at 8:14 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > On Thu, Jan 23, 2025 at 10:33 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > > On Wed, Jan 22, 2025 at 7:29 PM Shubham Khanna
> > > <khannashubham1197@gmail.com> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > I am writing to propose an enhancement to the pg_createsubscriber
> > > > utility that enables it to automatically fetch all non-template
> > > > databases from the publisher when no specific databases are specified
> > > > by the user. This was an open item from [1] that was planned for
> > > > future implementation. The attached patch has the changes for the
> > > > same.
> > >
> > > I think the feature will be useful, but UI might cause some unwanted
> > > results. If a user forgets to specify -d option, the utility will
> > > create subscriptions to all the databases, some of which may or may
> > > not have the publications. I think it's better to provide an option to
> > > specify all databases explicitly (e.g. --all-databases).
> > >
> >
> >
> > +1 better to be safe.
> >
> > Instead of a new switch, how about changing the --database switch to
> > accept a pattern (like pg_dump --schema does [1])
> >
> > Then "all databases" would be specified something like --database = *
> >
>
> WFM but that will be more work than what's in the patch.
>

OK, what if, instead of full pattern matching it could recognise just
one special dbname value of '*' (meaning "all")

So, "all databases" could still be specified as --database = *

The implementation would be almost no more work than the current
patch, while at the same time leaving it open to be extended as a
pattern if needed in the future. Or, is it too hacky?

======
Kind Regards,
Peter Smith.
Fujitsu Australia



On Tue, Jan 28, 2025 at 3:58 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Fri, Jan 24, 2025 at 7:28 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Fri, Jan 24, 2025 at 8:14 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > >
> > > On Thu, Jan 23, 2025 at 10:33 PM Ashutosh Bapat
> > > <ashutosh.bapat.oss@gmail.com> wrote:
> > > >
> > > > On Wed, Jan 22, 2025 at 7:29 PM Shubham Khanna
> > > > <khannashubham1197@gmail.com> wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > I am writing to propose an enhancement to the pg_createsubscriber
> > > > > utility that enables it to automatically fetch all non-template
> > > > > databases from the publisher when no specific databases are specified
> > > > > by the user. This was an open item from [1] that was planned for
> > > > > future implementation. The attached patch has the changes for the
> > > > > same.
> > > >
> > > > I think the feature will be useful, but UI might cause some unwanted
> > > > results. If a user forgets to specify -d option, the utility will
> > > > create subscriptions to all the databases, some of which may or may
> > > > not have the publications. I think it's better to provide an option to
> > > > specify all databases explicitly (e.g. --all-databases).
> > > >
> > >
> > >
> > > +1 better to be safe.
> > >
> > > Instead of a new switch, how about changing the --database switch to
> > > accept a pattern (like pg_dump --schema does [1])
> > >
> > > Then "all databases" would be specified something like --database = *
> > >
> >
> > WFM but that will be more work than what's in the patch.
> >
>
> OK, what if, instead of full pattern matching it could recognise just
> one special dbname value of '*' (meaning "all")
>
> So, "all databases" could still be specified as --database = *
>
> The implementation would be almost no more work than the current
> patch, while at the same time leaving it open to be extended as a
> pattern if needed in the future. Or, is it too hacky?

I don't remember any precedence here. pg_dump has pg_dumpall which
dumps all the databases, so they chose to create a separate binary. If
we go that route, I think we will be able to produce a more flexible
utility, like replication slot names per database, or per database
subscription settings etc. Maybe we should consider that option.

If we want to stick to --database= supporting a pattern looks better
than just a single special pattern *.

--
Best Wishes,
Ashutosh Bapat



On Tue, Jan 28, 2025 at 12:01 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> If we want to stick to --database= supporting a pattern looks better
> than just a single special pattern *.
>

This sounds reasonable to me as well. Note that the interaction of
other parameters like --replication-slot is not yet discussed. I think
if the number of slots given matches with the number of databases
fetched based on pattern matches then we can use them otherwise,
return the ERROR. The other option could be that we don't allow
options like --replication-slot along with pattern matching option.

--
With Regards,
Amit Kapila.



On Wed, Jan 29, 2025 at 4:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jan 28, 2025 at 12:01 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > If we want to stick to --database= supporting a pattern looks better
> > than just a single special pattern *.
> >
>
> This sounds reasonable to me as well. Note that the interaction of
> other parameters like --replication-slot is not yet discussed. I think
> if the number of slots given matches with the number of databases
> fetched based on pattern matches then we can use them otherwise,
> return the ERROR. The other option could be that we don't allow
> options like --replication-slot along with pattern matching option.
>

I have had second thoughts about my pattern idea. Now, I favour just
adding another --all-databases switch like Ashutosh had suggested [1]
in the first place.

I had overlooked the rules saying that the user is allowed to specify
*multiple* --publication or --replication-slot or --subscription name
switches, but when doing so they have to match the same number of
--database switches. Using a --dbname=pattern would be fraught with
complications. e.g. How can we know up-front how many databases the
dbname pattern will resolve to, and even in what order they get
resolved?

In hindsight, it would be much easier to just have one extra switch,
so the rules then are simple:

--all-databases  (here you CANNOT specify multiple
publication/replication-slot/subscription switches)
-database=dbname (here you CAN specify multiple
publication/replication-slot/subscription switches using the same
order and the same number as databases)

Also, --all-databases and --database switches cannot be specified at
the same time.

======
[1] https://www.postgresql.org/message-id/CAExHW5sQGie7bvS-q7YUYDM2BqYZ%3D%2BxqeqFUS%3DcZGjK_9pnVzQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Hi Shubham,

On Tue, Feb 4, 2025 at 2:10 PM Shubham Khanna
<khannashubham1197@gmail.com> wrote:
> > >
> >
> > It could be a bit tricky to find that for users but they can devise a
> > query to get the names and numbers of databases matching the given
> > pattern.  OTOH, I am not sure there is a clear need at this stage for
> > pattern matching for this tool. So, we can go with a simple switch as
> > you are proposing at this stage.
> >
>
> After reconsidering the idea of supporting '--all-databases' switch is
> the better approach at this stage, I have added the new switch in the
> latest patch.
> The attached patch contains the suggested changes.

+ If neither <option>-d</option> nor <option>-a</option> is
+       specified, <application>pg_createsubscriber</application> will use
+       <option>--all-databases</option> by default.

As pointed upthread by me and Peter, using --all-databases by default
is not a good behaviour.

But the code doesn't behave like --all-databases by default. Looks
like we need to fix the documentation.

+ /* Generate publication and slot names if not specified */
+ SimpleStringListCell *cell;
+
+ fetch_all_databases(opt);
+
+ cell = opt->database_names.head;

We don't seem to check existence of publication and slot name
specification as the comment indicates. Do we need to check that those
names are not specified at all? and also mention in the documentation
that those specifications are required when using -a/--all-databases
option?


--
Best Wishes,
Ashutosh Bapat



Dear Shubham,

Thanks for working on it. I have some comments for the patch.

01. fetch_all_databases
```
+/* Placeholder function to fetch all databases from the publisher */
+static void
+fetch_all_databases(struct CreateSubscriberOptions *opt)
```

Please update the comment atop function.

02. fetch_all_databases
```
+    /* Establish a connection to the PostgreSQL server */
+    conn = PQconnectdb(opt->pub_conninfo_str);
+    /* Check for connection errors */
+    if (PQstatus(conn) != CONNECTION_OK)
+    {
+        pg_log_error("connection to the PostgreSQL server failed: %s", PQerrorMessage(conn));
+        PQfinish(conn);
+        exit(1);
+    }
```

You can use connect_database() instead of directly calling PQconnectdb().

03. fetch_all_databases
```
+    const char *query = "SELECT datname FROM pg_database WHERE datistemplate = false";
```

We should verify the attribute datallowconn, which indicates we can connect to the
database. If it is false, no one would be able to connect to the db and define anything.

04. fetch_all_databases
```
+    /* Check if any databases were added */
+    if (opt->database_names.head == NULL)
+    {
+        pg_log_error("no database names could be fetched or specified");
+        pg_log_error_hint("Ensure that databases exist on the publisher or specify a database using --database
option.");
+        PQclear(res);
+        PQfinish(conn);
+        exit(1);
+    }
```

It is enough to check num_rows > 0.

05. fetch_all_databases
```
+    PQfinish(conn);
```

Just in case: we can use disconnect_database().

06. validate_databases
```
+    /* Check for conflicting options */
+    if (opt->all_databases && opt->database_names.head != NULL)
+    {
+        pg_log_error("cannot specify --dbname when using --all-databases");
+        exit(1);
+    }
+
+    /*
+     * Ensure publication and slot names are not manually specified with
+     * --all-databases
+     */
+    if (opt->all_databases &&
+        (opt->pub_names.head != NULL || opt->replslot_names.head != NULL))
+    {
+        pg_log_error("cannot specify --publication or --replication-slot when using --all-databases");
+        exit(1);
+    }
```

I think validations should be done in main() like other paramters.

07. validate_databases
```
+    if (opt->all_databases)
+    {
```

This function is called when all_databases is true, so this is not needed.

08. validate_databases
```
+        cell = opt->database_names.head;
+
+        while (cell != NULL)
+        {
+            char        slot_name[NAMEDATALEN];
+
+            snprintf(slot_name, sizeof(slot_name), "%s_slot", cell->val);
+            simple_string_list_append(&opt->replslot_names, slot_name);
+
+            snprintf(slot_name, sizeof(slot_name), "%s_pub", cell->val);
+            simple_string_list_append(&opt->pub_names, slot_name);
+
+            cell = cell->next;
+        }
```

I'm not sure the part. It seems to me that you tried to set slotname to ${dbname}_slot
and pubname to ${dbname}_pub, but this the current naming rule. Since we can
generate names in setup_publisher(), I think we do not have to do something here.

09.
```
@@ -2117,6 +2233,8 @@ main(int argc, char **argv)
         }
     }
 
+
+
```

Unnesesarry blank.

10. 040_pg_createsubscriber.pl
```
+# Fetch the count of non-template databases on the publisher before
+# running pg_createsubscriber without '--all-databases' option
+my $db_count_before =
+  $node_a->safe_psql('postgres',
+    "SELECT count(*) FROM pg_database WHERE datistemplate = false;");
+is($db_count_before, '1', 'database count without --all-databases option');
```

This test is not actually realted with our codes, no need to do.

11. 040_pg_createsubscriber.pl
```
+# Ensure there are some user databases on the publisher
+$node_a->safe_psql('postgres', 'CREATE DATABASE db3');
+$node_a->safe_psql('postgres', 'CREATE DATABASE db4');
+
+$node_b->stop;
```

You must wait until all changes have been replicated to the standby here.

12. 040_pg_createsubscriber.pl

Can we do the similar tests without adding new instances? E.g., runs with
dry-run mode and confirms all databases become target.

Best regards,
Hayato Kuroda
FUJITSU LIMITED


On Thu, Feb 6, 2025 at 7:37 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Shubham,
>
> Thanks for working on it. I have some comments for the patch.
>
> 01. fetch_all_databases
> ```
> +/* Placeholder function to fetch all databases from the publisher */
> +static void
> +fetch_all_databases(struct CreateSubscriberOptions *opt)
> ```
>
> Please update the comment atop function.
>
> 02. fetch_all_databases
> ```
> +       /* Establish a connection to the PostgreSQL server */
> +       conn = PQconnectdb(opt->pub_conninfo_str);
> +       /* Check for connection errors */
> +       if (PQstatus(conn) != CONNECTION_OK)
> +       {
> +               pg_log_error("connection to the PostgreSQL server failed: %s", PQerrorMessage(conn));
> +               PQfinish(conn);
> +               exit(1);
> +       }
> ```
>
> You can use connect_database() instead of directly calling PQconnectdb().
>
> 03. fetch_all_databases
> ```
> +       const char *query = "SELECT datname FROM pg_database WHERE datistemplate = false";
> ```
>
> We should verify the attribute datallowconn, which indicates we can connect to the
> database. If it is false, no one would be able to connect to the db and define anything.
>
> 04. fetch_all_databases
> ```
> +       /* Check if any databases were added */
> +       if (opt->database_names.head == NULL)
> +       {
> +               pg_log_error("no database names could be fetched or specified");
> +               pg_log_error_hint("Ensure that databases exist on the publisher or specify a database using
--databaseoption."); 
> +               PQclear(res);
> +               PQfinish(conn);
> +               exit(1);
> +       }
> ```
>
> It is enough to check num_rows > 0.
>
> 05. fetch_all_databases
> ```
> +       PQfinish(conn);
> ```
>
> Just in case: we can use disconnect_database().
>
> 06. validate_databases
> ```
> +       /* Check for conflicting options */
> +       if (opt->all_databases && opt->database_names.head != NULL)
> +       {
> +               pg_log_error("cannot specify --dbname when using --all-databases");
> +               exit(1);
> +       }
> +
> +       /*
> +        * Ensure publication and slot names are not manually specified with
> +        * --all-databases
> +        */
> +       if (opt->all_databases &&
> +               (opt->pub_names.head != NULL || opt->replslot_names.head != NULL))
> +       {
> +               pg_log_error("cannot specify --publication or --replication-slot when using --all-databases");
> +               exit(1);
> +       }
> ```
>
> I think validations should be done in main() like other paramters.
>
> 07. validate_databases
> ```
> +       if (opt->all_databases)
> +       {
> ```
>
> This function is called when all_databases is true, so this is not needed.
>
> 08. validate_databases
> ```
> +               cell = opt->database_names.head;
> +
> +               while (cell != NULL)
> +               {
> +                       char            slot_name[NAMEDATALEN];
> +
> +                       snprintf(slot_name, sizeof(slot_name), "%s_slot", cell->val);
> +                       simple_string_list_append(&opt->replslot_names, slot_name);
> +
> +                       snprintf(slot_name, sizeof(slot_name), "%s_pub", cell->val);
> +                       simple_string_list_append(&opt->pub_names, slot_name);
> +
> +                       cell = cell->next;
> +               }
> ```
>
> I'm not sure the part. It seems to me that you tried to set slotname to ${dbname}_slot
> and pubname to ${dbname}_pub, but this the current naming rule. Since we can
> generate names in setup_publisher(), I think we do not have to do something here.
>
> 09.
> ```
> @@ -2117,6 +2233,8 @@ main(int argc, char **argv)
>                 }
>         }
>
> +
> +
> ```
>
> Unnesesarry blank.
>
> 10. 040_pg_createsubscriber.pl
> ```
> +# Fetch the count of non-template databases on the publisher before
> +# running pg_createsubscriber without '--all-databases' option
> +my $db_count_before =
> +  $node_a->safe_psql('postgres',
> +       "SELECT count(*) FROM pg_database WHERE datistemplate = false;");
> +is($db_count_before, '1', 'database count without --all-databases option');
> ```
>
> This test is not actually realted with our codes, no need to do.
>
> 11. 040_pg_createsubscriber.pl
> ```
> +# Ensure there are some user databases on the publisher
> +$node_a->safe_psql('postgres', 'CREATE DATABASE db3');
> +$node_a->safe_psql('postgres', 'CREATE DATABASE db4');
> +
> +$node_b->stop;
> ```
>
> You must wait until all changes have been replicated to the standby here.
>
> 12. 040_pg_createsubscriber.pl
>
> Can we do the similar tests without adding new instances? E.g., runs with
> dry-run mode and confirms all databases become target.
>

Fixed the given comments. The attached patch at [1] contains the
suggested changes.

[1] - https://www.postgresql.org/message-id/CAHv8Rj%2BDOQ9vvkKCsmDeFKyM073yRfs47Tn_qywQOdx15pmOMA%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



On Thu, Feb 6, 2025 at 9:39 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shubham.
>
> Here are some review comments for v3-0001.
>
> FYI, I skipped the review of the test code because I saw Kuroda-san
> had already posted some comments about the tests
>
> ======
> Commit message
>
> 1.
> This patch enhances the 'pg_createsubscriber' utility to automatically
> fetch all non-template databases from the publisher and create subscriptions
> for them. This simplifies converting a physical standby to a logical subscriber
> for multiple databases, particularly during upgrades.
>
> ~
>
> AFAIK you are not creating the subscription at the publisher database,
> which is what this description sounds like it is saying. I think you
> are creating the subscriptions on the target server on the *same name*
> databases that you found on the source server (???). Anyway, this
> needs clarification. Also, in some of the comments below, I had the
> same confusion.
>
> ~~~
>
> 2.
> A new function 'fetch_all_databases' in 'pg_createsubscriber.c'
> that queries the publisher for all databases and adds them to the subscription
> options is added.
> Additionally, 'validate_databases' ensures that conflicting options are not
> used together, enforcing correct usage of '--all-databases'.
>
> ~
>
> IMO it is better here to describe the overall logic/algorithms rather
> than saying "This new function does this, and this other new function
> does that". (otherwise, your commit message will quickly become stale
> and need changing all the time).
>
> ~~~
>
> 3.
> The '--all-databases' option fetches all databases from the publisher.
> It auto-generates publication and replication slot names as 'pub_names' and
> 'slot_names' respectively.
>
> ~
>
> 'pub_names'?
> 'slot_names'?
>
> What are those strings? Those are not the names that you wrote in the document.
>
> ~~~
>
> 4.
> This option validates database name lengths to ensure generated names fit
> within system limits.
>
> ~
>
> No. Maybe the patch code ensures this (but I didn't find this
> validation anywhere), but certainly "This option" doesn't.
>
> ======
> doc/src/sgml/ref/pg_createsubscriber.sgml
>
> 5.
> The synopsis has no mention of --all-databases.
>
> I'm not sure of the best way to do it -- if it becomes too messy to
> combine everything then maybe having several is the way to go:
>
> e.g.
>
> pg_createsubscriber [option...] { -a | --all-databases } { -D |
> --pgdata }datadir { -P | --publisher-server }connstr
> pg_createsubscriber [option...] { -d | --database }dbname { -D |
> --pgdata }datadir { -P | --publisher-server }connstr
>
> ~~~
>
> 6.
> +      <para>
> +       Automatically fetch all non-template databases from the publisher and
> +       create subscriptions for them.
> +       When using <option>--all-databases</option>, publication and
> +       replication slot names will be automatically generated in the format
> +       <literal><replaceable>database</replaceable>_pub</literal> and
> +       <literal><replaceable>database</replaceable>_slot</literal>
> respectively.
> +       When using <option>-a</option>, ensure database names are shorter
> +       than 59 characters to allow for generated publication and slot names.
> +       replica.
> +      </para>
>
> 6a.
> "fetch all non-template databases from the publisher and create
> subscriptions for them" -- Hmm. It's too confusing. I guess you meant
> that you are enumerating all the databases on the source server but
> then creating subscriptions on all the databases at the target server
> that have the same name (???) Anyway, the current text is not clear. I
> also think it is better to use the "target server" and "source server"
> (aka publication-server) terminology.
>
> ~
>
> 6b.
> I don't think you need to say "When using
> <option>--all-databases</option>," because this entire description for
> '--all-databases'
>
> ~
>
> 6c.
> Talks about publication and slot names but no mention of subscription names?
>
> ~
>
> 6d.
> That limitation about the 59 char limit seems a strange thing to have
> documented here. I wonder why you need to say anything at all. It
> might be better if the code just validates all this up-front before
> processing and can raise an ERROR in the very rare times this might
> happen.
>
> ~
>
> 6e.
> Spurious text junk? "replica"
>
> ~
>
> 6f.
> Don't you need to also say something like "Cannot be used with
> '--publication', '--subscription', or '--replication-slot'."
>
> ~~~
>
> 7. --database
>
> -       switches.
> +       switches. If <option>-d</option> and <option>-a</option> are specified
> +       together, <application>pg_createsubscriber</application> will return
> +       an error.
>
> A nicer way to say that would be to spell it out fully. e.g. "Cannot
> be used with --all-databases'.".
>
> ~~~
>
> 8.
> For --publication don't you need to mention "Cannot be used with
> --all-databases'.".
>
> ~~~
>
> 9.
> For --subscription don't you need to mention "Cannot be used with
> --all-databases'.".
>
> ~~~
>
> 10.
> For --replication-slot don't you need to mention "Cannot be used with
> --all-databases'.".
>
>
> ======
> src/bin/pg_basebackup/pg_createsubscriber.c
>
> usage:
>
> 11.
>   printf(_("\nOptions:\n"));
> + printf(_("  -a, --all-databases             fetch and specify all
> databases\n"));
>
> This doesn't really seem a helpful usage text. Should it say something
> more like "create subscriptions for all databases"... (??) **
>
> ** I have difficulty imagining what is the expected behaviour in cases
> where there might be different databases at the source server and
> target server. I think your documentation needs to have some special
> notes about this to make it clear what server you are enumerating
> these databases on, and then what happens if there is some mismatch of
> what databases are available on the *other* server.
>
> ~~~
>
> fetch_all_databases:
>
> 12.
> +/* Placeholder function to fetch all databases from the publisher */
> +static void
> +fetch_all_databases(struct CreateSubscriberOptions *opt)
> +{
>
> 12a.
> What is a "placeholder function"?
>
> ~
>
> 12b.
> I would be nicers to use the same terminology that seems more common
> in the documentation -- e.g. "source server" instead of "publisher".
> Also, the function name should make it clear if you are getting these
> data from the source server or target server.
>
> ~~~
>
> 13.
> + /* Process the query result */
> + num_rows = PQntuples(res);
> + for (int i = 0; i < num_rows; i++)
> + {
> + const char *dbname = PQgetvalue(res, i, 0);
> +
> + simple_string_list_append(&opt->database_names, dbname);
> + num_dbs++;
> + }
> The increment num_dbs++ is a bit sneaky. AFAICT you are effectively
> pretending --all-databases is the equivalent of a whole lot of
> --database so the subsequent logic won't know the difference. I think
> that logic deserves a code comment.
>
> ~~~
>
> 14.
> + /* Check if any databases were added */
> + if (opt->database_names.head == NULL)
> + {
> + pg_log_error("no database names could be fetched or specified");
> + pg_log_error_hint("Ensure that databases exist on the publisher or
> specify a database using --database option.");
> + PQclear(res);
> + PQfinish(conn);
> + exit(1);
> + }
> +
> + PQclear(res);
> + PQfinish(conn);
> +}
>
>
> 14a.
> The "were added" seems odd. I think you mean. Just "Error if no
> databases were found".
>
> ~
>
> 14b.
> Isn't it simpler just to check if num_dbs == 0?
>
> ~
>
> 14c.
> "no database names could be fetched or specified" seems like a weird
> error text. e.g. "specified"?
>
> ~
>
> 14d.
> The hint seems strange to me. If there are no databases found using
> --all-database then how will "specify a database using --database
> option." help the user get a better result?
>
> ~~~
>
> validate_databases:
>
> 15.
> +static void
> +validate_databases(struct CreateSubscriberOptions *opt)
> +{
> + /* Check for conflicting options */
> + if (opt->all_databases && opt->database_names.head != NULL)
> + {
> + pg_log_error("cannot specify --dbname when using --all-databases");
> + exit(1);
> + }
> +
> + /*
> + * Ensure publication and slot names are not manually specified with
> + * --all-databases
> + */
> + if (opt->all_databases &&
> + (opt->pub_names.head != NULL || opt->replslot_names.head != NULL))
> + {
> + pg_log_error("cannot specify --publication or --replication-slot
> when using --all-databases");
> + exit(1);
> + }
> +
>
> I think all this switch-checking is misplaced. It should be happening
> up-front in the main function.
>
> ~~~
>
> 16.
> + /* Auto-generate publication and slot names for all databases */
> + if (opt->all_databases)
> + {
> + SimpleStringListCell *cell;
> +
> + fetch_all_databases(opt);
> +
> + cell = opt->database_names.head;
> +
> + while (cell != NULL)
> + {
> + char slot_name[NAMEDATALEN];
> +
> + snprintf(slot_name, sizeof(slot_name), "%s_slot", cell->val);
> + simple_string_list_append(&opt->replslot_names, slot_name);
> +
> + snprintf(slot_name, sizeof(slot_name), "%s_pub", cell->val);
> + simple_string_list_append(&opt->pub_names, slot_name);
> +
> + cell = cell->next;
> + }
> + }
>
>
> 16a.
> Why is this code checking 'opt->all_databases'? Isn't it only possible
> to get to this function via --all-databases. You can just Assert this.
>
> ~
>
> 16b.
> Why are you reusing 'slot_name' variable like this?
>
> ~
>
> 16c.
> Was there some ERROR checking missing here to ensure the length of the
> dbname is not going to cause pub/slot to exceed NAMEDATALEN?
>
> ~
>
> 16d.
> In hindsight, most of this function seems unnecessary to me. Probably
> you could've done all this pub/slot name assignment inside the
> fetch_all_databases() at the same time as you were doing
> simple_string_list_append(&opt->database_names, dbname); for each
> database.
>
> ~~~
>
> 17.
> - while ((c = getopt_long(argc, argv, "d:D:np:P:s:t:U:v",
> + while ((c = getopt_long(argc, argv, "ad:D:np:P:s:t:U:v",
>   long_options, &option_index)) != -1)
>
>
> Isn't the code within this loop missing all the early switch
> validation for checking you are not combining incompatible switches?
>
> e.g.
> --all-databases - "Cannot specify --all-databases more than once"
> --all-databases - "Cannot combine with --database, --publication,
> --subscription, --replication-slot"
> --database - "Cannot combine with --all-databases"
> --publication - "Cannot combine with --all-databases"
> --subscription - "Cannot combine with --all-databases"
> --replication-slot - "Cannot combine with --all-databases"
>
> ~~~
>
> 18.
> +
> +
>   /* Number of object names must match number of databases */
>
> Remove spurious whitespace.
>
> ======

Fixed the given comments. The attached patch at [1] contains the
suggested changes.

[1] - https://www.postgresql.org/message-id/CAHv8Rj%2BDOQ9vvkKCsmDeFKyM073yRfs47Tn_qywQOdx15pmOMA%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



Some review comments for v4-0001.

======
Commit message

1.
This patch enhances the 'pg_createsubscriber' utility by adding the
'--all-databases' option, which automatically fetches all non-template
databases on the source server (publisher) and creates corresponding
subscriptions on the target server (subscriber). This simplifies the process
of converting a physical standby to a logical subscriber, particularly
during upgrades.

When '--all-databases' is used, the tool queries the source server for all
databases and attempts to create subscriptions on the target server for
databases with matching names. The tool auto-generates subscription,publication
and replication slot names using the format:
  - Subscription: '<database>_sub'
  - Publication: '<database>_pub'
  - Replication slot: '<database>_slot'

~

There seems a lot of duplication in the above 2 paragraphs.
Duplication can be removed.
BTW, those generated formats are suspicious -- see a later comment below.

SUGGESTION:
This patch enhances the 'pg_createsubscriber' utility by adding the
'--all-databases' option. When '--all-databases' is specified, the tool
queries the source server (publisher) for all databases and creates
subscriptions on the target server (subscriber) for databases with matching
names. This simplifies the process of converting a physical standby to a
logical subscriber, particularly during upgrades.

The tool auto-generates subscription, publication and replication slot
names using the format:
...

~~~

2.
The patch ensures that conflicting options such as '--all-databases' and
'--database', '--publication', '--subscription', or '--replication-slot' cannot
be used together

~

That wording seems misleading because it makes it sound like
'--publication' and '--database' cannot be used together.

SUGGESTION
The options '--database', '--publication', '--subscription', and
'--replication-slot' cannot be used when '--all-databases' is
specified.

======
doc/src/sgml/ref/pg_createsubscriber.sgml

3.
What about the synopsis (??)

v4 did not address my previous review comment [1, #5] about the synopsis

> 5.
> The synopsis has no mention of --all-databases.
>
> I'm not sure of the best way to do it -- if it becomes too messy to
> combine everything then maybe having several is the way to go:
>
> e.g.
>
> pg_createsubscriber [option...] { -a | --all-databases } { -D |
> --pgdata }datadir { -P | --publisher-server }connstr
> pg_createsubscriber [option...] { -d | --database }dbname { -D |
> --pgdata }datadir { -P | --publisher-server }connstr
>

~~~

4.
+       Automatically fetch all non-template databases from the source server
+       and create subscriptions for databases with the same names on the
+       target server.
+       If a database is present on the source but missing on the target, it is
+       skipped or an error is raised.
+       If a database exists on the target but not on the source, no
+       subscription is created for it.

"it is skipped or an error is raised" (??) So which is it? It cannot be both.

~~~

5.
+       Subscription names, publication names, and replication slot names are
+       automatically generated using the format:
+       <literal><replaceable>database</replaceable>_sub</literal>
+       <literal><replaceable>database</replaceable>_pub</literal> and
+       <literal><replaceable>database</replaceable>_slot</literal>
respectively.
+      </para>

This seems strange to me, because according to the documentation NOTES
section when --database is used and there is no
publication/subscription name etc then all the generated name format
looks like:

------
If the --publication option is not specified, the publication has the
following name pattern: “pg_createsubscriber_%u_%x” (parameter:
database oid, random int). If the --replication-slot option is not
specified, the replication slot has the following name pattern:
“pg_createsubscriber_%u_%x” (parameters: database oid, random int).
------

and

------
If the --subscription option is not specified, the subscription has
the following name pattern: “pg_createsubscriber_%u_%x” (parameters:
database oid, random int).
------

So, why are the generated names different for '--all-databases'?
Actually, I can't see any code even doing what the new documentation
is saying so is the documentation even telling the truth? The commit
message will also be impacted.

~~~

6.
+      <para>
+       <option>--all-databases</option> cannot be used with
+       <option>--publication</option>, <option>--subscription</option>,
+       or <option>--replication-slot</option>.
+      </para>

What about "--database".

~~~

7.
For all the other incompatible options you wrote:

"Cannot be used together with <option>-a</option>."

IMO it might be nioer to give the full name of the option.
e.g. "Cannot be used together with <option>--all-databases</option>."

======
src/bin/pg_basebackup/pg_createsubscriber.c

usage:

8.
+ printf(_("  -a, --all-databases             create subscriptions for
all matching databases on the target\n"));

That seems ambiguous to me. How about something more like below to
clarify the subscription is created on the target.

"for all source databases create subscriptions on the same target database"

~~~

fetch_source_databases:

9.
+ const char *query = "SELECT datname FROM pg_database WHERE
datistemplate = false";

Do we need this variable? It seems simpler/better just to put the SQL in-line.

~~~

main:

10.
  switch (c)
  {
+ case 'a':
+ opt.all_databases = true;
+ break;
+

10a.
Missing the check for duplicate option --all-databases. OTOH maybe you
don't care about this error, because probably it is harmless if you
just ignore it.

~

10b.
Missing the check for --database, --publication, --subscription,
--replication-slot.

(e.g. if user specified the --all-databases *after* those other options)

~~~

11.
+ if (opt.all_databases)
+ {
+ pg_log_error("--replslot cannot be used with --all-databases");
+ exit(1);
+ }

Where'd this name come from? AFAICT there is no such option as "--replslot"

~~~

12.
- * If --database option is not provided, try to obtain the dbname from
- * the publisher conninfo. If dbname parameter is not available, error
- * out.
+ * If neither --database nor --all-databases option is provided, try
+ * to obtain the dbname from the publisher conninfo. If dbname
+ * parameter is not available, error out.

For this comment to make sense I think the previous comment (where
fetch_source_databases is called) needs to explain that when
--all-databases is defined then it is going to treat that internally
as the equivalent of a whole lot of user-specified --database options.

======
.../t/040_pg_createsubscriber.pl

13.
+ qr/cannot specify --publication or --replication-slot when using
--all-databases/,
+ 'fail if --all-databases is used with publication and slot');
+

How are tests expecting this even passing?

Searching the patch I cannot find any such error!

======
[1] https://www.postgresql.org/message-id/CAHut%2BPuyBsOJTSygus2-yp60sw_phwYQ-JyC%2BU6fCBMos9x2LA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



On Mon, Feb 10, 2025 at 7:05 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Some review comments for v4-0001.
>
> ======
> Commit message
>
> 1.
> This patch enhances the 'pg_createsubscriber' utility by adding the
> '--all-databases' option, which automatically fetches all non-template
> databases on the source server (publisher) and creates corresponding
> subscriptions on the target server (subscriber). This simplifies the process
> of converting a physical standby to a logical subscriber, particularly
> during upgrades.
>
> When '--all-databases' is used, the tool queries the source server for all
> databases and attempts to create subscriptions on the target server for
> databases with matching names. The tool auto-generates subscription,publication
> and replication slot names using the format:
>   - Subscription: '<database>_sub'
>   - Publication: '<database>_pub'
>   - Replication slot: '<database>_slot'
>
> ~
>
> There seems a lot of duplication in the above 2 paragraphs.
> Duplication can be removed.
> BTW, those generated formats are suspicious -- see a later comment below.
>
> SUGGESTION:
> This patch enhances the 'pg_createsubscriber' utility by adding the
> '--all-databases' option. When '--all-databases' is specified, the tool
> queries the source server (publisher) for all databases and creates
> subscriptions on the target server (subscriber) for databases with matching
> names. This simplifies the process of converting a physical standby to a
> logical subscriber, particularly during upgrades.
>
> The tool auto-generates subscription, publication and replication slot
> names using the format:
> ...
>
> ~~~
>
> 2.
> The patch ensures that conflicting options such as '--all-databases' and
> '--database', '--publication', '--subscription', or '--replication-slot' cannot
> be used together
>
> ~
>
> That wording seems misleading because it makes it sound like
> '--publication' and '--database' cannot be used together.
>
> SUGGESTION
> The options '--database', '--publication', '--subscription', and
> '--replication-slot' cannot be used when '--all-databases' is
> specified.
>
> ======
> doc/src/sgml/ref/pg_createsubscriber.sgml
>
> 3.
> What about the synopsis (??)
>
> v4 did not address my previous review comment [1, #5] about the synopsis
>
> > 5.
> > The synopsis has no mention of --all-databases.
> >
> > I'm not sure of the best way to do it -- if it becomes too messy to
> > combine everything then maybe having several is the way to go:
> >
> > e.g.
> >
> > pg_createsubscriber [option...] { -a | --all-databases } { -D |
> > --pgdata }datadir { -P | --publisher-server }connstr
> > pg_createsubscriber [option...] { -d | --database }dbname { -D |
> > --pgdata }datadir { -P | --publisher-server }connstr
> >
>
> ~~~
>
> 4.
> +       Automatically fetch all non-template databases from the source server
> +       and create subscriptions for databases with the same names on the
> +       target server.
> +       If a database is present on the source but missing on the target, it is
> +       skipped or an error is raised.
> +       If a database exists on the target but not on the source, no
> +       subscription is created for it.
>
> "it is skipped or an error is raised" (??) So which is it? It cannot be both.
>
> ~~~
>
> 5.
> +       Subscription names, publication names, and replication slot names are
> +       automatically generated using the format:
> +       <literal><replaceable>database</replaceable>_sub</literal>
> +       <literal><replaceable>database</replaceable>_pub</literal> and
> +       <literal><replaceable>database</replaceable>_slot</literal>
> respectively.
> +      </para>
>
> This seems strange to me, because according to the documentation NOTES
> section when --database is used and there is no
> publication/subscription name etc then all the generated name format
> looks like:
>
> ------
> If the --publication option is not specified, the publication has the
> following name pattern: “pg_createsubscriber_%u_%x” (parameter:
> database oid, random int). If the --replication-slot option is not
> specified, the replication slot has the following name pattern:
> “pg_createsubscriber_%u_%x” (parameters: database oid, random int).
> ------
>
> and
>
> ------
> If the --subscription option is not specified, the subscription has
> the following name pattern: “pg_createsubscriber_%u_%x” (parameters:
> database oid, random int).
> ------
>
> So, why are the generated names different for '--all-databases'?
> Actually, I can't see any code even doing what the new documentation
> is saying so is the documentation even telling the truth? The commit
> message will also be impacted.
>
> ~~~
>
> 6.
> +      <para>
> +       <option>--all-databases</option> cannot be used with
> +       <option>--publication</option>, <option>--subscription</option>,
> +       or <option>--replication-slot</option>.
> +      </para>
>
> What about "--database".
>
> ~~~
>
> 7.
> For all the other incompatible options you wrote:
>
> "Cannot be used together with <option>-a</option>."
>
> IMO it might be nioer to give the full name of the option.
> e.g. "Cannot be used together with <option>--all-databases</option>."
>
> ======
> src/bin/pg_basebackup/pg_createsubscriber.c
>
> usage:
>
> 8.
> + printf(_("  -a, --all-databases             create subscriptions for
> all matching databases on the target\n"));
>
> That seems ambiguous to me. How about something more like below to
> clarify the subscription is created on the target.
>
> "for all source databases create subscriptions on the same target database"
>
> ~~~
>
> fetch_source_databases:
>
> 9.
> + const char *query = "SELECT datname FROM pg_database WHERE
> datistemplate = false";
>
> Do we need this variable? It seems simpler/better just to put the SQL in-line.
>
> ~~~
>
> main:
>
> 10.
>   switch (c)
>   {
> + case 'a':
> + opt.all_databases = true;
> + break;
> +
>
> 10a.
> Missing the check for duplicate option --all-databases. OTOH maybe you
> don't care about this error, because probably it is harmless if you
> just ignore it.
>
> ~
>
> 10b.
> Missing the check for --database, --publication, --subscription,
> --replication-slot.
>
> (e.g. if user specified the --all-databases *after* those other options)
>
> ~~~
>
> 11.
> + if (opt.all_databases)
> + {
> + pg_log_error("--replslot cannot be used with --all-databases");
> + exit(1);
> + }
>
> Where'd this name come from? AFAICT there is no such option as "--replslot"
>
> ~~~
>
> 12.
> - * If --database option is not provided, try to obtain the dbname from
> - * the publisher conninfo. If dbname parameter is not available, error
> - * out.
> + * If neither --database nor --all-databases option is provided, try
> + * to obtain the dbname from the publisher conninfo. If dbname
> + * parameter is not available, error out.
>
> For this comment to make sense I think the previous comment (where
> fetch_source_databases is called) needs to explain that when
> --all-databases is defined then it is going to treat that internally
> as the equivalent of a whole lot of user-specified --database options.
>
> ======
> .../t/040_pg_createsubscriber.pl
>
> 13.
> + qr/cannot specify --publication or --replication-slot when using
> --all-databases/,
> + 'fail if --all-databases is used with publication and slot');
> +
>
> How are tests expecting this even passing?
>
> Searching the patch I cannot find any such error!
>
> ======

Fixed the given comments. The attached patch at [1] contains the
suggested changes.

[1] - https://www.postgresql.org/message-id/CAHv8Rj%2BmhYgh8XLFM8sN8J05z0ia%2BknfB1kP6kjbnB55H-b-mw%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



On Mon, 10 Feb 2025 at 20:36, Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
> The attached patch contains the suggested changes.

If a new database is created on the primary server while
pg_createsubscriber is running, the subscription will not be created
for the new database.
To reproduce this issue, follow these steps:
Debug pg_createsubscriber and set a breakpoint after the
fetch_source_databases function, where the databases would have been
prepared.
While execution of pg_createsubscriber is paused, create a new
database on the primary node.
You will observe that the database is created on the standby node, but
the subscription will not be created for the newly created database.
+fetch_source_databases(struct CreateSubscriberOptions *opt)
+{
+       PGconn     *conn;
+       PGresult   *res;
+       int                     num_rows;
+
+       /* Establish a connection to the PostgreSQL server */
+       conn = connect_database(opt->pub_conninfo_str, true);
+
+       res = PQexec(conn, "SELECT datname FROM pg_database WHERE
datistemplate = false AND datallowconn = true");
+
+       /* Check for errors during query execution */
+       if (PQresultStatus(res) != PGRES_TUPLES_OK)

Regards,
Vignesh



Hi Shubham.

Responding with a blanket statement "Fixed the given comments" makes
it difficult to be sure what was done when I come to confirm the
changes. Often embedded questions go unanswered, and if changes are
*not* made, then I can't tell if there was a good reason for rejection
or was the comment just overlooked. Please can you treat the itemised
feedback as a checklist and reply individually to each item. Even just
saying "Fixed" is useful because then I can trust the item was seen
and addressed.

e.g. From previous review [1]
#7. Not fixed. The same docs issue still exists for --publication and
for --subscription.
#13. Unanswered question "How are tests expecting this even passing?".
Was a reason identified? IOW, how can we be sure the latest tests
don't have a similar problem?

//////

Some more review comments for the patch v5-0001.

======
doc/src/sgml/ref/pg_createsubscriber.sgml

Synopsis

1.
pg_createsubscriber [option...] { -a | --all-databases } { -d |
--database }dbname { -D | --pgdata }datadir { -P | --publisher-server
}connstr

This looks misleading because '-all-database' and '--database' cannot
be combined.

I think it will be better to have 2 completely different synopsis like
I had originally suggested [2, comment #5]. E.g. just add a whole
seperate <cmdsynopsis> block adjacent to the other one in the SGML:

------
  <cmdsynopsis>
   <command>pg_createsubscriber</command>
   <arg rep="repeat"><replaceable>option</replaceable></arg>
   <group choice="plain">
    <group choice="req">
     <arg choice="plain"><option>-a</option></arg>
     <arg choice="plain"><option>--all-databases</option></arg>
    </group>
    <group choice="req">
     <arg choice="plain"><option>-D</option> </arg>
     <arg choice="plain"><option>--pgdata</option></arg>
    </group>
    <replaceable>datadir</replaceable>
    <group choice="req">
     <arg choice="plain"><option>-P</option></arg>
     <arg choice="plain"><option>--publisher-server</option></arg>
    </group>
    <replaceable>connstr</replaceable>
   </group>
  </cmdsynopsis>
------

~~~

2. --all-databases

+      <para>
+       <option>--all-databases</option> cannot be used with
+       <option>--database</option>, <option>--publication</option>,
+       <option>--replication-slot</option> or <option>--subscription</option>.
+      </para>

If you want consistent wording with the other places in this patch,
then this should just be the last sentence of the previous paragraph
and be worded like: "Cannot be used together with..."

~~~

3. --publication

-       a generated name is assigned to the publication name.
+       a generated name is assigned to the publication name. Cannot be used
+       together with <option>-a</option>.

Previously reported. Claimed fixed -a to --all-databases. Not fixed.

~~~

4. --subscription

-       a generated name is assigned to the subscription name.
+       a generated name is assigned to the subscription name. Cannot be used
+       together with <option>-a</option>.


(same as the previous review comment).

Previously reported. Claimed fixed -a to --all-databases. Not fixed.

======
src/bin/pg_basebackup/pg_createsubscriber.c

5.
+ bool all_databases; /* fetch and specify all databases */

Comment wording. "and specified" (??). Maybe just "--all-databases
option was specified"

======
[1] https://www.postgresql.org/message-id/CAHut%2BPuYmTyi6kPFnJDTvD%3DaLHd0kyX4VG6iDQVEk-ixov1AwA%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAHut%2BPuyBsOJTSygus2-yp60sw_phwYQ-JyC%2BU6fCBMos9x2LA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



On Tue, Feb 11, 2025 at 9:16 PM Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
> > #13. Unanswered question "How are tests expecting this even passing?".
> > Was a reason identified? IOW, how can we be sure the latest tests
> > don't have a similar problem?
> >
>
> In the v4-0001 patch [1], the tests were mistakenly using
> 'command_fails' instead of 'command_fails_like' to verify failed test
> cases. Since 'command_fails' only checks for a non-zero exit code and
> not specific error messages, the tests were passing even when the
> expected errors were not being triggered correctly.
> To address this, I have updated the patch to use 'command_fails_like',
> ensuring that the test cases now explicitly verify the correct failure
> messages.
>

Ah, that makes sense. Thanks for sharing the reason. So in fact, it
was a valid concern because the v5 was still carrying over the same
flaw from v4... Anyway, it is good to know it is fixed now in v6.

=====

Some general comments for the patch v6-0001:

Do you need to test every possible bad option combination? It may be
fine because the error will be immediately raised so I expect the
execution overhead to be ~zero.

BTW, your bad option combination tests are only using  --all-databases
*after* the other options. Maybe you should mix it up a bit, sometimes
putting it *before* the others as well, because rearranging will cause
different errors.

Everything else now looks good to me.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Dear Shubham,

Thanks for updating the patch! I feel the patch has good shape. Here is a small comment.

```
+       /* Error if no databases were found on the source server */
+       if (num_rows == 0)
+       {
+               pg_log_error("no databases found on the source server");
+               pg_log_error_hint("Ensure that there are user-created databases on the source server.");
+               PQclear(res);
+               disconnect_database(conn, false);
+               exit(1);
+       }
```

I think the error message is not accurate. This error can happen when there are
user-created database but it is created as template. How about below:

```
-               pg_log_error("no databases found on the source server");
-               pg_log_error_hint("Ensure that there are user-created databases on the source server.");
+               pg_log_error("no convertable databases found on the source server");
+               pg_log_error_hint("Ensure that there are non-template and connectable databases on the source
server.");
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED


On Wed, Feb 12, 2025 at 10:58 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Tue, Feb 11, 2025 at 9:16 PM Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
> >
> > > #13. Unanswered question "How are tests expecting this even passing?".
> > > Was a reason identified? IOW, how can we be sure the latest tests
> > > don't have a similar problem?
> > >
> >
> > In the v4-0001 patch [1], the tests were mistakenly using
> > 'command_fails' instead of 'command_fails_like' to verify failed test
> > cases. Since 'command_fails' only checks for a non-zero exit code and
> > not specific error messages, the tests were passing even when the
> > expected errors were not being triggered correctly.
> > To address this, I have updated the patch to use 'command_fails_like',
> > ensuring that the test cases now explicitly verify the correct failure
> > messages.
> >
>
> Ah, that makes sense. Thanks for sharing the reason. So in fact, it
> was a valid concern because the v5 was still carrying over the same
> flaw from v4... Anyway, it is good to know it is fixed now in v6.
>

FYI, I've proposed a patch [1] that can prevent this mistake from
happening in future.

======
[1] https://www.postgresql.org/message-id/CAHut%2BPu-umU%3DCkcSvrqL9eQt85Rv_qfLtekqU-yrjfh%3DGirUQg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia.



On Tue, 11 Feb 2025 at 15:46, Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
> On Tue, Feb 11, 2025 at 6:30 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Hi Shubham.
> >
> > Responding with a blanket statement "Fixed the given comments" makes
> > it difficult to be sure what was done when I come to confirm the
> > changes. Often embedded questions go unanswered, and if changes are
> > *not* made, then I can't tell if there was a good reason for rejection
> > or was the comment just overlooked. Please can you treat the itemised
> > feedback as a checklist and reply individually to each item. Even just
> > saying "Fixed" is useful because then I can trust the item was seen
> > and addressed.
> >
>
> I appreciate the feedback. I will ensure that each item is addressed
> individually.
>
> > e.g. From previous review [1]
> > #7. Not fixed. The same docs issue still exists for --publication and
> > for --subscription.
>
> Fixed this issue in the attached patch.
>
> > #13. Unanswered question "How are tests expecting this even passing?".
> > Was a reason identified? IOW, how can we be sure the latest tests
> > don't have a similar problem?
> >
>
> In the v4-0001 patch [1], the tests were mistakenly using
> 'command_fails' instead of 'command_fails_like' to verify failed test
> cases. Since 'command_fails' only checks for a non-zero exit code and
> not specific error messages, the tests were passing even when the
> expected errors were not being triggered correctly.
> To address this, I have updated the patch to use 'command_fails_like',
> ensuring that the test cases now explicitly verify the correct failure
> messages.
>
> > //////
> >
> > Some more review comments for the patch v5-0001.
> >
> > ======
> > doc/src/sgml/ref/pg_createsubscriber.sgml
> >
> > Synopsis
> >
> > 1.
> > pg_createsubscriber [option...] { -a | --all-databases } { -d |
> > --database }dbname { -D | --pgdata }datadir { -P | --publisher-server
> > }connstr
> >
> > This looks misleading because '-all-database' and '--database' cannot
> > be combined.
> >
> > I think it will be better to have 2 completely different synopsis like
> > I had originally suggested [2, comment #5]. E.g. just add a whole
> > seperate <cmdsynopsis> block adjacent to the other one in the SGML:
> >
> > ------
> >   <cmdsynopsis>
> >    <command>pg_createsubscriber</command>
> >    <arg rep="repeat"><replaceable>option</replaceable></arg>
> >    <group choice="plain">
> >     <group choice="req">
> >      <arg choice="plain"><option>-a</option></arg>
> >      <arg choice="plain"><option>--all-databases</option></arg>
> >     </group>
> >     <group choice="req">
> >      <arg choice="plain"><option>-D</option> </arg>
> >      <arg choice="plain"><option>--pgdata</option></arg>
> >     </group>
> >     <replaceable>datadir</replaceable>
> >     <group choice="req">
> >      <arg choice="plain"><option>-P</option></arg>
> >      <arg choice="plain"><option>--publisher-server</option></arg>
> >     </group>
> >     <replaceable>connstr</replaceable>
> >    </group>
> >   </cmdsynopsis>
> > ------
> >
> > ~~~
> >
>
> Fixed.
>
> > 2. --all-databases
> >
> > +      <para>
> > +       <option>--all-databases</option> cannot be used with
> > +       <option>--database</option>, <option>--publication</option>,
> > +       <option>--replication-slot</option> or <option>--subscription</option>.
> > +      </para>
> >
> > If you want consistent wording with the other places in this patch,
> > then this should just be the last sentence of the previous paragraph
> > and be worded like: "Cannot be used together with..."
> >
> > ~~~
> >
>
> Fixed.
>
> > 3. --publication
> >
> > -       a generated name is assigned to the publication name.
> > +       a generated name is assigned to the publication name. Cannot be used
> > +       together with <option>-a</option>.
> >
> > Previously reported. Claimed fixed -a to --all-databases. Not fixed.
> >
> > ~~~
> >
>
> Fixed.
>
> > 4. --subscription
> >
> > -       a generated name is assigned to the subscription name.
> > +       a generated name is assigned to the subscription name. Cannot be used
> > +       together with <option>-a</option>.
> >
> >
> > (same as the previous review comment).
> >
> > Previously reported. Claimed fixed -a to --all-databases. Not fixed.
> >
>
> Fixed.
>
> > ======
> > src/bin/pg_basebackup/pg_createsubscriber.c
> >
> > 5.
> > + bool all_databases; /* fetch and specify all databases */
> >
> > Comment wording. "and specified" (??). Maybe just "--all-databases
> > option was specified"
> >
> > ======
>
> Fixed.
>
> The attached Patch contains the suggested changes.
>
> [1] - https://www.postgresql.org/message-id/CAHv8Rj%2BDOQ9vvkKCsmDeFKyM073yRfs47Tn_qywQOdx15pmOMA%40mail.gmail.com
>

Hi Shubham,

I reviewed v6 patch, here are some of my comments:

1. Option 'P' is for "publisher-server".
            case 'P':
+               if (opt.all_databases)
+               {
+                   pg_log_error("--publication cannot be used with
--all-databases");
+                   exit(1);
+               }

So, if we provide the "--all-databases" option and then the
"--publisher-server" option. Then the command pg_createsubscriber is
failing, which is wrong.

2. In case we provide "--all-database" option and then "--publication"
option, the pg_createsubscriber command is running. I think the above
condition should be added at "case 2:" instead of "case 'P':"

Thanks and Regards,
Shlok Kyal



On Wed, Feb 12, 2025 at 6:46 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Shubham,
>
> Thanks for updating the patch! I feel the patch has good shape. Here is a small comment.
>
> ```
> +       /* Error if no databases were found on the source server */
> +       if (num_rows == 0)
> +       {
> +               pg_log_error("no databases found on the source server");
> +               pg_log_error_hint("Ensure that there are user-created databases on the source server.");
> +               PQclear(res);
> +               disconnect_database(conn, false);
> +               exit(1);
> +       }
> ```
>
> I think the error message is not accurate. This error can happen when there are
> user-created database but it is created as template. How about below:
>
> ```
> -               pg_log_error("no databases found on the source server");
> -               pg_log_error_hint("Ensure that there are user-created databases on the source server.");
> +               pg_log_error("no convertable databases found on the source server");
> +               pg_log_error_hint("Ensure that there are non-template and connectable databases on the source
server.");
> ```
>

Fixed the given comment. The attached patch at [1] contains the
suggested changes.

[1] - https://www.postgresql.org/message-id/CAHv8RjKc6LMJ86b4yCmwNTn0c65mz0BGMCF-vPJSKDMOGagVGA%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



On Wed, Feb 12, 2025 at 1:15 PM Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
> On Wed, Feb 12, 2025 at 5:29 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > On Tue, Feb 11, 2025 at 9:16 PM Shubham Khanna
> > <khannashubham1197@gmail.com> wrote:
> > >
> > > > #13. Unanswered question "How are tests expecting this even passing?".
> > > > Was a reason identified? IOW, how can we be sure the latest tests
> > > > don't have a similar problem?
> > > >
> > >
> > > In the v4-0001 patch [1], the tests were mistakenly using
> > > 'command_fails' instead of 'command_fails_like' to verify failed test
> > > cases. Since 'command_fails' only checks for a non-zero exit code and
> > > not specific error messages, the tests were passing even when the
> > > expected errors were not being triggered correctly.
> > > To address this, I have updated the patch to use 'command_fails_like',
> > > ensuring that the test cases now explicitly verify the correct failure
> > > messages.
> > >
> >
> > Ah, that makes sense. Thanks for sharing the reason. So in fact, it
> > was a valid concern because the v5 was still carrying over the same
> > flaw from v4... Anyway, it is good to know it is fixed now in v6.
> >
> > =====
> >
> > Some general comments for the patch v6-0001:
> >
> > Do you need to test every possible bad option combination? It may be
> > fine because the error will be immediately raised so I expect the
> > execution overhead to be ~zero.
> >
> > BTW, your bad option combination tests are only using  --all-databases
> > *after* the other options. Maybe you should mix it up a bit, sometimes
> > putting it *before* the others as well, because rearranging will cause
> > different errors.
> >
> > Everything else now looks good to me.
>

Fixed the comment given by Shlok at [1]. The attached patch at [2]
contains the suggested changes.

[1] - https://www.postgresql.org/message-id/CANhcyEXGwFeQd2fuB2txm1maCC2zKyROUR5exEMGzPYdrZ4CPQ%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAHv8RjKc6LMJ86b4yCmwNTn0c65mz0BGMCF-vPJSKDMOGagVGA%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



Some review comments for v7-0001

======
src/bin/pg_basebackup/pg_createsubscriber.c

1.
+ /* Error if no databases were found on the source server */
+ if (num_rows == 0)
+ {
+ pg_log_error("no convertable databases found on the source server");
+ pg_log_error_hint("Ensure that there are non-template and
connectable databases on the source server.");
+ PQclear(res);
+ disconnect_database(conn, false);
+ exit(1);
+ }

1a.
The spelling is "convertible", not "convertable"

~

1b.
What does "convertible databases" mean here, and will that term make
sense to a user? How are we "converting" the source server databases;
AFAIK we are simply connecting to them and adding publications.

IMO a better choice of adjectives can be found below.
"no suitable databases found..."
"no appropriate databases found..."
"no eligible databases found..."
etc.

======
Kind Regards,
Peter Smith.
Fujitsu Australia.



Hi Shubham,
Here are some review comments from my side


On Thu, Feb 13, 2025 at 8:58 AM Shubham Khanna
<khannashubham1197@gmail.com> wrote:
> The attached patch contains the required changes.
>

clusterdb, vacuumdb use -a and -all for all databases. Do we want to
use the same long option name here? Probably they don't have
-databases in the long option name since db is already in their name.
pg_createsubscriber doesn't have db in its name. But still it might be
better to have just -all to be consistent with those utilities. Sorry
if this has already been discussed.

+ printf(_(" -a, --all-databases create subscriptions on the target
for all source databases\n"));

Suggest "create subscriptions for all non-template source databases".
"on the target" seems unnecessary, it's implicit.

+
+# run pg_createsubscriber with '--database' and '--all-databases' and
verify the
+# failure
+command_fails_like(
+ [
+ 'pg_createsubscriber',
+ '--verbose',
+ '--dry-run',
+ '--pgdata' => $node_s->data_dir,
+ '--publisher-server' => $node_p->connstr($db1),
+ '--socketdir' => $node_s->host,
+ '--subscriber-port' => $node_s->port,
+ '--database' => $db1,
+ '--all-databases',
+ ],
+ qr/--all-databases cannot be used with --database/,
+ 'fail if --all-databases is used with --database');

Need a test where --all-databases is specified before --database.

All the tests run in --dry-mode which won't test whether all the
replication slots, subscriptions and publications are working well
when --all-databases is specified. I think we need to test non-dry
mode creation. I may go as far as suggesting to split the current test
file into 3, 1. tests sanity of command line options 2. tests single
subscription creation thoroughly along with multiple --database
specifications 3. tests --all-databases health of the subscriptions
and --all-databases specific scenarios like non-template databases
aren't subscribed to.

--
Best Wishes,
Ashutosh Bapat



On Mon, 10 Feb 2025 at 22:14, vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, 10 Feb 2025 at 20:36, Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
> >
> > The attached patch contains the suggested changes.
>
> If a new database is created on the primary server while
> pg_createsubscriber is running, the subscription will not be created
> for the new database.
> To reproduce this issue, follow these steps:
> Debug pg_createsubscriber and set a breakpoint after the
> fetch_source_databases function, where the databases would have been
> prepared.
> While execution of pg_createsubscriber is paused, create a new
> database on the primary node.
> You will observe that the database is created on the standby node, but
> the subscription will not be created for the newly created database.
> +fetch_source_databases(struct CreateSubscriberOptions *opt)
> +{
> +       PGconn     *conn;
> +       PGresult   *res;
> +       int                     num_rows;
> +
> +       /* Establish a connection to the PostgreSQL server */
> +       conn = connect_database(opt->pub_conninfo_str, true);
> +
> +       res = PQexec(conn, "SELECT datname FROM pg_database WHERE
> datistemplate = false AND datallowconn = true");
> +
> +       /* Check for errors during query execution */
> +       if (PQresultStatus(res) != PGRES_TUPLES_OK)

If we don't have a solution for this, how about the following approach:
a) Add a note in the documentation indicating that the publication and
subscriptions must be created manually for any new databases created
while pg_createsubscriber is running. b) At the end of the
pg_createsubscriber execution, identify any new databases that were
created and list them for the user.

Since this scenario might not be common, we could either implement
just option a) or both a) and b).

Regards,
Vignesh



Dear Vignesh,

> > If a new database is created on the primary server while
> > pg_createsubscriber is running, the subscription will not be created
> > for the new database.
> > To reproduce this issue, follow these steps:
> > Debug pg_createsubscriber and set a breakpoint after the
> > fetch_source_databases function, where the databases would have been
> > prepared.
> > While execution of pg_createsubscriber is paused, create a new
> > database on the primary node.
> > You will observe that the database is created on the standby node, but
> > the subscription will not be created for the newly created database.
> > +fetch_source_databases(struct CreateSubscriberOptions *opt)
> > +{
> > +       PGconn     *conn;
> > +       PGresult   *res;
> > +       int                     num_rows;
> > +
> > +       /* Establish a connection to the PostgreSQL server */
> > +       conn = connect_database(opt->pub_conninfo_str, true);
> > +
> > +       res = PQexec(conn, "SELECT datname FROM pg_database WHERE
> > datistemplate = false AND datallowconn = true");
> > +
> > +       /* Check for errors during query execution */
> > +       if (PQresultStatus(res) != PGRES_TUPLES_OK)
> 
> If we don't have a solution for this, how about the following approach:
> a) Add a note in the documentation indicating that the publication and
> subscriptions must be created manually for any new databases created
> while pg_createsubscriber is running. b) At the end of the
> pg_createsubscriber execution, identify any new databases that were
> created and list them for the user.
> 
> Since this scenario might not be common, we could either implement
> just option a) or both a) and b).

I'm not sure we must really handle the case. Documentation [1] has already described
not to run DDL commands during the command:

```
Since DDL commands are not replicated by logical replication, avoid executing DDL
commands that change the database schema while running pg_createsubscriber.
If the target server has already been converted to logical replica, the DDL commands
might not be replicated, which might cause an error.
```

Thought?

[1]: https://www.postgresql.org/docs/devel/app-pgcreatesubscriber.html

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Dear Shubham,

Thanks for updating the patch. Few comments.

```
+       If a database is present on the source but missing on the target, an
+       error is raised.
```

I'm not sure this description is accurate. I feel there is a case the command can
succeed.

Assuming that there are three databases (db1, db2, and db3) on the primary but db3
is missing on the standby when DBA is running the command. In check_subscriber(),
pg_createsubscriber connects to the db1 and it says OK. Then we wait until all
changes are replicated, thus db3 is created on the target. Everything may go well afterward.

Possible handling is as follows:

1) Ignore the corner case and retain,
2) Remove the description,
3) Modify the description, or
4) Modify codes to reject when the speficied database is missing on the target.

I feel 2) may be enough, but I want to ask others as well.


```
+       If a database exists on the target but not on the source, no
+       subscription is created for it.
```

I do not think this is needed. If the database exists on the target but not on the
source, IIUC this means the DROP DATABASE was executed on the primary server but
not replicated to the standby yet. In this case such databases would be dropped
during the command because we ensure all changes are replicated before the promotion.
Tell me if there are other situations...

```
+               disconnect_database(conn, false);
+               exit(1);
```

This combination can be changed to disconnect_database(conn, true).

Best regards,
Hayato Kuroda
FUJITSU LIMITED


On Thursday, February 13, 2025 11:28 AM Shubham Khanna <khannashubham1197@gmail.com> wrote:

Hi,

> The attached patch contains the required changes.

Thanks for updating the patch. I think it's a useful feature.

Here are few review comments:

1.

+                if (opt.all_databases)
+                {
+                    pg_log_error("--all-databases specified multiple times");
+                    exit(1);
+                }

The check for redundant --all-databases usage seems unnecessary as multiple
specifications do not cause harm or confusion for users. Similar server
commands with an --all option (such as clusterdb/vacuumdb/reindexdb) do not
report errors for it.

2.
+    while ((c = getopt_long(argc, argv, "ad:D:np:P:s:t:U:v",
                             long_options, &option_index)) != -1)
...
+                if (num_dbs > 0)
+                {
+                    pg_log_error("--all-databases cannot be used with --database");
+                    exit(1);
+                }


I think the check for similar wrong combinations should not be done inside the
getopt_long() block. It should be performed after parsing to ensure all
cases are handled which would also be consistent with the methodology followed in
all other commands AFAICS.

Maybe we can write the error message in the format "%s cannot be used with %s"
to reduce the translation workload. 

To be consistent with other error message for wrong option specification in this
command, consider adding pg_log_error_hint("Try \"%s --help\" for more information.").

3.

Ashutosh noted that commands like clusterdb and vacuumdb use the "--all" option
to indicate all databases, and I also found that the same is true for
pg_amcheck command, so I also think it's OK to use "-all" here.

Best Regards,
Hou zj

On Tue, 18 Feb 2025 at 06:22, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> I'm not sure we must really handle the case. Documentation [1] has already described
> not to run DDL commands during the command:
>
> ```
> Since DDL commands are not replicated by logical replication, avoid executing DDL
> commands that change the database schema while running pg_createsubscriber.
> If the target server has already been converted to logical replica, the DDL commands
> might not be replicated, which might cause an error.
> ```

Yes, that makes sense. Since the same applies to tables created while
pg_createsubscriber is running, and there is already generic
documentation covering this, I agree with you.

Regards,
Vignesh



On Tue, Feb 18, 2025 at 7:32 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Shubham,
>
> Thanks for updating the patch. Few comments.
>
> ```
> +       If a database is present on the source but missing on the target, an
> +       error is raised.
> ```
>
> I'm not sure this description is accurate. I feel there is a case the command can
> succeed.
>
> Assuming that there are three databases (db1, db2, and db3) on the primary but db3
> is missing on the standby when DBA is running the command. In check_subscriber(),
> pg_createsubscriber connects to the db1 and it says OK. Then we wait until all
> changes are replicated, thus db3 is created on the target. Everything may go well afterward.
>
> Possible handling is as follows:
>
> 1) Ignore the corner case and retain,
> 2) Remove the description,
> 3) Modify the description, or
> 4) Modify codes to reject when the speficied database is missing on the target.
>
> I feel 2) may be enough, but I want to ask others as well.
>

I agree with you on this. Since there are cases where the command can
still succeed even if a database is initially missing on the target,
the description could be misleading. Based on this, I have removed
that part of the description.

>
> ```
> +       If a database exists on the target but not on the source, no
> +       subscription is created for it.
> ```
>
> I do not think this is needed. If the database exists on the target but not on the
> source, IIUC this means the DROP DATABASE was executed on the primary server but
> not replicated to the standby yet. In this case such databases would be dropped
> during the command because we ensure all changes are replicated before the promotion.
> Tell me if there are other situations...
>

I agree with your reasoning. Since any databases that exist on the
target but not on the source would be dropped as part of ensuring all
changes are replicated before promotion, explicitly mentioning this
scenario isn’t necessary. Based on this, I have removed that part of
the description.

> ```
> +               disconnect_database(conn, false);
> +               exit(1);
> ```
>
> This combination can be changed to disconnect_database(conn, true).
>

Fixed.

The attached patch at [1] contains the suggested changes.
[1] - https://www.postgresql.org/message-id/CAHv8RjKAdrrt3-pF6yHb5gBricB9%3DD7O47Dxe39zRxKkShdpmw%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



On Tue, Feb 18, 2025 at 2:43 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Thursday, February 13, 2025 11:28 AM Shubham Khanna <khannashubham1197@gmail.com> wrote:
>
> Hi,
>
> > The attached patch contains the required changes.
>
> Thanks for updating the patch. I think it's a useful feature.
>
> Here are few review comments:
>
> 1.
>
> +                               if (opt.all_databases)
> +                               {
> +                                       pg_log_error("--all-databases specified multiple times");
> +                                       exit(1);
> +                               }
>
> The check for redundant --all-databases usage seems unnecessary as multiple
> specifications do not cause harm or confusion for users. Similar server
> commands with an --all option (such as clusterdb/vacuumdb/reindexdb) do not
> report errors for it.
>

I agree with your point. Since multiple specifications of '--all' do
not cause any issues or confusion, and other similar commands (like
clusterdb, vacuumdb, and reindexdb) do not enforce this restriction, I
have removed the test case accordingly.

> 2.
> +       while ((c = getopt_long(argc, argv, "ad:D:np:P:s:t:U:v",
>                                                         long_options, &option_index)) != -1)
> ...
> +                               if (num_dbs > 0)
> +                               {
> +                                       pg_log_error("--all-databases cannot be used with --database");
> +                                       exit(1);
> +                               }
>
>
> I think the check for similar wrong combinations should not be done inside the
> getopt_long() block. It should be performed after parsing to ensure all
> cases are handled which would also be consistent with the methodology followed in
> all other commands AFAICS.
>

I agree with your suggestion. To ensure consistency with other
commands, I have moved the check for conflicting options outside the
getopt_long() block. This allows us to handle all cases after parsing
is complete.

> Maybe we can write the error message in the format "%s cannot be used with %s"
> to reduce the translation workload.
>

Fixed.

> To be consistent with other error message for wrong option specification in this
> command, consider adding pg_log_error_hint("Try \"%s --help\" for more information.").
>

Fixed.

> 3.
>
> Ashutosh noted that commands like clusterdb and vacuumdb use the "--all" option
> to indicate all databases, and I also found that the same is true for
> pg_amcheck command, so I also think it's OK to use "-all" here.
>

Fixed.

The attached patch at [1] contains the suggested changes.
[1] - https://www.postgresql.org/message-id/CAHv8RjKAdrrt3-pF6yHb5gBricB9%3DD7O47Dxe39zRxKkShdpmw%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



Dear Shubham,

Thanks for updating the patch!

> I have added a test case for non-dry-run mode to ensure that
> replication slots, subscriptions, and publications work as expected
> when '--all' is specified. Additionally, I have split the test file
> into two parts:
> 1) Command-line sanity checks – Validates the correctness of option parsing.
> 2) '--all' functionality and behavior – Verifies the health of
> subscriptions and ensures that --all specific scenarios, such as
> non-template databases not being subscribed to, are properly handled.
> This should improve test coverage while keeping the structure manageable.

TBH, I feel your change does not separate the test file into the two parts. ISTM
you just added some validation checks and a test how --all option works.
Ashutosh, does it match your suggestion?

Anyway, here are my comments.

01.
```
+    int            num_rows;
```

I think num_rows is not needed, we can do like:

```
        /* Process the query result */
-       num_rows = PQntuples(res);
-       for (int i = 0; i < num_rows; i++)
+       for (int i = 0; i < PQntuples(res); i+
```
And
```
        /* Error if no databases were found on the source server */
-       if (num_rows == 0)
+       if (num_dbs == 0)
```

02.
```
+    opt.all = false;
```

This must be done after initializing recovery_timeout.

03.
```
+# Set up node S1 as standby linking to node P
+$node_p->backup('backup_3');
+my $node_s1 = PostgreSQL::Test::Cluster->new('node_s1');
```

I do not like the name "S1" because this is the second standby sever
from the node_p.

04.
```
+# Verify that only user databases got subscriptions (not template databases)
+my @user_dbs = ($db1, $db2);
+foreach my $dbname (@user_dbs)
+{
+    my $sub_exists =
+      $node_s1->safe_psql($dbname, "SELECT count(*) FROM pg_subscription;");
+    is($sub_exists, '3', "Subscription created successfully for $dbname");
+}
```

Hmm, what do you want to check here? pg_subscription is a global catalog so that
very loop will see exactly the same content. Also, 'postgres' is also the user database.
I feel you must ensure that all three databases (postgres, $db1, and $db2) have a
subscription here.

05.
```
+# Verify replication is working
+$node_p->safe_psql($db1, "INSERT INTO tbl1 VALUES ('replication test');");
+
+$result = $node_s1->safe_psql($db1, "SELECT * FROM tbl1 ORDER BY 1");
+is( $result, qq(first row
+replication test
+second row), "replication successful in $db1");
+
+$node_s1->stop;
 # Run pg_createsubscriber on node S.  --verbose is used twice
 # to show more information.
 command_ok(
```

I'm not sure the part is not needed. This have already been checked in other parts, right?

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Hi Shubham, Here are some review comments for v9-0001.

======
Commit message

1.
You've changed the option to '--all' but the comment message still
refers to '-all-databases'

======
doc/src/sgml/ref/pg_createsubscriber.sgml

2.
+      <para>
+       Automatically fetch all non-template databases from the source server
+       and create subscriptions for databases with the same names on the
+       target server.

I don't see what is "automatic" about this, nor do I know really what
"fetch" of databases is trying to convey.

Isn't it simpler to just say like below?

SUGGESTION
For all source server non-template databases create subscriptions for...

======
src/bin/pg_basebackup/pg_createsubscriber.c

3.
  int recovery_timeout; /* stop recovery after this time */
+ bool all; /* --all option was specified */
 };


IMO 'all' is too generic for a field name; it has almost no meaning --
all what? I think the original 'all_databases' name was better. Or
'all_dbs', but not just 'all'.

~~~

4.
+ if (opt.all)
+ {
+ if (num_dbs > 0)
+ {
+ pg_log_error("%s cannot be used with %s", "--all", "--database");
+ pg_log_error_hint("Try \"%s --help\" for more information.", progname);
+ exit(1);
+ }
+
+ if (num_pubs > 0)
+ {
+ pg_log_error("%s cannot be used with %s", "--all", "--publication");
+ pg_log_error_hint("Try \"%s --help\" for more information.", progname);
+ exit(1);
+ }
+
+ if (num_replslots > 0)
+ {
+ pg_log_error("%s cannot be used with %s", "--all", "--replication-slot");
+ pg_log_error_hint("Try \"%s --help\" for more information.", progname);
+ exit(1);
+ }
+
+ if (num_subs > 0)
+ {
+ pg_log_error("%s cannot be used with %s", "--all", "--subscription");
+ pg_log_error_hint("Try \"%s --help\" for more information.", progname);
+ exit(1);
+ }

4a
If bad combinations exist then IMO it is most likely that the --all
came before the incompatible option. So the parameters should be the
other way around so the result is "--<XXX> cannot be used with --all".

~

4b.
You could eliminate all this code duplication by using a variable for
the bad option,

SUGGESTION
char *bad_switch = NULL;

if (num_dbs > 0)            bad_switch = "--database";
else if (num_pubs > 0)      bad_switch = "--publication";
else if (num_replslots > 0) bad_switch = "--replication_slot";
else if (num_subs > 0)      bad_switch = "--subscription";

if (bad_switch)
{
  pg_log_error("%s cannot be used with --all", bad_switch);
  pg_log_error_hint("Try \"%s --help\" for more information.", progname);
  exit(1);
}

~

4c.
There is a double-blank line after this code fragment.

======
.../t/040_pg_createsubscriber.pl

5.
+# run pg_createsubscriber with '--database' and '--all' and verify the
+# failure
+command_fails_like(
+ [
+ 'pg_createsubscriber',
+ '--verbose',
+ '--dry-run',
+ '--pgdata' => $node_s->data_dir,
+ '--publisher-server' => $node_p->connstr($db1),
+ '--socketdir' => $node_s->host,
+ '--subscriber-port' => $node_s->port,
+ '--database' => $db1,
+ '--all',
+ ],
+ qr/--all cannot be used with --database/,
+ 'fail if --all is used with --database');
+

There is no difference anymore in the logic/error if the options are
given in order "--all --database"  or if they are in order "--database
--all". So you no longer need to have separate test cases for
different ordering.

~~~

6.
+# Verify that only user databases got subscriptions (not template databases)
+my @user_dbs = ($db1, $db2);
+foreach my $dbname (@user_dbs)
+{
+ my $sub_exists =
+   $node_s1->safe_psql($dbname, "SELECT count(*) FROM pg_subscription;");
+ is($sub_exists, '3', "Subscription created successfully for $dbname");
+}

AFAICT this is just checking a global subscription count in a loop so
it will be 3, 3, 3. Why?

I guess you intended to verify that relevant subscriptions were
created for each of the target databases. But this code is not doing
that.

~~~

7.
+# Verify replication is working
+$node_p->safe_psql($db1, "INSERT INTO tbl1 VALUES ('replication test');");
+
+$result = $node_s1->safe_psql($db1, "SELECT * FROM tbl1 ORDER BY 1");
+is( $result, qq(first row
+replication test
+second row), "replication successful in $db1");
+
+$node_s1->stop;
 # Run pg_createsubscriber on node S.  --verbose is used twice
 # to show more information.
 command_ok(
@@ -431,8 +590,9 @@ $result = $node_s->safe_psql($db1,
 is($result, qq(0), 'failover slot was removed');

 # Check result in database $db1
-$result = $node_s->safe_psql($db1, 'SELECT * FROM tbl1');
+$result = $node_s->safe_psql($db1, 'SELECT * FROM tbl1 ORDER BY 1');
 is( $result, qq(first row
+replication test

I concur with Kuroda-san's comment. Maybe remove all this, because
AFAICT existing code was already testing that replication is working
ok for $db1 and $db2.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



On Wed, Feb 19, 2025 at 4:23 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Shubham,
>
> Thanks for updating the patch!
>
> > I have added a test case for non-dry-run mode to ensure that
> > replication slots, subscriptions, and publications work as expected
> > when '--all' is specified. Additionally, I have split the test file
> > into two parts:
> > 1) Command-line sanity checks – Validates the correctness of option parsing.
> > 2) '--all' functionality and behavior – Verifies the health of
> > subscriptions and ensures that --all specific scenarios, such as
> > non-template databases not being subscribed to, are properly handled.
> > This should improve test coverage while keeping the structure manageable.
>
> TBH, I feel your change does not separate the test file into the two parts. ISTM
> you just added some validation checks and a test how --all option works.
> Ashutosh, does it match your suggestion?
>
> Anyway, here are my comments.
>
> 01.
> ```
> +       int                     num_rows;
> ```
>
> I think num_rows is not needed, we can do like:
>
> ```
>         /* Process the query result */
> -       num_rows = PQntuples(res);
> -       for (int i = 0; i < num_rows; i++)
> +       for (int i = 0; i < PQntuples(res); i+
> ```
> And
> ```
>         /* Error if no databases were found on the source server */
> -       if (num_rows == 0)
> +       if (num_dbs == 0)
> ```
>

Fixed.

> 02.
> ```
> +       opt.all = false;
> ```
>
> This must be done after initializing recovery_timeout.
>

Fixed.

> 03.
> ```
> +# Set up node S1 as standby linking to node P
> +$node_p->backup('backup_3');
> +my $node_s1 = PostgreSQL::Test::Cluster->new('node_s1');
> ```
>
> I do not like the name "S1" because this is the second standby sever
> from the node_p.
>

Fixed.

> 04.
> ```
> +# Verify that only user databases got subscriptions (not template databases)
> +my @user_dbs = ($db1, $db2);
> +foreach my $dbname (@user_dbs)
> +{
> +       my $sub_exists =
> +         $node_s1->safe_psql($dbname, "SELECT count(*) FROM pg_subscription;");
> +       is($sub_exists, '3', "Subscription created successfully for $dbname");
> +}
> ```
>
> Hmm, what do you want to check here? pg_subscription is a global catalog so that
> very loop will see exactly the same content. Also, 'postgres' is also the user database.
> I feel you must ensure that all three databases (postgres, $db1, and $db2) have a
> subscription here.
>

Fixed.

> 05.
> ```
> +# Verify replication is working
> +$node_p->safe_psql($db1, "INSERT INTO tbl1 VALUES ('replication test');");
> +
> +$result = $node_s1->safe_psql($db1, "SELECT * FROM tbl1 ORDER BY 1");
> +is( $result, qq(first row
> +replication test
> +second row), "replication successful in $db1");
> +
> +$node_s1->stop;
>  # Run pg_createsubscriber on node S.  --verbose is used twice
>  # to show more information.
>  command_ok(
> ```
>
> I'm not sure the part is not needed. This have already been checked in other parts, right?
>

Fixed.

The attached patch contains the suggested changes.

Thanks and regards,
Shubham Khanna.

Attachment
On Thu, Feb 20, 2025 at 1:23 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shubham, Here are some review comments for v9-0001.
>
> ======
> Commit message
>
> 1.
> You've changed the option to '--all' but the comment message still
> refers to '-all-databases'
>

Fixed.

> ======
> doc/src/sgml/ref/pg_createsubscriber.sgml
>
> 2.
> +      <para>
> +       Automatically fetch all non-template databases from the source server
> +       and create subscriptions for databases with the same names on the
> +       target server.
>
> I don't see what is "automatic" about this, nor do I know really what
> "fetch" of databases is trying to convey.
>
> Isn't it simpler to just say like below?
>
> SUGGESTION
> For all source server non-template databases create subscriptions for...
>

Fixed.

> ======
> src/bin/pg_basebackup/pg_createsubscriber.c
>
> 3.
>   int recovery_timeout; /* stop recovery after this time */
> + bool all; /* --all option was specified */
>  };
>
>
> IMO 'all' is too generic for a field name; it has almost no meaning --
> all what? I think the original 'all_databases' name was better. Or
> 'all_dbs', but not just 'all'.
>
> ~~~

Fixed.

>
> 4.
> + if (opt.all)
> + {
> + if (num_dbs > 0)
> + {
> + pg_log_error("%s cannot be used with %s", "--all", "--database");
> + pg_log_error_hint("Try \"%s --help\" for more information.", progname);
> + exit(1);
> + }
> +
> + if (num_pubs > 0)
> + {
> + pg_log_error("%s cannot be used with %s", "--all", "--publication");
> + pg_log_error_hint("Try \"%s --help\" for more information.", progname);
> + exit(1);
> + }
> +
> + if (num_replslots > 0)
> + {
> + pg_log_error("%s cannot be used with %s", "--all", "--replication-slot");
> + pg_log_error_hint("Try \"%s --help\" for more information.", progname);
> + exit(1);
> + }
> +
> + if (num_subs > 0)
> + {
> + pg_log_error("%s cannot be used with %s", "--all", "--subscription");
> + pg_log_error_hint("Try \"%s --help\" for more information.", progname);
> + exit(1);
> + }
>
> 4a
> If bad combinations exist then IMO it is most likely that the --all
> came before the incompatible option. So the parameters should be the
> other way around so the result is "--<XXX> cannot be used with --all".
>
> ~

Fixed.

>
> 4b.
> You could eliminate all this code duplication by using a variable for
> the bad option,
>
> SUGGESTION
> char *bad_switch = NULL;
>
> if (num_dbs > 0)            bad_switch = "--database";
> else if (num_pubs > 0)      bad_switch = "--publication";
> else if (num_replslots > 0) bad_switch = "--replication_slot";
> else if (num_subs > 0)      bad_switch = "--subscription";
>
> if (bad_switch)
> {
>   pg_log_error("%s cannot be used with --all", bad_switch);
>   pg_log_error_hint("Try \"%s --help\" for more information.", progname);
>   exit(1);
> }
>
> ~
>

Fixed.

> 4c.
> There is a double-blank line after this code fragment.
>

Fixed.

> ======
> .../t/040_pg_createsubscriber.pl
>
> 5.
> +# run pg_createsubscriber with '--database' and '--all' and verify the
> +# failure
> +command_fails_like(
> + [
> + 'pg_createsubscriber',
> + '--verbose',
> + '--dry-run',
> + '--pgdata' => $node_s->data_dir,
> + '--publisher-server' => $node_p->connstr($db1),
> + '--socketdir' => $node_s->host,
> + '--subscriber-port' => $node_s->port,
> + '--database' => $db1,
> + '--all',
> + ],
> + qr/--all cannot be used with --database/,
> + 'fail if --all is used with --database');
> +
>
> There is no difference anymore in the logic/error if the options are
> given in order "--all --database"  or if they are in order "--database
> --all". So you no longer need to have separate test cases for
> different ordering.
>
> ~~~

Fixed.

>
> 6.
> +# Verify that only user databases got subscriptions (not template databases)
> +my @user_dbs = ($db1, $db2);
> +foreach my $dbname (@user_dbs)
> +{
> + my $sub_exists =
> +   $node_s1->safe_psql($dbname, "SELECT count(*) FROM pg_subscription;");
> + is($sub_exists, '3', "Subscription created successfully for $dbname");
> +}
>
> AFAICT this is just checking a global subscription count in a loop so
> it will be 3, 3, 3. Why?
>
> I guess you intended to verify that relevant subscriptions were
> created for each of the target databases. But this code is not doing
> that.
>
> ~~~

Fixed.

>
> 7.
> +# Verify replication is working
> +$node_p->safe_psql($db1, "INSERT INTO tbl1 VALUES ('replication test');");
> +
> +$result = $node_s1->safe_psql($db1, "SELECT * FROM tbl1 ORDER BY 1");
> +is( $result, qq(first row
> +replication test
> +second row), "replication successful in $db1");
> +
> +$node_s1->stop;
>  # Run pg_createsubscriber on node S.  --verbose is used twice
>  # to show more information.
>  command_ok(
> @@ -431,8 +590,9 @@ $result = $node_s->safe_psql($db1,
>  is($result, qq(0), 'failover slot was removed');
>
>  # Check result in database $db1
> -$result = $node_s->safe_psql($db1, 'SELECT * FROM tbl1');
> +$result = $node_s->safe_psql($db1, 'SELECT * FROM tbl1 ORDER BY 1');
>  is( $result, qq(first row
> +replication test
>
> I concur with Kuroda-san's comment. Maybe remove all this, because
> AFAICT existing code was already testing that replication is working
> ok for $db1 and $db2.
>
> ======

Fixed.

The attached patch at [1] contains the suggested changes.

[1] - https://www.postgresql.org/message-id/CAHv8RjLtgrA9odXtwhit1mUfqogNSF4qhkvDrPbxEoWba%2B4SOw%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



Dear Shubham,

Thanks for updating the patch quickly!

> > 04.
> > ```
> > +# Verify that only user databases got subscriptions (not template databases)
> > +my @user_dbs = ($db1, $db2);
> > +foreach my $dbname (@user_dbs)
> > +{
> > +       my $sub_exists =
> > +         $node_s1->safe_psql($dbname, "SELECT count(*) FROM
> pg_subscription;");
> > +       is($sub_exists, '3', "Subscription created successfully for $dbname");
> > +}
> > ```
> >
> > Hmm, what do you want to check here? pg_subscription is a global catalog so
> that
> > very loop will see exactly the same content. Also, 'postgres' is also the user
> database.
> > I feel you must ensure that all three databases (postgres, $db1, and $db2) have
> a
> > subscription here.
> >
> 
> Fixed.

My point was that the loop does not have meaning because pg_subscription
is a global one. I and Peter considered changes like [1] is needed. It ensures
that each databases have a subscription.
Note: [1] cannot pass the test as-is because $db1 and $db2 contains special
characters. Please escape appropriately.

Other comments are listed in below.

01.
```
+     <term><option>-a</option></term>
+     <term><option>--all-dbs</option></term>
```

Peter's comment [2] does not say that option name should be changed.
The scope of his comment is only in the .c file.

02.
```
+# Set up node S2 as standby linking to node P
+$node_p->backup('backup_3');
+my $node_s2 = PostgreSQL::Test::Cluster->new('node_s2');
```

I feel $node_s should be renamed to $node_s1, then you can use $node_s2.

Note that this change may not be needed based on the comment [3].
We may have to separate the test file.

[1]:
```
# Verify that only user databases got subscriptions (not template databases)
my @user_dbs = ('postgres', $db1, $db2);
foreach my $dbname (@user_dbs)
{
    $result =
        $node_s2->safe_psql('postgres',
            "SELECT count(*) FROM pg_subscription, pg_database WHERE subdbid = pg_database.oid and datname =
'$dbname';");

    is($result, '1', "Subscription created successfully for $dbname");
}
```
[2]: https://www.postgresql.org/message-id/CAHut%2BPsatTfk9-F4JBrX_yYE0QGh4wQiTmvS4%3DdnBxcL%3DAK2HA%40mail.gmail.com
[3]: https://www.postgresql.org/message-id/CAExHW5vmMs5nZ6%3DXcCYAXMJrhVrsW7hOovyg%2BP%2BT9Pkuc7DykA%40mail.gmail.com

Best regards,
Hayato Kuroda
FUJITSU LIMITED


On Thu, Feb 20, 2025 at 10:26 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Shubham,
>
> Thanks for updating the patch quickly!
>
> > > 04.
> > > ```
> > > +# Verify that only user databases got subscriptions (not template databases)
> > > +my @user_dbs = ($db1, $db2);
> > > +foreach my $dbname (@user_dbs)
> > > +{
> > > +       my $sub_exists =
> > > +         $node_s1->safe_psql($dbname, "SELECT count(*) FROM
> > pg_subscription;");
> > > +       is($sub_exists, '3', "Subscription created successfully for $dbname");
> > > +}
> > > ```
> > >
> > > Hmm, what do you want to check here? pg_subscription is a global catalog so
> > that
> > > very loop will see exactly the same content. Also, 'postgres' is also the user
> > database.
> > > I feel you must ensure that all three databases (postgres, $db1, and $db2) have
> > a
> > > subscription here.
> > >
> >
> > Fixed.
>
> My point was that the loop does not have meaning because pg_subscription
> is a global one. I and Peter considered changes like [1] is needed. It ensures
> that each databases have a subscription.
> Note: [1] cannot pass the test as-is because $db1 and $db2 contains special
> characters. Please escape appropriately.

Yes. Some test is still needed to confirm the expected subscriptions
all get created for respective dbs. But, the current test loop just
isn't doing it properly.

>
> Other comments are listed in below.
>
> 01.
> ```
> +     <term><option>-a</option></term>
> +     <term><option>--all-dbs</option></term>
> ```
>
> Peter's comment [2] does not say that option name should be changed.
> The scope of his comment is only in the .c file.

Yes, that's correct. My v9 suggestion to change 'all' to 'all_dbs' was
referring only to the field name of struct CreateSubscriberOptions,
nothing else. Not the usage help, not the error messages, not the
docs, not the tests, not the commit message.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



On Fri, Feb 21, 2025 at 5:18 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Feb 20, 2025 at 10:26 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Dear Shubham,
> >
> > Thanks for updating the patch quickly!
> >
> > > > 04.
> > > > ```
> > > > +# Verify that only user databases got subscriptions (not template databases)
> > > > +my @user_dbs = ($db1, $db2);
> > > > +foreach my $dbname (@user_dbs)
> > > > +{
> > > > +       my $sub_exists =
> > > > +         $node_s1->safe_psql($dbname, "SELECT count(*) FROM
> > > pg_subscription;");
> > > > +       is($sub_exists, '3', "Subscription created successfully for $dbname");
> > > > +}
> > > > ```
> > > >
> > > > Hmm, what do you want to check here? pg_subscription is a global catalog so
> > > that
> > > > very loop will see exactly the same content. Also, 'postgres' is also the user
> > > database.
> > > > I feel you must ensure that all three databases (postgres, $db1, and $db2) have
> > > a
> > > > subscription here.
> > > >
> > >
> > > Fixed.
> >
> > My point was that the loop does not have meaning because pg_subscription
> > is a global one. I and Peter considered changes like [1] is needed. It ensures
> > that each databases have a subscription.
> > Note: [1] cannot pass the test as-is because $db1 and $db2 contains special
> > characters. Please escape appropriately.
>
> Yes. Some test is still needed to confirm the expected subscriptions
> all get created for respective dbs. But, the current test loop just
> isn't doing it properly.
>
> >
> > Other comments are listed in below.
> >
> > 01.
> > ```
> > +     <term><option>-a</option></term>
> > +     <term><option>--all-dbs</option></term>
> > ```
> >
> > Peter's comment [2] does not say that option name should be changed.
> > The scope of his comment is only in the .c file.
>
> Yes, that's correct. My v9 suggestion to change 'all' to 'all_dbs' was
> referring only to the field name of struct CreateSubscriberOptions,
> nothing else. Not the usage help, not the error messages, not the
> docs, not the tests, not the commit message.

+1. We don't want yet another option to specify all databases. :)

--
Best Wishes,
Ashutosh Bapat



On Wed, Feb 19, 2025 at 4:23 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Shubham,
>
> Thanks for updating the patch!
>
> > I have added a test case for non-dry-run mode to ensure that
> > replication slots, subscriptions, and publications work as expected
> > when '--all' is specified. Additionally, I have split the test file
> > into two parts:
> > 1) Command-line sanity checks – Validates the correctness of option parsing.
> > 2) '--all' functionality and behavior – Verifies the health of
> > subscriptions and ensures that --all specific scenarios, such as
> > non-template databases not being subscribed to, are properly handled.
> > This should improve test coverage while keeping the structure manageable.
>
> TBH, I feel your change does not separate the test file into the two parts. ISTM
> you just added some validation checks and a test how --all option works.
> Ashutosh, does it match your suggestion?
>

Thanks Hayato for pointing it out. The test changes don't match my
expectations. As you rightly pointed out, I expected two (actually
three if needed) separate test files one for argument validation and
one for testing --database scenarios (the existing tests) and one more
for testing same scenarios when --all is specified. Right now all it
does is "# Verify that only user databases got subscriptions (not
template databases)". I expected testing the actual replication as
well like tests between lines around 527 and 582 in the latest
patchset. Those tests are performed when --database is subscribed. We
need similar tests performed when --all is specified. I didn't find
any of those tests being performed on node_s2. Given that the tests
for --databases and --all will be very similar, having them in the
same test file makes more sense. We also seem to be missing
$node_s2->teardown_node, do we?

--
Best Wishes,
Ashutosh Bapat



On Thu, Feb 20, 2025 at 4:56 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Shubham,
>
> Thanks for updating the patch quickly!
>
> > > 04.
> > > ```
> > > +# Verify that only user databases got subscriptions (not template databases)
> > > +my @user_dbs = ($db1, $db2);
> > > +foreach my $dbname (@user_dbs)
> > > +{
> > > +       my $sub_exists =
> > > +         $node_s1->safe_psql($dbname, "SELECT count(*) FROM
> > pg_subscription;");
> > > +       is($sub_exists, '3', "Subscription created successfully for $dbname");
> > > +}
> > > ```
> > >
> > > Hmm, what do you want to check here? pg_subscription is a global catalog so
> > that
> > > very loop will see exactly the same content. Also, 'postgres' is also the user
> > database.
> > > I feel you must ensure that all three databases (postgres, $db1, and $db2) have
> > a
> > > subscription here.
> > >
> >
> > Fixed.
>
> My point was that the loop does not have meaning because pg_subscription
> is a global one. I and Peter considered changes like [1] is needed. It ensures
> that each databases have a subscription.
> Note: [1] cannot pass the test as-is because $db1 and $db2 contains special
> characters. Please escape appropriately.
>

Fixed.

> Other comments are listed in below.
>
> 01.
> ```
> +     <term><option>-a</option></term>
> +     <term><option>--all-dbs</option></term>
> ```
>
> Peter's comment [2] does not say that option name should be changed.
> The scope of his comment is only in the .c file.
>

Fixed.

> 02.
> ```
> +# Set up node S2 as standby linking to node P
> +$node_p->backup('backup_3');
> +my $node_s2 = PostgreSQL::Test::Cluster->new('node_s2');
> ```
>
> I feel $node_s should be renamed to $node_s1, then you can use $node_s2.
>
> Note that this change may not be needed based on the comment [3].

Fixed.

> We may have to separate the test file.
>
> [1]:
> ```
> # Verify that only user databases got subscriptions (not template databases)
> my @user_dbs = ('postgres', $db1, $db2);
> foreach my $dbname (@user_dbs)
> {
>         $result =
>                 $node_s2->safe_psql('postgres',
>                         "SELECT count(*) FROM pg_subscription, pg_database WHERE subdbid = pg_database.oid and
datname= '$dbname';"); 
>
>         is($result, '1', "Subscription created successfully for $dbname");
> }
> ```

Fixed.

> [2]: https://www.postgresql.org/message-id/CAHut%2BPsatTfk9-F4JBrX_yYE0QGh4wQiTmvS4%3DdnBxcL%3DAK2HA%40mail.gmail.com
> [3]: https://www.postgresql.org/message-id/CAExHW5vmMs5nZ6%3DXcCYAXMJrhVrsW7hOovyg%2BP%2BT9Pkuc7DykA%40mail.gmail.com
>

The attached Patch contains the suggested changes.

Thanks and regards,
Shubham Khanna.

Attachment
On Fri, Feb 21, 2025 at 5:18 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Feb 20, 2025 at 10:26 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Dear Shubham,
> >
> > Thanks for updating the patch quickly!
> >
> > > > 04.
> > > > ```
> > > > +# Verify that only user databases got subscriptions (not template databases)
> > > > +my @user_dbs = ($db1, $db2);
> > > > +foreach my $dbname (@user_dbs)
> > > > +{
> > > > +       my $sub_exists =
> > > > +         $node_s1->safe_psql($dbname, "SELECT count(*) FROM
> > > pg_subscription;");
> > > > +       is($sub_exists, '3', "Subscription created successfully for $dbname");
> > > > +}
> > > > ```
> > > >
> > > > Hmm, what do you want to check here? pg_subscription is a global catalog so
> > > that
> > > > very loop will see exactly the same content. Also, 'postgres' is also the user
> > > database.
> > > > I feel you must ensure that all three databases (postgres, $db1, and $db2) have
> > > a
> > > > subscription here.
> > > >
> > >
> > > Fixed.
> >
> > My point was that the loop does not have meaning because pg_subscription
> > is a global one. I and Peter considered changes like [1] is needed. It ensures
> > that each databases have a subscription.
> > Note: [1] cannot pass the test as-is because $db1 and $db2 contains special
> > characters. Please escape appropriately.
>
> Yes. Some test is still needed to confirm the expected subscriptions
> all get created for respective dbs. But, the current test loop just
> isn't doing it properly.
>

Fixed.

> >
> > Other comments are listed in below.
> >
> > 01.
> > ```
> > +     <term><option>-a</option></term>
> > +     <term><option>--all-dbs</option></term>
> > ```
> >
> > Peter's comment [2] does not say that option name should be changed.
> > The scope of his comment is only in the .c file.
>
> Yes, that's correct. My v9 suggestion to change 'all' to 'all_dbs' was
> referring only to the field name of struct CreateSubscriberOptions,
> nothing else. Not the usage help, not the error messages, not the
> docs, not the tests, not the commit message.
>

Fixed.

The attached patch at [1] contains the suggested changes.

[1] - https://www.postgresql.org/message-id/CAHv8RjJ9BsCg%2Bpur307b1JnfQebnmxFZLw4LdcGX7f-%3D6OK1vw%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



On Fri, Feb 21, 2025 at 9:44 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Fri, Feb 21, 2025 at 5:18 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > On Thu, Feb 20, 2025 at 10:26 PM Hayato Kuroda (Fujitsu)
> > <kuroda.hayato@fujitsu.com> wrote:
> > >
> > > Dear Shubham,
> > >
> > > Thanks for updating the patch quickly!
> > >
> > > > > 04.
> > > > > ```
> > > > > +# Verify that only user databases got subscriptions (not template databases)
> > > > > +my @user_dbs = ($db1, $db2);
> > > > > +foreach my $dbname (@user_dbs)
> > > > > +{
> > > > > +       my $sub_exists =
> > > > > +         $node_s1->safe_psql($dbname, "SELECT count(*) FROM
> > > > pg_subscription;");
> > > > > +       is($sub_exists, '3', "Subscription created successfully for $dbname");
> > > > > +}
> > > > > ```
> > > > >
> > > > > Hmm, what do you want to check here? pg_subscription is a global catalog so
> > > > that
> > > > > very loop will see exactly the same content. Also, 'postgres' is also the user
> > > > database.
> > > > > I feel you must ensure that all three databases (postgres, $db1, and $db2) have
> > > > a
> > > > > subscription here.
> > > > >
> > > >
> > > > Fixed.
> > >
> > > My point was that the loop does not have meaning because pg_subscription
> > > is a global one. I and Peter considered changes like [1] is needed. It ensures
> > > that each databases have a subscription.
> > > Note: [1] cannot pass the test as-is because $db1 and $db2 contains special
> > > characters. Please escape appropriately.
> >
> > Yes. Some test is still needed to confirm the expected subscriptions
> > all get created for respective dbs. But, the current test loop just
> > isn't doing it properly.
> >
> > >
> > > Other comments are listed in below.
> > >
> > > 01.
> > > ```
> > > +     <term><option>-a</option></term>
> > > +     <term><option>--all-dbs</option></term>
> > > ```
> > >
> > > Peter's comment [2] does not say that option name should be changed.
> > > The scope of his comment is only in the .c file.
> >
> > Yes, that's correct. My v9 suggestion to change 'all' to 'all_dbs' was
> > referring only to the field name of struct CreateSubscriberOptions,
> > nothing else. Not the usage help, not the error messages, not the
> > docs, not the tests, not the commit message.
>
> +1. We don't want yet another option to specify all databases. :)
>

Fixed.

The attached patch at [1] contains the suggested changes.

[1] - https://www.postgresql.org/message-id/CAHv8RjJ9BsCg%2Bpur307b1JnfQebnmxFZLw4LdcGX7f-%3D6OK1vw%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



On Fri, Feb 21, 2025 at 10:18 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Wed, Feb 19, 2025 at 4:23 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Dear Shubham,
> >
> > Thanks for updating the patch!
> >
> > > I have added a test case for non-dry-run mode to ensure that
> > > replication slots, subscriptions, and publications work as expected
> > > when '--all' is specified. Additionally, I have split the test file
> > > into two parts:
> > > 1) Command-line sanity checks – Validates the correctness of option parsing.
> > > 2) '--all' functionality and behavior – Verifies the health of
> > > subscriptions and ensures that --all specific scenarios, such as
> > > non-template databases not being subscribed to, are properly handled.
> > > This should improve test coverage while keeping the structure manageable.
> >
> > TBH, I feel your change does not separate the test file into the two parts. ISTM
> > you just added some validation checks and a test how --all option works.
> > Ashutosh, does it match your suggestion?
> >
>
> Thanks Hayato for pointing it out. The test changes don't match my
> expectations. As you rightly pointed out, I expected two (actually
> three if needed) separate test files one for argument validation and
> one for testing --database scenarios (the existing tests) and one more
> for testing same scenarios when --all is specified. Right now all it
> does is "# Verify that only user databases got subscriptions (not
> template databases)". I expected testing the actual replication as
> well like tests between lines around 527 and 582 in the latest
> patchset. Those tests are performed when --database is subscribed. We
> need similar tests performed when --all is specified. I didn't find
> any of those tests being performed on node_s2. Given that the tests
> for --databases and --all will be very similar, having them in the
> same test file makes more sense. We also seem to be missing
> $node_s2->teardown_node, do we?
>

I agree with your point. Right now, my focus is on fixing the patch
first, and I plan to split the test files at the end. That approach
might help streamline the process.
As of the latest patch attached at [1], this issue remains unresolved.
I will proceed with splitting the tests once all other issues are
addressed.

[1] - https://www.postgresql.org/message-id/CAHv8RjJ9BsCg%2Bpur307b1JnfQebnmxFZLw4LdcGX7f-%3D6OK1vw%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



Dear Shubham,

Thanks for updating the patch.

I think the modification [1] is not correct - the loop is meaningless because the same
query would be executed every time. How about idea like attached? Here, instead of
try escaping dbname, dbname is directly obtained from the instance and they are compared.

How do you think?

[1]:
```
+# Verify that only user databases got subscriptions (not template databases)
+my @user_dbs = ('postgres', $db1, $db2);
+foreach my $dbname (@user_dbs)
+{
+    $result = $node_s2->safe_psql('postgres',
+        "SELECT count(*) FROM pg_subscription, pg_database WHERE subdbid = pg_database.oid and datistemplate = 'f';"
+    );
+    is($result, '3', "Subscription created successfully for $dbname");
+    $result = $node_s2->safe_psql('postgres',
+        "SELECT count(*) FROM pg_subscription, pg_database WHERE subdbid = pg_database.oid and datistemplate = 't';"
+    );
+    is($result, '0', "Subscription created successfully for $dbname");
+}
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Attachment
On Fri, Feb 28, 2025 at 11:34 PM Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
>
> The attached Patch contains the suggested changes.
>
> Thanks and regards,
> Shubham Khanna.

Some comments:
1.
+    <varlistentry>
+     <term><option>-a</option></term>
+     <term><option>--all</option></term>
+     <listitem>
+      <para>
+       For all source server non-template databases create subscriptions for
+       create subscriptions for databases with the same names on the
+       target server.
+       Subscription names, publication names, and replication slot names are
+       automatically generated. Cannot be used together with
+       <option>--database</option>, <option>--publication</option>,
+       <option>--replication-slot</option> or <option>--subscription</option>.

Don't start the sentence with "Cannot". Change the sentence to "This
option cannot be used together with ..."
similar sentences used in 3 other places below this as well. Change all of them.

2.
+# Verify that only user databases got subscriptions (not template databases)

change to "Verify that only user databases have subscriptions"

regards,
Ajin Cherian
Fujitsu Australia



On Thu, Mar 6, 2025 at 9:18 AM Ajin Cherian <itsajin@gmail.com> wrote:

> +       Subscription names, publication names, and replication slot names are
> +       automatically generated. Cannot be used together with
> +       <option>--database</option>, <option>--publication</option>,
> +       <option>--replication-slot</option> or <option>--subscription</option>.
>
> Don't start the sentence with "Cannot". Change the sentence to "This
> option cannot be used together with ..."
> similar sentences used in 3 other places below this as well. Change all of them.

I didn't find any instance of "Cannot be" in the *.sgml files, but I
find some instances of "Can be". So it seems we allow such constructs
in the documentation. Any reasons for suggesting this change?

--
Best Wishes,
Ashutosh Bapat



On Mon, Mar 10, 2025 at 3:58 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Thu, Mar 6, 2025 at 9:18 AM Ajin Cherian <itsajin@gmail.com> wrote:
>
> > +       Subscription names, publication names, and replication slot names are
> > +       automatically generated. Cannot be used together with
> > +       <option>--database</option>, <option>--publication</option>,
> > +       <option>--replication-slot</option> or <option>--subscription</option>.
> >
> > Don't start the sentence with "Cannot". Change the sentence to "This
> > option cannot be used together with ..."
> > similar sentences used in 3 other places below this as well. Change all of them.
>
> I didn't find any instance of "Cannot be" in the *.sgml files, but I
> find some instances of "Can be". So it seems we allow such constructs
> in the documentation. Any reasons for suggesting this change?
>

I just don't think it is correct English to start a sentence with
"Cannot be". I checked with grammarly as well.

regards,
Ajin Cherian
Fujitsu Australia



On Fri, Feb 28, 2025 at 6:33 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Shubham,
>
> Thanks for updating the patch.
>
> I think the modification [1] is not correct - the loop is meaningless because the same
> query would be executed every time. How about idea like attached? Here, instead of
> try escaping dbname, dbname is directly obtained from the instance and they are compared.
>
> How do you think?
>
> [1]:
> ```
> +# Verify that only user databases got subscriptions (not template databases)
> +my @user_dbs = ('postgres', $db1, $db2);
> +foreach my $dbname (@user_dbs)
> +{
> +       $result = $node_s2->safe_psql('postgres',
> +               "SELECT count(*) FROM pg_subscription, pg_database WHERE subdbid = pg_database.oid and datistemplate
='f';" 
> +       );
> +       is($result, '3', "Subscription created successfully for $dbname");
> +       $result = $node_s2->safe_psql('postgres',
> +               "SELECT count(*) FROM pg_subscription, pg_database WHERE subdbid = pg_database.oid and datistemplate
='t';" 
> +       );
> +       is($result, '0', "Subscription created successfully for $dbname");
> +}
> ```
>

I agree with your suggestion and have incorporated the proposed
changes in the latest patch. Instead of escaping dbname, I now fetch
it directly from the instance for comparison, making the loop more
meaningful.

Additionally, as suggested by Ashutosh in [1], I have split the
040_pg_createsubscriber.pl file into three parts to improve clarity.

The attached patch includes all the suggested changes.

[1] - https://www.postgresql.org/message-id/CAExHW5uJHYAge99oS_iPfGWwZ_eCr2xFCNnifQkGs2GXeMQKGQ%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



On Thu, Mar 6, 2025 at 9:18 AM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Fri, Feb 28, 2025 at 11:34 PM Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
> >
> >
> > The attached Patch contains the suggested changes.
> >
> > Thanks and regards,
> > Shubham Khanna.
>
> Some comments:
> 1.
> +    <varlistentry>
> +     <term><option>-a</option></term>
> +     <term><option>--all</option></term>
> +     <listitem>
> +      <para>
> +       For all source server non-template databases create subscriptions for
> +       create subscriptions for databases with the same names on the
> +       target server.
> +       Subscription names, publication names, and replication slot names are
> +       automatically generated. Cannot be used together with
> +       <option>--database</option>, <option>--publication</option>,
> +       <option>--replication-slot</option> or <option>--subscription</option>.
>
> Don't start the sentence with "Cannot". Change the sentence to "This
> option cannot be used together with ..."
> similar sentences used in 3 other places below this as well. Change all of them.
>

Fixed.

> 2.
> +# Verify that only user databases got subscriptions (not template databases)
>
> change to "Verify that only user databases have subscriptions"
>

Fixed.

The attached patch contains the suggested changes.

Thanks and regards,
Shubham Khanna.

Attachment
On Mon, Mar 10, 2025 at 10:28 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Thu, Mar 6, 2025 at 9:18 AM Ajin Cherian <itsajin@gmail.com> wrote:
>
> > +       Subscription names, publication names, and replication slot names are
> > +       automatically generated. Cannot be used together with
> > +       <option>--database</option>, <option>--publication</option>,
> > +       <option>--replication-slot</option> or <option>--subscription</option>.
> >
> > Don't start the sentence with "Cannot". Change the sentence to "This
> > option cannot be used together with ..."
> > similar sentences used in 3 other places below this as well. Change all of them.
>
> I didn't find any instance of "Cannot be" in the *.sgml files, but I
> find some instances of "Can be". So it seems we allow such constructs
> in the documentation. Any reasons for suggesting this change?
>
> --

Updated the .sgml file accordingly. The attached patch at [1] contains
the required changes.

[1] - https://www.postgresql.org/message-id/CAHv8Rj%2B37ja589BzqB5bz0ZYGWb5gpnP9of8SoqKc%3DDqLmvxBg%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



On Mon, Mar 10, 2025 at 11:04 AM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Mon, Mar 10, 2025 at 3:58 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Thu, Mar 6, 2025 at 9:18 AM Ajin Cherian <itsajin@gmail.com> wrote:
> >
> > > +       Subscription names, publication names, and replication slot names are
> > > +       automatically generated. Cannot be used together with
> > > +       <option>--database</option>, <option>--publication</option>,
> > > +       <option>--replication-slot</option> or <option>--subscription</option>.
> > >
> > > Don't start the sentence with "Cannot". Change the sentence to "This
> > > option cannot be used together with ..."
> > > similar sentences used in 3 other places below this as well. Change all of them.
> >
> > I didn't find any instance of "Cannot be" in the *.sgml files, but I
> > find some instances of "Can be". So it seems we allow such constructs
> > in the documentation. Any reasons for suggesting this change?
> >
>
> I just don't think it is correct English to start a sentence with
> "Cannot be". I checked with grammarly as well.
>

Fixed.

The attached patch at [1] contains the required changes.

[1] - https://www.postgresql.org/message-id/CAHv8Rj%2B37ja589BzqB5bz0ZYGWb5gpnP9of8SoqKc%3DDqLmvxBg%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



Hi Shubham,

Here are a few comments for the v12 patch.

doc/src/sgml/ref/pg_createsubscriber.sgml :

1.
+      <para>
+       For all source server non-template databases create subscriptions for
+       create subscriptions for databases with the same names on the
+       target server.

is “create subscriptions for” repeated? The sentence doesn’t make sense.

2. Typo
+       switches. This option cannot be together with <option>--all</option>.

/cannot be together/cannot be used together/
~~~

3. src/bin/pg_basebackup/pg_createsubscriber.c :
+ /* Establish a connection to the PostgreSQL server */
+ conn = connect_database(opt->pub_conninfo_str, true);

I think comment will be more clear if say “ Establish a connection to
the primary server */ or “source server”
~~~

src/bin/pg_basebackup/t/042_all_option.pl :

4. As per Ashutosh's comment at [1], tests need to be added to verify
logical replication behavior after using the --all option.
~~~~

Please refer to the attached top-up fix patch, which includes the
above changes along with some cosmetic fixes in the test file
042_all_option.pl.

[1] https://www.postgresql.org/message-id/CAExHW5uJHYAge99oS_iPfGWwZ_eCr2xFCNnifQkGs2GXeMQKGQ%40mail.gmail.com

--
Thanks,
Nisha

Attachment
On Fri, Mar 14, 2025 at 5:43 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
>
> Hi Shubham,
>
> Here are a few comments for the v12 patch.
>
> doc/src/sgml/ref/pg_createsubscriber.sgml :
>
> 1.
> +      <para>
> +       For all source server non-template databases create subscriptions for
> +       create subscriptions for databases with the same names on the
> +       target server.
>
> is “create subscriptions for” repeated? The sentence doesn’t make sense.
>

Fixed.

> 2. Typo
> +       switches. This option cannot be together with <option>--all</option>.
>
> /cannot be together/cannot be used together/
> ~~~
>

Fixed.

> 3. src/bin/pg_basebackup/pg_createsubscriber.c :
> + /* Establish a connection to the PostgreSQL server */
> + conn = connect_database(opt->pub_conninfo_str, true);
>
> I think comment will be more clear if say “ Establish a connection to
> the primary server */ or “source server”
> ~~~
>

Fixed.

> src/bin/pg_basebackup/t/042_all_option.pl :
>
> 4. As per Ashutosh's comment at [1], tests need to be added to verify
> logical replication behavior after using the --all option.
> ~~~~
>
> Please refer to the attached top-up fix patch, which includes the
> above changes along with some cosmetic fixes in the test file
> 042_all_option.pl.
>
> [1] https://www.postgresql.org/message-id/CAExHW5uJHYAge99oS_iPfGWwZ_eCr2xFCNnifQkGs2GXeMQKGQ%40mail.gmail.com
>
> --

Fixed.

The attached patch contains the suggested changes.

Thanks and regards,
Shubham Khanna.

Attachment
On Mon, 17 Mar 2025 at 11:28, Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
> On Fri, Mar 14, 2025 at 5:43 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
>
> Fixed.
>
> The attached patch contains the suggested changes.

I feel like we're trying to address two separate tasks in this thread:
a) Enhancing pg_createsubscriber to automatically retrieve databases
when none is provided. b) Refactoring all pg_createsubscriber tests.

I suggest we keep this thread focused solely on retrieving all
databases and start a new thread for test refactoring. This will allow
us to tackle each task separately and ensure a cleaner commit.

Regards,
Vignesh



On Mon, Mar 17, 2025 at 4:02 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, 17 Mar 2025 at 11:28, Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
> >
> > On Fri, Mar 14, 2025 at 5:43 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> >
> > Fixed.
> >
> > The attached patch contains the suggested changes.
>
> I feel like we're trying to address two separate tasks in this thread:
> a) Enhancing pg_createsubscriber to automatically retrieve databases
> when none is provided. b) Refactoring all pg_createsubscriber tests.
>
> I suggest we keep this thread focused solely on retrieving all
> databases and start a new thread for test refactoring. This will allow
> us to tackle each task separately and ensure a cleaner commit.

I was expecting that the argument validation tests will go in one test
- existing as well as the ones for --all. But that's not how the patch
is splitting them. It has split only the existing test. I am fine if
we add a new test for --all option as the [patch does and leave the
existing test as is. Cramming everything in one test makes it
unmaintainable, though.

--
Best Wishes,
Ashutosh Bapat



On Mon, 17 Mar 2025 at 16:51, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Mon, Mar 17, 2025 at 4:02 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Mon, 17 Mar 2025 at 11:28, Shubham Khanna
> > <khannashubham1197@gmail.com> wrote:
> > >
> > > On Fri, Mar 14, 2025 at 5:43 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > >
> > > Fixed.
> > >
> > > The attached patch contains the suggested changes.
> >
> > I feel like we're trying to address two separate tasks in this thread:
> > a) Enhancing pg_createsubscriber to automatically retrieve databases
> > when none is provided. b) Refactoring all pg_createsubscriber tests.
> >
> > I suggest we keep this thread focused solely on retrieving all
> > databases and start a new thread for test refactoring. This will allow
> > us to tackle each task separately and ensure a cleaner commit.
>
> I was expecting that the argument validation tests will go in one test
> - existing as well as the ones for --all. But that's not how the patch
> is splitting them. It has split only the existing test. I am fine if
> we add a new test for --all option as the [patch does and leave the
> existing test as is. Cramming everything in one test makes it
> unmaintainable, though.

For this patch, let's add an additional test case to
040_pg_createsubscriber.pl and aim to commit it soon, as the rest of
the changes look good. I agree that the test split you suggested is
necessary, but let's handle that in a separate thread.

Regards,
Vignesh



On Tue, Mar 18, 2025 at 4:25 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, 17 Mar 2025 at 16:51, Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Mon, Mar 17, 2025 at 4:02 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Mon, 17 Mar 2025 at 11:28, Shubham Khanna
> > > <khannashubham1197@gmail.com> wrote:
> > > >
> > > > On Fri, Mar 14, 2025 at 5:43 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > > >
> > > > Fixed.
> > > >
> > > > The attached patch contains the suggested changes.
> > >
> > > I feel like we're trying to address two separate tasks in this thread:
> > > a) Enhancing pg_createsubscriber to automatically retrieve databases
> > > when none is provided. b) Refactoring all pg_createsubscriber tests.
> > >
> > > I suggest we keep this thread focused solely on retrieving all
> > > databases and start a new thread for test refactoring. This will allow
> > > us to tackle each task separately and ensure a cleaner commit.
> >
> > I was expecting that the argument validation tests will go in one test
> > - existing as well as the ones for --all. But that's not how the patch
> > is splitting them. It has split only the existing test. I am fine if
> > we add a new test for --all option as the [patch does and leave the
> > existing test as is. Cramming everything in one test makes it
> > unmaintainable, though.
>
> For this patch, let's add an additional test case to
> 040_pg_createsubscriber.pl and aim to commit it soon, as the rest of
> the changes look good. I agree that the test split you suggested is
> necessary, but let's handle that in a separate thread.

I am fine with it. I am just worried about the resultant test being
unreadable and patch hard to review. That's how the first patch was
written.


--
Best Wishes,
Ashutosh Bapat



On Tue, Mar 18, 2025 at 4:25 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, 17 Mar 2025 at 16:51, Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Mon, Mar 17, 2025 at 4:02 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Mon, 17 Mar 2025 at 11:28, Shubham Khanna
> > > <khannashubham1197@gmail.com> wrote:
> > > >
> > > > On Fri, Mar 14, 2025 at 5:43 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > > >
> > > > Fixed.
> > > >
> > > > The attached patch contains the suggested changes.
> > >
> > > I feel like we're trying to address two separate tasks in this thread:
> > > a) Enhancing pg_createsubscriber to automatically retrieve databases
> > > when none is provided. b) Refactoring all pg_createsubscriber tests.
> > >
> > > I suggest we keep this thread focused solely on retrieving all
> > > databases and start a new thread for test refactoring. This will allow
> > > us to tackle each task separately and ensure a cleaner commit.
> >
> > I was expecting that the argument validation tests will go in one test
> > - existing as well as the ones for --all. But that's not how the patch
> > is splitting them. It has split only the existing test. I am fine if
> > we add a new test for --all option as the [patch does and leave the
> > existing test as is. Cramming everything in one test makes it
> > unmaintainable, though.
>
> For this patch, let's add an additional test case to
> 040_pg_createsubscriber.pl and aim to commit it soon, as the rest of
> the changes look good. I agree that the test split you suggested is
> necessary, but let's handle that in a separate thread.
>

I have added an additional test case to 040_pg_createsubscriber.pl as suggested.

The attached patch contains the suggested changes.

Thanks and regards,
Shubham Khanna.

Attachment
On Tue, 18 Mar 2025 at 17:34, Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
> I have added an additional test case to 040_pg_createsubscriber.pl as suggested.
>
> The attached patch contains the suggested changes.

How about we change the below code:
+# Verify that user databases (postgres, $db1, $db2) got subscriptions. Both
+# $db1 and $db2 must be escaped to pass the safe_psql(), but it is difficult.
+# Thus, we define a cursor, obtain a dbname from the instance and compere one
+# by one.
+my @user_dbs = ('postgres', $db1, $db2);
+
+my $bgconn = $node_u->background_psql('postgres');
+$bgconn->query_safe(
+       qq[
+       BEGIN;
+       DECLARE cursor CURSOR FOR SELECT datname FROM pg_subscription,
pg_database
+       WHERE subdbid = pg_database.oid and datistemplate = 'f'
+       ORDER BY pg_database.oid;
+]);
+
+# Fetch from the cursor three times and confirm the existence of the
+# subscription on $dbname
+foreach my $dbname (@user_dbs)
+{
+       my $result = $bgconn->query_safe("FETCH cursor;");
+
+       is($result, $dbname, "subscription is created on $dbname");
+}

like:
$result = $node_u->safe_psql($db1, 'SELECT datname FROM
pg_subscription, pg_database WHERE subdbid = pg_database.oid and
datistemplate = \'f\' ORDER BY pg_database.oid');
is($result, "postgres
$db1
$db2", 'subscription is created on the required databases');

I felt this might simplify your verification logic.

Regards,
Vignesh



On Wed, Mar 19, 2025 at 11:17 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, 18 Mar 2025 at 17:34, Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
> >
> > I have added an additional test case to 040_pg_createsubscriber.pl as suggested.
> >
> > The attached patch contains the suggested changes.
>
> How about we change the below code:
> +# Verify that user databases (postgres, $db1, $db2) got subscriptions. Both
> +# $db1 and $db2 must be escaped to pass the safe_psql(), but it is difficult.
> +# Thus, we define a cursor, obtain a dbname from the instance and compere one
> +# by one.
> +my @user_dbs = ('postgres', $db1, $db2);
> +
> +my $bgconn = $node_u->background_psql('postgres');
> +$bgconn->query_safe(
> +       qq[
> +       BEGIN;
> +       DECLARE cursor CURSOR FOR SELECT datname FROM pg_subscription,
> pg_database
> +       WHERE subdbid = pg_database.oid and datistemplate = 'f'
> +       ORDER BY pg_database.oid;
> +]);
> +
> +# Fetch from the cursor three times and confirm the existence of the
> +# subscription on $dbname
> +foreach my $dbname (@user_dbs)
> +{
> +       my $result = $bgconn->query_safe("FETCH cursor;");
> +
> +       is($result, $dbname, "subscription is created on $dbname");
> +}
>
> like:
> $result = $node_u->safe_psql($db1, 'SELECT datname FROM
> pg_subscription, pg_database WHERE subdbid = pg_database.oid and
> datistemplate = \'f\' ORDER BY pg_database.oid');
> is($result, "postgres
> $db1
> $db2", 'subscription is created on the required databases');
>
> I felt this might simplify your verification logic.
>

I agree with you on this; switching to a single query with safe_psql()
will indeed simplify the test and make the verification logic cleaner.

The attached patch contains the suggested change.

Thanks and regards,
Shubham Khanna.

Attachment
On Wed, 19 Mar 2025 at 11:44, Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
>
> I agree with you on this; switching to a single query with safe_psql()
> will indeed simplify the test and make the verification logic cleaner.
>
> The attached patch contains the suggested change.

Few comments:
1) I felt we are not doing anything specific to dry-run, so no need to
have this test case:
+# run pg_createsubscriber with '--all' and '--database' and verify the
+# failure
+command_fails_like(
+       [
+               'pg_createsubscriber',
+               '--verbose',
+               '--dry-run',
+               '--pgdata' => $node_s->data_dir,
+               '--publisher-server' => $node_p->connstr($db1),
+               '--socketdir' => $node_s->host,
+               '--subscriber-port' => $node_s->port,
+               '--all',
+               '--database' => $db1,
+       ],
+       qr/--database cannot be used with --all/,
+       'fail if --database is used with --all');
+

2) Similarly this too:
+# run pg_createsubscriber with '--all' option
+command_ok(
+       [
+               'pg_createsubscriber',
+               '--verbose',
+               '--dry-run',
+               '--pgdata' => $node_s->data_dir,
+               '--publisher-server' => $node_p->connstr($db1),
+               '--socketdir' => $node_s->host,
+               '--subscriber-port' => $node_s->port,
+               '--all',
+       ],
+       'run pg_createsubscriber with --all');
+

3) We have run other pg_createsubscriber success tests with
recovery-timeout specified, so I suggest modifying this too similarly:
+# run pg_createsubscriber with '--all' option without '--dry-run'
+command_ok(
+       [
+               'pg_createsubscriber',
+               '--verbose',
+               '--pgdata' => $node_u->data_dir,
+               '--publisher-server' => $node_p->connstr($db1),
+               '--socketdir' => $node_u->host,
+               '--subscriber-port' => $node_u->port,
+               '--all',
+       ],
+       'run pg_createsubscriber with --all');
+
+$node_u->start;

4) You can try to see how much extra time it takes to run the tests,
if takes more time, then you can think of the following changes a) few
of the option validation test cases can be removed, we can have just
one combination of all with either of
publication/subscription/replication slot b) for the last test added,
you can drop one of the dbs and verify the subscriptions are created
in 2 dbs as the code flow is the same for all databases.

Regards,
Vignesh



On Wed, Mar 19, 2025 at 2:29 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 19 Mar 2025 at 11:44, Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
> >
> >
> > I agree with you on this; switching to a single query with safe_psql()
> > will indeed simplify the test and make the verification logic cleaner.
> >
> > The attached patch contains the suggested change.
>
> Few comments:
> 1) I felt we are not doing anything specific to dry-run, so no need to
> have this test case:
> +# run pg_createsubscriber with '--all' and '--database' and verify the
> +# failure
> +command_fails_like(
> +       [
> +               'pg_createsubscriber',
> +               '--verbose',
> +               '--dry-run',
> +               '--pgdata' => $node_s->data_dir,
> +               '--publisher-server' => $node_p->connstr($db1),
> +               '--socketdir' => $node_s->host,
> +               '--subscriber-port' => $node_s->port,
> +               '--all',
> +               '--database' => $db1,
> +       ],
> +       qr/--database cannot be used with --all/,
> +       'fail if --database is used with --all');
> +
>

Fixed.

> 2) Similarly this too:
> +# run pg_createsubscriber with '--all' option
> +command_ok(
> +       [
> +               'pg_createsubscriber',
> +               '--verbose',
> +               '--dry-run',
> +               '--pgdata' => $node_s->data_dir,
> +               '--publisher-server' => $node_p->connstr($db1),
> +               '--socketdir' => $node_s->host,
> +               '--subscriber-port' => $node_s->port,
> +               '--all',
> +       ],
> +       'run pg_createsubscriber with --all');
> +
>

Fixed.

> 3) We have run other pg_createsubscriber success tests with
> recovery-timeout specified, so I suggest modifying this too similarly:
> +# run pg_createsubscriber with '--all' option without '--dry-run'
> +command_ok(
> +       [
> +               'pg_createsubscriber',
> +               '--verbose',
> +               '--pgdata' => $node_u->data_dir,
> +               '--publisher-server' => $node_p->connstr($db1),
> +               '--socketdir' => $node_u->host,
> +               '--subscriber-port' => $node_u->port,
> +               '--all',
> +       ],
> +       'run pg_createsubscriber with --all');
> +
> +$node_u->start;
>

Fixed.

> 4) You can try to see how much extra time it takes to run the tests,
> if takes more time, then you can think of the following changes a) few
> of the option validation test cases can be removed, we can have just
> one combination of all with either of
> publication/subscription/replication slot b) for the last test added,
> you can drop one of the dbs and verify the subscriptions are created
> in 2 dbs as the code flow is the same for all databases.
>

I have created two patches, v16-0001 and v16-0002, to address the
performance issue. I conducted performance testing, and here are the
results:
- The difference in execution time between HEAD and the v15 patch was 53.2%.
- After removing the suggested test cases, the difference reduced to
36.43%, showing a significant improvement.

The optimizations included:
- Reducing redundant option validation test cases by retaining only
one combination of publication/subscription/replication slot.
- Dropping one of the databases in the last test case and verifying
that subscriptions are created correctly in the remaining two
databases, as the code flow remains consistent across databases.

The attached patches contain the suggested changes.

Thanks and regards,
Shubham Khanna.

Attachment
On Thu, 20 Mar 2025 at 10:25, Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
>
> I have created two patches, v16-0001 and v16-0002, to address the
> performance issue. I conducted performance testing, and here are the
> results:
> - The difference in execution time between HEAD and the v15 patch was 53.2%.
> - After removing the suggested test cases, the difference reduced to
> 36.43%, showing a significant improvement.
>

It is still taking quite a while, can we compare with the following
too see how much extra time it takes:
a) remove insert and select verification for the all database
successful tests as all of these are logical replication verification
which is extensively tested b) remove command_fails_like failure tests
c) remove both of above and see.

Regards,
Vignesh



On Thu, Mar 20, 2025 at 10:25 AM Shubham Khanna
<khannashubham1197@gmail.com> wrote:

>
> The attached patches contain the suggested changes.
>

I have started reviewing the patches again. Here are some review comments

    <variablelist>
+    <varlistentry>
+     <term><option>-a</option></term>
+     <term><option>--all</option></term>
+     <listitem>
+      <para>
+       For all source server non-template databases create subscriptions for
+       databases with the same names on the target server.

In this construct "all" goes with "source server" but it should go
with "non-template database". How about, "For every non-template
database on the source server, create one subscription on the target
server in the database with the same name."?

+       Subscription names, publication names, and replication slot names are
+       automatically generated. This option cannot be used together with
+       <option>--database</option>, <option>--publication</option>,
+       <option>--replication-slot</option> or <option>--subscription</option>.

... used along with ...

also comma before or .

We should also mention that generated names of replication slots,
publications and subscriptions are used when using this option.

+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry>
      <term><option>-d <replaceable
class="parameter">dbname</replaceable></option></term>
      <term><option>--database=<replaceable
class="parameter">dbname</replaceable></option></term>
@@ -94,9 +130,10 @@ PostgreSQL documentation
       <para>
        The name of the database in which to create a subscription.  Multiple
        databases can be selected by writing multiple <option>-d</option>
-       switches. If <option>-d</option> option is not provided, the database
-       name will be obtained from <option>-P</option> option. If the database
-       name is not specified in either the <option>-d</option> option or
+       switches. This option cannot be used together with
<option>--all</option>.
+       If <option>-d</option> option is not provided, the database name will be
+       obtained from <option>-P</option> option. If the database name is not
+       specified in either the <option>-d</option> option or
        <option>-P</option> option, an error will be reported.

We have to mention -a as well here. Something like "If the database
name is not specified in either the <option>-d</option> option or
<option>-P</option> option, and <option>-a</option> is not provided,
an error will be reported."

+# run pg_createsubscriber with '--database' and '--all' without '--dry-run'
+# and verify the failure
+command_fails_like(
+ [
+ 'pg_createsubscriber',
+ '--verbose',
+ '--pgdata' => $node_s->data_dir,
+ '--publisher-server' => $node_p->connstr($db1),
+ '--socketdir' => $node_s->host,
+ '--subscriber-port' => $node_s->port,
+ '--database' => $db1,
+ '--all',
+ ],
+ qr/--database cannot be used with --all/,
+ 'fail if --database is used with --all');


Why does only this test not use --dry-run, but all other such tests
use --dry-run? Either they should all use it or not use it.

+
+# run pg_createsubscriber with '--publication' and '--all' and verify
+# the failure
+command_fails_like(
+ [
+ 'pg_createsubscriber',
+ '--verbose',
+ '--dry-run',
+ '--pgdata' => $node_s->data_dir,
+ '--publisher-server' => $node_p->connstr($db1),
+ '--socketdir' => $node_s->host,
+ '--subscriber-port' => $node_s->port,
+ '--all',
+ '--publication' => 'pub1',
+ ],
+ qr/--publication cannot be used with --all/,
+ 'fail if --publication is used with --all');
+
+# run pg_createsubscriber with '--replication-slot' and '--all' and
+# verify the failure
+command_fails_like(
+ [
+ 'pg_createsubscriber',
+ '--verbose',
+ '--dry-run',
+ '--pgdata' => $node_s->data_dir,
+ '--publisher-server' => $node_p->connstr($db1),
+ '--socketdir' => $node_s->host,
+ '--subscriber-port' => $node_s->port,
+ '--replication-slot' => 'replslot1',
+ '--all',
+ ],
+ qr/--replication-slot cannot be used with --all/,
+ 'fail if --replication-slot is used with --all');
+
+# run pg_createsubscriber with '--subscription' and '--all' and
+# verify the failure
+command_fails_like(
+ [
+ 'pg_createsubscriber',
+ '--verbose',
+ '--dry-run',
+ '--pgdata' => $node_s->data_dir,
+ '--publisher-server' => $node_p->connstr($db1),
+ '--socketdir' => $node_s->host,
+ '--subscriber-port' => $node_s->port,
+ '--all',
+ '--subscription' => 'sub1',
+ ],
+ qr/--subscription cannot be used with --all/,
+ 'fail if --subscription is used with --all');
+

Just to cover all combinations we need to test both scenarios. where
--all comes before an incompatible option and vice versa.

 # Run pg_createsubscriber on node S.  --verbose is used twice
 # to show more information.
 # In passing, also test the --enable-two-phase option
@@ -459,10 +526,93 @@ my $sysid_s = $node_s->safe_psql('postgres',
  'SELECT system_identifier FROM pg_control_system()');
 ok($sysid_p != $sysid_s, 'system identifier was changed');

+# On node P create test tables
+$node_p->safe_psql('postgres', 'CREATE TABLE tbl1 (a text)');
+$node_p->safe_psql($db1, 'CREATE TABLE tbl2 (a text)');
+$node_p->safe_psql($db1, "INSERT INTO tbl2 VALUES('first row')");
+$node_p->safe_psql($db2, 'CREATE TABLE tbl3 (a text)');

$db1 and $db2 already have tables. We can avoid creating new ones here.

+
+# Wait subscriber to catch up
+$node_u->wait_for_subscription_sync($node_p, $subnames1[0]);
+$node_u->wait_for_subscription_sync($node_p, $subnames1[1]);
+$node_u->wait_for_subscription_sync($node_p, $subnames1[2]);
+

This function just makes sure that initial sync is done. But it
doesn't wait for replication to catch up with current state of the
publisher. It's the latter which would make sure that the last INSERTS
are visible. We should be using wait_for_slot_catchup() on the
publisher.

+# Check result in database 'postgres' of node U
+$result = $node_u->safe_psql('postgres', 'SELECT * FROM tbl1');
+is($result, qq(first row), "logical replication works in database postgres");
+
+# Check result in database $db1 of node U
+$result = $node_u->safe_psql($db1, 'SELECT * FROM tbl2');
+is( $result, qq(first row
+second row),
+ "logical replication works in database $db1");
+
+# Check result in database $db2 of node U
+$result = $node_u->safe_psql($db2, 'SELECT * FROM tbl3');
+is($result, qq(first row), "logical replication works in database $db2");
+

These tests may not always pass if we use wait_for_slot_catchup above.

--
Best Wishes,
Ashutosh Bapat



On Thu, Mar 20, 2025 at 3:41 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 20 Mar 2025 at 10:25, Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
> >
> >
> > I have created two patches, v16-0001 and v16-0002, to address the
> > performance issue. I conducted performance testing, and here are the
> > results:
> > - The difference in execution time between HEAD and the v15 patch was 53.2%.
> > - After removing the suggested test cases, the difference reduced to
> > 36.43%, showing a significant improvement.
> >
>
> It is still taking quite a while, can we compare with the following
> too see how much extra time it takes:
> a) remove insert and select verification for the all database
> successful tests as all of these are logical replication verification
> which is extensively tested

If the intent is to remove those, I don't think it's a good idea.
Other tests may be testing replication but they don't test the setup
created by --all, which consists of objects generated with names
crafted by utility.

Said that we should also test the behaviour when such objects already
exist. But I guess randomness in generated names is hard to control.

--
Best Wishes,
Ashutosh Bapat



On Thu, Mar 20, 2025 at 4:53 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Thu, Mar 20, 2025 at 10:25 AM Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
>
>
> +# run pg_createsubscriber with '--database' and '--all' without '--dry-run'
> +# and verify the failure
> +command_fails_like(
> + [
> + 'pg_createsubscriber',
> + '--verbose',
> + '--pgdata' => $node_s->data_dir,
> + '--publisher-server' => $node_p->connstr($db1),
> + '--socketdir' => $node_s->host,
> + '--subscriber-port' => $node_s->port,
> + '--database' => $db1,
> + '--all',
> + ],
> + qr/--database cannot be used with --all/,
> + 'fail if --database is used with --all');
>
>
> Why does only this test not use --dry-run, but all other such tests
> use --dry-run? Either they should all use it or not use it.
>

I think this patch is adding a lot of extra tests which I don't think
are required. We should have one positive test in --dry-run mode
similar to ('run pg_createsubscriber without --databases') and
probably verify in the LOG that it has expanded commands for all
databases. Also, for negative tests, one or two tests are sufficient
instead of testing all possible combinations. I don't expect this new
option to add a noticeable testing time.

*  <refsynopsisdiv>
+  <cmdsynopsis>
+   <command>pg_createsubscriber</command>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+   <group choice="plain">
+    <group choice="req">
+     <arg choice="plain"><option>-a</option></arg>
+     <arg choice="plain"><option>--all</option></arg>
+    </group>
+    <group choice="req">
+     <arg choice="plain"><option>-D</option> </arg>
+     <arg choice="plain"><option>--pgdata</option></arg>
+    </group>
+    <replaceable>datadir</replaceable>
+    <group choice="req">
+     <arg choice="plain"><option>-P</option></arg>
+     <arg choice="plain"><option>--publisher-server</option></arg>
+    </group>
+    <replaceable>connstr</replaceable>

Most of this is unrelated to this patch. I suggest making a top-up
patch, we can commit it separately.

--
With Regards,
Amit Kapila.



On Thu, Mar 20, 2025 at 4:53 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Thu, Mar 20, 2025 at 10:25 AM Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
>
> >
> > The attached patches contain the suggested changes.
> >
>
> I have started reviewing the patches again. Here are some review comments
>
>     <variablelist>
> +    <varlistentry>
> +     <term><option>-a</option></term>
> +     <term><option>--all</option></term>
> +     <listitem>
> +      <para>
> +       For all source server non-template databases create subscriptions for
> +       databases with the same names on the target server.
>
> In this construct "all" goes with "source server" but it should go
> with "non-template database". How about, "For every non-template
> database on the source server, create one subscription on the target
> server in the database with the same name."?
>

Fixed.

> +       Subscription names, publication names, and replication slot names are
> +       automatically generated. This option cannot be used together with
> +       <option>--database</option>, <option>--publication</option>,
> +       <option>--replication-slot</option> or <option>--subscription</option>.
>
> ... used along with ...
>
> also comma before or .
>
> We should also mention that generated names of replication slots,
> publications and subscriptions are used when using this option.
>
> +      </para>
> +     </listitem>
> +    </varlistentry>
> +
>      <varlistentry>
>       <term><option>-d <replaceable
> class="parameter">dbname</replaceable></option></term>
>       <term><option>--database=<replaceable
> class="parameter">dbname</replaceable></option></term>

Fixed.

> @@ -94,9 +130,10 @@ PostgreSQL documentation
>        <para>
>         The name of the database in which to create a subscription.  Multiple
>         databases can be selected by writing multiple <option>-d</option>
> -       switches. If <option>-d</option> option is not provided, the database
> -       name will be obtained from <option>-P</option> option. If the database
> -       name is not specified in either the <option>-d</option> option or
> +       switches. This option cannot be used together with
> <option>--all</option>.
> +       If <option>-d</option> option is not provided, the database name will be
> +       obtained from <option>-P</option> option. If the database name is not
> +       specified in either the <option>-d</option> option or
>         <option>-P</option> option, an error will be reported.
>
> We have to mention -a as well here. Something like "If the database
> name is not specified in either the <option>-d</option> option or
> <option>-P</option> option, and <option>-a</option> is not provided,
> an error will be reported."
>

Fixed.

> +# run pg_createsubscriber with '--database' and '--all' without '--dry-run'
> +# and verify the failure
> +command_fails_like(
> + [
> + 'pg_createsubscriber',
> + '--verbose',
> + '--pgdata' => $node_s->data_dir,
> + '--publisher-server' => $node_p->connstr($db1),
> + '--socketdir' => $node_s->host,
> + '--subscriber-port' => $node_s->port,
> + '--database' => $db1,
> + '--all',
> + ],
> + qr/--database cannot be used with --all/,
> + 'fail if --database is used with --all');
>
>
> Why does only this test not use --dry-run, but all other such tests
> use --dry-run? Either they should all use it or not use it.
>
> +
> +# run pg_createsubscriber with '--publication' and '--all' and verify
> +# the failure
> +command_fails_like(
> + [
> + 'pg_createsubscriber',
> + '--verbose',
> + '--dry-run',
> + '--pgdata' => $node_s->data_dir,
> + '--publisher-server' => $node_p->connstr($db1),
> + '--socketdir' => $node_s->host,
> + '--subscriber-port' => $node_s->port,
> + '--all',
> + '--publication' => 'pub1',
> + ],
> + qr/--publication cannot be used with --all/,
> + 'fail if --publication is used with --all');
> +
> +# run pg_createsubscriber with '--replication-slot' and '--all' and
> +# verify the failure
> +command_fails_like(
> + [
> + 'pg_createsubscriber',
> + '--verbose',
> + '--dry-run',
> + '--pgdata' => $node_s->data_dir,
> + '--publisher-server' => $node_p->connstr($db1),
> + '--socketdir' => $node_s->host,
> + '--subscriber-port' => $node_s->port,
> + '--replication-slot' => 'replslot1',
> + '--all',
> + ],
> + qr/--replication-slot cannot be used with --all/,
> + 'fail if --replication-slot is used with --all');
> +
> +# run pg_createsubscriber with '--subscription' and '--all' and
> +# verify the failure
> +command_fails_like(
> + [
> + 'pg_createsubscriber',
> + '--verbose',
> + '--dry-run',
> + '--pgdata' => $node_s->data_dir,
> + '--publisher-server' => $node_p->connstr($db1),
> + '--socketdir' => $node_s->host,
> + '--subscriber-port' => $node_s->port,
> + '--all',
> + '--subscription' => 'sub1',
> + ],
> + qr/--subscription cannot be used with --all/,
> + 'fail if --subscription is used with --all');
> +
>
> Just to cover all combinations we need to test both scenarios. where
> --all comes before an incompatible option and vice versa.
>
>  # Run pg_createsubscriber on node S.  --verbose is used twice
>  # to show more information.
>  # In passing, also test the --enable-two-phase option
> @@ -459,10 +526,93 @@ my $sysid_s = $node_s->safe_psql('postgres',
>   'SELECT system_identifier FROM pg_control_system()');
>  ok($sysid_p != $sysid_s, 'system identifier was changed');
>
> +# On node P create test tables
> +$node_p->safe_psql('postgres', 'CREATE TABLE tbl1 (a text)');
> +$node_p->safe_psql($db1, 'CREATE TABLE tbl2 (a text)');
> +$node_p->safe_psql($db1, "INSERT INTO tbl2 VALUES('first row')");
> +$node_p->safe_psql($db2, 'CREATE TABLE tbl3 (a text)');
>
> $db1 and $db2 already have tables. We can avoid creating new ones here.
>

Fixed.

> +
> +# Wait subscriber to catch up
> +$node_u->wait_for_subscription_sync($node_p, $subnames1[0]);
> +$node_u->wait_for_subscription_sync($node_p, $subnames1[1]);
> +$node_u->wait_for_subscription_sync($node_p, $subnames1[2]);
> +
>
> This function just makes sure that initial sync is done. But it
> doesn't wait for replication to catch up with current state of the
> publisher. It's the latter which would make sure that the last INSERTS
> are visible. We should be using wait_for_slot_catchup() on the
> publisher.
>
> +# Check result in database 'postgres' of node U
> +$result = $node_u->safe_psql('postgres', 'SELECT * FROM tbl1');
> +is($result, qq(first row), "logical replication works in database postgres");
> +
> +# Check result in database $db1 of node U
> +$result = $node_u->safe_psql($db1, 'SELECT * FROM tbl2');
> +is( $result, qq(first row
> +second row),
> + "logical replication works in database $db1");
> +
> +# Check result in database $db2 of node U
> +$result = $node_u->safe_psql($db2, 'SELECT * FROM tbl3');
> +is($result, qq(first row), "logical replication works in database $db2");
> +
>
> These tests may not always pass if we use wait_for_slot_catchup above.
>

During the recent testing, I observed that the tests were failing when
using wait_for_slot_catchup(). To address this, I reverted to using
wait_for_subscription_sync(), which was employed previously and has
proven to be more reliable in ensuring test stability.
Please let me know if there are any additional adjustments you would
suggest or if you would like me to investigate further into
wait_for_slot_catchup().

I have created a separate patch for the synopsis of '--all' option as
suggested by Amit at [1]. The attached patch contains the suggested
changes.

[1] - https://www.postgresql.org/message-id/CAA4eK1KUDEO0t6i16_CcEpg33sgcgEddHcdVC_q8j4tVUb5FWw%40mail.gmail.com

Thanks and regards,
Shubham Khanna.

Attachment
On Thu, Mar 20, 2025 at 5:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Mar 20, 2025 at 4:53 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Thu, Mar 20, 2025 at 10:25 AM Shubham Khanna
> > <khannashubham1197@gmail.com> wrote:
> >
> >
> > +# run pg_createsubscriber with '--database' and '--all' without '--dry-run'
> > +# and verify the failure
> > +command_fails_like(
> > + [
> > + 'pg_createsubscriber',
> > + '--verbose',
> > + '--pgdata' => $node_s->data_dir,
> > + '--publisher-server' => $node_p->connstr($db1),
> > + '--socketdir' => $node_s->host,
> > + '--subscriber-port' => $node_s->port,
> > + '--database' => $db1,
> > + '--all',
> > + ],
> > + qr/--database cannot be used with --all/,
> > + 'fail if --database is used with --all');
> >
> >
> > Why does only this test not use --dry-run, but all other such tests
> > use --dry-run? Either they should all use it or not use it.
> >
>
> I think this patch is adding a lot of extra tests which I don't think
> are required. We should have one positive test in --dry-run mode
> similar to ('run pg_createsubscriber without --databases') and
> probably verify in the LOG that it has expanded commands for all
> databases. Also, for negative tests, one or two tests are sufficient
> instead of testing all possible combinations. I don't expect this new
> option to add a noticeable testing time.
>

As per your suggestions, I have updated the test cases in the v17-0001
patch. The primary tests now focus on a positive test in --dry-run
mode, similar to the one for run pg_createsubscriber without
--databases, along with verification in the log to ensure that
commands for all databases are expanded. Also, I have limited the
negative tests to just one or two representative cases, avoiding
unnecessary combinations.
I have moved the additional test cases to the v17-0002 patch to ensure
that the testing time for the new option remains negligible.

> *  <refsynopsisdiv>
> +  <cmdsynopsis>
> +   <command>pg_createsubscriber</command>
> +   <arg rep="repeat"><replaceable>option</replaceable></arg>
> +   <group choice="plain">
> +    <group choice="req">
> +     <arg choice="plain"><option>-a</option></arg>
> +     <arg choice="plain"><option>--all</option></arg>
> +    </group>
> +    <group choice="req">
> +     <arg choice="plain"><option>-D</option> </arg>
> +     <arg choice="plain"><option>--pgdata</option></arg>
> +    </group>
> +    <replaceable>datadir</replaceable>
> +    <group choice="req">
> +     <arg choice="plain"><option>-P</option></arg>
> +     <arg choice="plain"><option>--publisher-server</option></arg>
> +    </group>
> +    <replaceable>connstr</replaceable>
>
> Most of this is unrelated to this patch. I suggest making a top-up
> patch, we can commit it separately.
>

I have created a separate v17-0003 patch that includes the synopsis
for the '--all' option.

The attached patch at [1] contains the suggested changes.

[1] - https://www.postgresql.org/message-id/CAHv8RjLj0KxVHbxaPZHzudGS1igzDMccFE8LDh4LHNJR_2Aqug%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



On Thu, Mar 20, 2025 at 3:41 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 20 Mar 2025 at 10:25, Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
> >
> >
> > I have created two patches, v16-0001 and v16-0002, to address the
> > performance issue. I conducted performance testing, and here are the
> > results:
> > - The difference in execution time between HEAD and the v15 patch was 53.2%.
> > - After removing the suggested test cases, the difference reduced to
> > 36.43%, showing a significant improvement.
> >
>
> It is still taking quite a while, can we compare with the following
> too see how much extra time it takes:
> a) remove insert and select verification for the all database
> successful tests as all of these are logical replication verification
> which is extensively tested b) remove command_fails_like failure tests
> c) remove both of above and see.
>

As suggested by Amit at [1], I have reduced the number of test cases
in the v17-0001 patch to ensure they are concise and focused.
I conducted performance testing on the current HEAD and the latest
v17-0001 patch (available at [2]). Here are the observations based on
five runs:-
- The difference in execution time between HEAD and the v17-0001 patch
was 1.62 seconds.
- This corresponds to a 22.44% difference in performance.

[1] - https://www.postgresql.org/message-id/CAA4eK1KUDEO0t6i16_CcEpg33sgcgEddHcdVC_q8j4tVUb5FWw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAHv8RjLj0KxVHbxaPZHzudGS1igzDMccFE8LDh4LHNJR_2Aqug%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



On Fri, 21 Mar 2025 at 18:59, Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
>
> During the recent testing, I observed that the tests were failing when
> using wait_for_slot_catchup(). To address this, I reverted to using
> wait_for_subscription_sync(), which was employed previously and has
> proven to be more reliable in ensuring test stability.
> Please let me know if there are any additional adjustments you would
> suggest or if you would like me to investigate further into
> wait_for_slot_catchup().
>
> I have created a separate patch for the synopsis of '--all' option as
> suggested by Amit at [1]. The attached patch contains the suggested
> changes.

I believe you added the following because pattern matching is
difficult for the $db1 and $db2 variables having special names:
+# Create a new database on node_p
+$node_p->safe_psql(
+       "postgres", qq(
+       CREATE DATABASE db1;
+));

How about we change it to verify the count of occurrences instead for
this case like below:
# Verify that the required logical replication objects are created. The
# expected count 3 refers to postgres, $db1 and $db2 databases.
is(scalar(() = $stderr =~ /creating publication/g),
3, "verify publications are created for all databases");
is(scalar(() = $stderr =~ /creating the replication slot/g),
3, "verify replication slots are created for all databases");
is(scalar(() = $stderr =~ /creating subscription/g),
3, "verify subscriptions are created for all databases");

If you are ok, you can merge the changes attached.

Regards,
Vignesh

Attachment
On Sat, Mar 22, 2025 at 6:23 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 21 Mar 2025 at 18:59, Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
> >
> >
> > During the recent testing, I observed that the tests were failing when
> > using wait_for_slot_catchup(). To address this, I reverted to using
> > wait_for_subscription_sync(), which was employed previously and has
> > proven to be more reliable in ensuring test stability.
> > Please let me know if there are any additional adjustments you would
> > suggest or if you would like me to investigate further into
> > wait_for_slot_catchup().
> >
> > I have created a separate patch for the synopsis of '--all' option as
> > suggested by Amit at [1]. The attached patch contains the suggested
> > changes.
>
> I believe you added the following because pattern matching is
> difficult for the $db1 and $db2 variables having special names:
> +# Create a new database on node_p
> +$node_p->safe_psql(
> +       "postgres", qq(
> +       CREATE DATABASE db1;
> +));
>
> How about we change it to verify the count of occurrences instead for
> this case like below:
> # Verify that the required logical replication objects are created. The
> # expected count 3 refers to postgres, $db1 and $db2 databases.
> is(scalar(() = $stderr =~ /creating publication/g),
> 3, "verify publications are created for all databases");
> is(scalar(() = $stderr =~ /creating the replication slot/g),
> 3, "verify replication slots are created for all databases");
> is(scalar(() = $stderr =~ /creating subscription/g),
> 3, "verify subscriptions are created for all databases");
>
> If you are ok, you can merge the changes attached.
>

I agree that verifying the count of occurrences is a more
straightforward and effective approach, especially given the
challenges with pattern matching for $db1 and $db2 variables with
special names. This method simplifies validation and enhances
robustness by explicitly ensuring the expected number of logical
replication objects are created.

I have reviewed and merged the proposed changes into the patch. The
attached patches contain the suggested changes.

Thanks and regards,
Shubham Khanna.

Attachment
Dear Shubham,

> I have reviewed and merged the proposed changes into the patch. The
> attached patches contain the suggested changes.

Thanks for updating the patch! Few comments:

01.
```
+       /*
+        * Fetch all databases from the source (publisher) if --all is specified.
+        */
+       if (opt.all_dbs)
+               fetch_source_databases(&opt);
+
        if (opt.database_names.head == NULL)
```

I feel we can convert "if" -> "else if" for the opt.database_names.head case,
because fetch_source_databases() ensures databases are listed.

02.
```
+# Set up node U as standby linking to node
```

To confirm: why can we use "U" here?

03.
```
+$node_u->append_conf(
+       'postgresql.conf', qq[
+primary_conninfo = '$pconnstr dbname=postgres'
+]);
```
I think this part is not required. init_from_backup() with has_streaming sets the
primary_conninfo.

04.
```
+# Stop $node_s
+$node_s->stop;
```

The comment does not describe anything. I feel you can say "Stop node S because
it won't be used anymore", or just remove it.

05.
```
+# Drop one of the databases (e.g., $db2)
+$node_p->safe_psql('postgres', "DROP DATABASE \"$db2\"");
```

"e.g." means "for example", so it is not suitable to put here. Please explain why it must be removed.

06.
```
+# Get subscription names
+$result = $node_u->safe_psql(
+       'postgres', qq(
+       SELECT subname FROM pg_subscription WHERE subname ~ '^pg_createsubscriber_'
+));
+my @subnames1 = split("\n", $result);
+
+# Wait subscriber to catch up
+$node_u->wait_for_subscription_sync($node_p, $subnames1[0]);
+$node_u->wait_for_subscription_sync($node_p, $subnames1[1]);
```

wait_for_subscription_sync() is used to wait until the initial synchronization is
done, so not suitable here. wait_for_catchup() is more appropriate.

07.
Also, the similar part exists on the pre-existing part (wait_for_subscription_sync
was used there, but I feel this may not be correct). How about unifing them like
attached?

08.
```
+# Verify logical replication works for all databases
+# Insert rows on node P
+$node_p->safe_psql('postgres', "INSERT INTO tbl1 VALUES('first row')");
```

This is confusing because 'fourth row' is inserted to $db1 just after it. How
about the 'row in database postgres' or something?

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Attachment
On Mon, Mar 24, 2025 at 12:29 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Shubham,
>
> > I have reviewed and merged the proposed changes into the patch. The
> > attached patches contain the suggested changes.
>
> Thanks for updating the patch! Few comments:
>
> 01.
> ```
> +       /*
> +        * Fetch all databases from the source (publisher) if --all is specified.
> +        */
> +       if (opt.all_dbs)
> +               fetch_source_databases(&opt);
> +
>         if (opt.database_names.head == NULL)
> ```
>
> I feel we can convert "if" -> "else if" for the opt.database_names.head case,
> because fetch_source_databases() ensures databases are listed.
>

Fixed.

> 02.
> ```
> +# Set up node U as standby linking to node
> ```
>
> To confirm: why can we use "U" here?
>

Since node_s and node_t are already used in this test, I have used the
next letter, node_u. Additionally, comments have been added to improve
clarity.

> 03.
> ```
> +$node_u->append_conf(
> +       'postgresql.conf', qq[
> +primary_conninfo = '$pconnstr dbname=postgres'
> +]);
> ```
> I think this part is not required. init_from_backup() with has_streaming sets the
> primary_conninfo.
>

Fixed.

> 04.
> ```
> +# Stop $node_s
> +$node_s->stop;
> ```
>
> The comment does not describe anything. I feel you can say "Stop node S because
> it won't be used anymore", or just remove it.
>

Fixed.

> 05.
> ```
> +# Drop one of the databases (e.g., $db2)
> +$node_p->safe_psql('postgres', "DROP DATABASE \"$db2\"");
> ```
>
> "e.g." means "for example", so it is not suitable to put here. Please explain why it must be removed.
>

Fixed.

> 06.
> ```
> +# Get subscription names
> +$result = $node_u->safe_psql(
> +       'postgres', qq(
> +       SELECT subname FROM pg_subscription WHERE subname ~ '^pg_createsubscriber_'
> +));
> +my @subnames1 = split("\n", $result);
> +
> +# Wait subscriber to catch up
> +$node_u->wait_for_subscription_sync($node_p, $subnames1[0]);
> +$node_u->wait_for_subscription_sync($node_p, $subnames1[1]);
> ```
>
> wait_for_subscription_sync() is used to wait until the initial synchronization is
> done, so not suitable here. wait_for_catchup() is more appropriate.
>

Fixed.

> 07.
> Also, the similar part exists on the pre-existing part (wait_for_subscription_sync
> was used there, but I feel this may not be correct). How about unifing them like
> attached?
>

Fixed.

> 08.
> ```
> +# Verify logical replication works for all databases
> +# Insert rows on node P
> +$node_p->safe_psql('postgres', "INSERT INTO tbl1 VALUES('first row')");
> ```
>
> This is confusing because 'fourth row' is inserted to $db1 just after it. How
> about the 'row in database postgres' or something?
>

Fixed.

The attached patches contain the suggested changes.

Thanks and regards,
Shubham Khanna.

Attachment
On Mon, 24 Mar 2025 at 15:56, Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
> On Mon, Mar 24, 2025 at 12:29 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Dear Shubham,
> >
> > > I have reviewed and merged the proposed changes into the patch. The
> > > attached patches contain the suggested changes.
> >
> > Thanks for updating the patch! Few comments:
> >
> > 01.
> > ```
> > +       /*
> > +        * Fetch all databases from the source (publisher) if --all is specified.
> > +        */
> > +       if (opt.all_dbs)
> > +               fetch_source_databases(&opt);
> > +
> >         if (opt.database_names.head == NULL)
> > ```
> >
> > I feel we can convert "if" -> "else if" for the opt.database_names.head case,
> > because fetch_source_databases() ensures databases are listed.
> >
>
> Fixed.
>
> > 02.
> > ```
> > +# Set up node U as standby linking to node
> > ```
> >
> > To confirm: why can we use "U" here?
> >
>
> Since node_s and node_t are already used in this test, I have used the
> next letter, node_u. Additionally, comments have been added to improve
> clarity.
>
> > 03.
> > ```
> > +$node_u->append_conf(
> > +       'postgresql.conf', qq[
> > +primary_conninfo = '$pconnstr dbname=postgres'
> > +]);
> > ```
> > I think this part is not required. init_from_backup() with has_streaming sets the
> > primary_conninfo.
> >
>
> Fixed.
>
> > 04.
> > ```
> > +# Stop $node_s
> > +$node_s->stop;
> > ```
> >
> > The comment does not describe anything. I feel you can say "Stop node S because
> > it won't be used anymore", or just remove it.
> >
>
> Fixed.
>
> > 05.
> > ```
> > +# Drop one of the databases (e.g., $db2)
> > +$node_p->safe_psql('postgres', "DROP DATABASE \"$db2\"");
> > ```
> >
> > "e.g." means "for example", so it is not suitable to put here. Please explain why it must be removed.
> >
>
> Fixed.
>
> > 06.
> > ```
> > +# Get subscription names
> > +$result = $node_u->safe_psql(
> > +       'postgres', qq(
> > +       SELECT subname FROM pg_subscription WHERE subname ~ '^pg_createsubscriber_'
> > +));
> > +my @subnames1 = split("\n", $result);
> > +
> > +# Wait subscriber to catch up
> > +$node_u->wait_for_subscription_sync($node_p, $subnames1[0]);
> > +$node_u->wait_for_subscription_sync($node_p, $subnames1[1]);
> > ```
> >
> > wait_for_subscription_sync() is used to wait until the initial synchronization is
> > done, so not suitable here. wait_for_catchup() is more appropriate.
> >
>
> Fixed.
>
> > 07.
> > Also, the similar part exists on the pre-existing part (wait_for_subscription_sync
> > was used there, but I feel this may not be correct). How about unifing them like
> > attached?
> >
>
> Fixed.
>
> > 08.
> > ```
> > +# Verify logical replication works for all databases
> > +# Insert rows on node P
> > +$node_p->safe_psql('postgres', "INSERT INTO tbl1 VALUES('first row')");
> > ```
> >
> > This is confusing because 'fourth row' is inserted to $db1 just after it. How
> > about the 'row in database postgres' or something?
> >
>
> Fixed.
>
> The attached patches contain the suggested changes.
>

I have reviewed the v19-0001 patch I have a comment:

In pg_createsubscriber.sgml, this line seems little odd:

+       obtained from <option>-P</option> option. If the database name is not
+       specified in either the <option>-d</option> option, or the
+       <option>-P</option> option, and <option>-a</option>
+       an error will be reported.

Should we change the sentence to something like:

+       obtained from <option>-P</option> option. If the database name is not
+       specified in either the <option>-d</option> option, or the
+       <option>-P</option> option, and <option>-a</option> option is not
+       specified, an error will be reported.

Thanks,
Shlok Kyal



On Fri, Mar 21, 2025 at 6:59 PM Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
> On Thu, Mar 20, 2025 at 4:53 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Thu, Mar 20, 2025 at 10:25 AM Shubham Khanna
> > <khannashubham1197@gmail.com> wrote:
> >
> > >
> > > The attached patches contain the suggested changes.
> > >
> >
> > I have started reviewing the patches again. Here are some review comments
> >
> >     <variablelist>
> > +    <varlistentry>
> > +     <term><option>-a</option></term>
> > +     <term><option>--all</option></term>
> > +     <listitem>
> > +      <para>
> > +       For all source server non-template databases create subscriptions for
> > +       databases with the same names on the target server.
> >
> > In this construct "all" goes with "source server" but it should go
> > with "non-template database". How about, "For every non-template
> > database on the source server, create one subscription on the target
> > server in the database with the same name."?
> >
>
> Fixed.
>
> > +       Subscription names, publication names, and replication slot names are
> > +       automatically generated. This option cannot be used together with
> > +       <option>--database</option>, <option>--publication</option>,
> > +       <option>--replication-slot</option> or <option>--subscription</option>.
> >
> > ... used along with ...
> >
> > also comma before or .
> >
> > We should also mention that generated names of replication slots,
> > publications and subscriptions are used when using this option.
> >
> > +      </para>
> > +     </listitem>
> > +    </varlistentry>
> > +
> >      <varlistentry>
> >       <term><option>-d <replaceable
> > class="parameter">dbname</replaceable></option></term>
> >       <term><option>--database=<replaceable
> > class="parameter">dbname</replaceable></option></term>
>
> Fixed.
>
> > @@ -94,9 +130,10 @@ PostgreSQL documentation
> >        <para>
> >         The name of the database in which to create a subscription.  Multiple
> >         databases can be selected by writing multiple <option>-d</option>
> > -       switches. If <option>-d</option> option is not provided, the database
> > -       name will be obtained from <option>-P</option> option. If the database
> > -       name is not specified in either the <option>-d</option> option or
> > +       switches. This option cannot be used together with
> > <option>--all</option>.
> > +       If <option>-d</option> option is not provided, the database name will be
> > +       obtained from <option>-P</option> option. If the database name is not
> > +       specified in either the <option>-d</option> option or
> >         <option>-P</option> option, an error will be reported.
> >
> > We have to mention -a as well here. Something like "If the database
> > name is not specified in either the <option>-d</option> option or
> > <option>-P</option> option, and <option>-a</option> is not provided,
> > an error will be reported."
> >
>
> Fixed.
>
> > +# run pg_createsubscriber with '--database' and '--all' without '--dry-run'
> > +# and verify the failure
> > +command_fails_like(
> > + [
> > + 'pg_createsubscriber',
> > + '--verbose',
> > + '--pgdata' => $node_s->data_dir,
> > + '--publisher-server' => $node_p->connstr($db1),
> > + '--socketdir' => $node_s->host,
> > + '--subscriber-port' => $node_s->port,
> > + '--database' => $db1,
> > + '--all',
> > + ],
> > + qr/--database cannot be used with --all/,
> > + 'fail if --database is used with --all');
> >
> >
> > Why does only this test not use --dry-run, but all other such tests
> > use --dry-run? Either they should all use it or not use it.
> >
> > +
> > +# run pg_createsubscriber with '--publication' and '--all' and verify
> > +# the failure
> > +command_fails_like(
> > + [
> > + 'pg_createsubscriber',
> > + '--verbose',
> > + '--dry-run',
> > + '--pgdata' => $node_s->data_dir,
> > + '--publisher-server' => $node_p->connstr($db1),
> > + '--socketdir' => $node_s->host,
> > + '--subscriber-port' => $node_s->port,
> > + '--all',
> > + '--publication' => 'pub1',
> > + ],
> > + qr/--publication cannot be used with --all/,
> > + 'fail if --publication is used with --all');
> > +
> > +# run pg_createsubscriber with '--replication-slot' and '--all' and
> > +# verify the failure
> > +command_fails_like(
> > + [
> > + 'pg_createsubscriber',
> > + '--verbose',
> > + '--dry-run',
> > + '--pgdata' => $node_s->data_dir,
> > + '--publisher-server' => $node_p->connstr($db1),
> > + '--socketdir' => $node_s->host,
> > + '--subscriber-port' => $node_s->port,
> > + '--replication-slot' => 'replslot1',
> > + '--all',
> > + ],
> > + qr/--replication-slot cannot be used with --all/,
> > + 'fail if --replication-slot is used with --all');
> > +
> > +# run pg_createsubscriber with '--subscription' and '--all' and
> > +# verify the failure
> > +command_fails_like(
> > + [
> > + 'pg_createsubscriber',
> > + '--verbose',
> > + '--dry-run',
> > + '--pgdata' => $node_s->data_dir,
> > + '--publisher-server' => $node_p->connstr($db1),
> > + '--socketdir' => $node_s->host,
> > + '--subscriber-port' => $node_s->port,
> > + '--all',
> > + '--subscription' => 'sub1',
> > + ],
> > + qr/--subscription cannot be used with --all/,
> > + 'fail if --subscription is used with --all');
> > +
> >
> > Just to cover all combinations we need to test both scenarios. where
> > --all comes before an incompatible option and vice versa.
> >
> >  # Run pg_createsubscriber on node S.  --verbose is used twice
> >  # to show more information.
> >  # In passing, also test the --enable-two-phase option
> > @@ -459,10 +526,93 @@ my $sysid_s = $node_s->safe_psql('postgres',
> >   'SELECT system_identifier FROM pg_control_system()');
> >  ok($sysid_p != $sysid_s, 'system identifier was changed');
> >
> > +# On node P create test tables
> > +$node_p->safe_psql('postgres', 'CREATE TABLE tbl1 (a text)');
> > +$node_p->safe_psql($db1, 'CREATE TABLE tbl2 (a text)');
> > +$node_p->safe_psql($db1, "INSERT INTO tbl2 VALUES('first row')");
> > +$node_p->safe_psql($db2, 'CREATE TABLE tbl3 (a text)');
> >
> > $db1 and $db2 already have tables. We can avoid creating new ones here.
> >
>
> Fixed.
>
> > +
> > +# Wait subscriber to catch up
> > +$node_u->wait_for_subscription_sync($node_p, $subnames1[0]);
> > +$node_u->wait_for_subscription_sync($node_p, $subnames1[1]);
> > +$node_u->wait_for_subscription_sync($node_p, $subnames1[2]);
> > +
> >
> > This function just makes sure that initial sync is done. But it
> > doesn't wait for replication to catch up with current state of the
> > publisher. It's the latter which would make sure that the last INSERTS
> > are visible. We should be using wait_for_slot_catchup() on the
> > publisher.
> >
> > +# Check result in database 'postgres' of node U
> > +$result = $node_u->safe_psql('postgres', 'SELECT * FROM tbl1');
> > +is($result, qq(first row), "logical replication works in database postgres");
> > +
> > +# Check result in database $db1 of node U
> > +$result = $node_u->safe_psql($db1, 'SELECT * FROM tbl2');
> > +is( $result, qq(first row
> > +second row),
> > + "logical replication works in database $db1");
> > +
> > +# Check result in database $db2 of node U
> > +$result = $node_u->safe_psql($db2, 'SELECT * FROM tbl3');
> > +is($result, qq(first row), "logical replication works in database $db2");
> > +
> >
> > These tests may not always pass if we use wait_for_slot_catchup above.
> >
>
> During the recent testing, I observed that the tests were failing when
> using wait_for_slot_catchup(). To address this, I reverted to using
> wait_for_subscription_sync(), which was employed previously and has
> proven to be more reliable in ensuring test stability.
> Please let me know if there are any additional adjustments you would
> suggest or if you would like me to investigate further into
> wait_for_slot_catchup().

wait_for_subscription_sync() calls wait_for_catchup() which actually
waits for a given WAL sender to pass given LSN. So it's working fine.
But wait_for_subscription_sync() is used to make sure that the initial
data sync is completed (at least in the cases that I looked at) not
that the regular replication has caught up. And it's doing more work
that required. I see other tests using wait_for_catchup() for this
purpose. Should we use that with appropriate arguments?

--
Best Wishes,
Ashutosh Bapat



On Mon, Mar 24, 2025 at 5:41 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Mon, 24 Mar 2025 at 15:56, Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
> >
> > On Mon, Mar 24, 2025 at 12:29 PM Hayato Kuroda (Fujitsu)
> > <kuroda.hayato@fujitsu.com> wrote:
> > >
> > > Dear Shubham,
> > >
> > > > I have reviewed and merged the proposed changes into the patch. The
> > > > attached patches contain the suggested changes.
> > >
> > > Thanks for updating the patch! Few comments:
> > >
> > > 01.
> > > ```
> > > +       /*
> > > +        * Fetch all databases from the source (publisher) if --all is specified.
> > > +        */
> > > +       if (opt.all_dbs)
> > > +               fetch_source_databases(&opt);
> > > +
> > >         if (opt.database_names.head == NULL)
> > > ```
> > >
> > > I feel we can convert "if" -> "else if" for the opt.database_names.head case,
> > > because fetch_source_databases() ensures databases are listed.
> > >
> >
> > Fixed.
> >
> > > 02.
> > > ```
> > > +# Set up node U as standby linking to node
> > > ```
> > >
> > > To confirm: why can we use "U" here?
> > >
> >
> > Since node_s and node_t are already used in this test, I have used the
> > next letter, node_u. Additionally, comments have been added to improve
> > clarity.
> >
> > > 03.
> > > ```
> > > +$node_u->append_conf(
> > > +       'postgresql.conf', qq[
> > > +primary_conninfo = '$pconnstr dbname=postgres'
> > > +]);
> > > ```
> > > I think this part is not required. init_from_backup() with has_streaming sets the
> > > primary_conninfo.
> > >
> >
> > Fixed.
> >
> > > 04.
> > > ```
> > > +# Stop $node_s
> > > +$node_s->stop;
> > > ```
> > >
> > > The comment does not describe anything. I feel you can say "Stop node S because
> > > it won't be used anymore", or just remove it.
> > >
> >
> > Fixed.
> >
> > > 05.
> > > ```
> > > +# Drop one of the databases (e.g., $db2)
> > > +$node_p->safe_psql('postgres', "DROP DATABASE \"$db2\"");
> > > ```
> > >
> > > "e.g." means "for example", so it is not suitable to put here. Please explain why it must be removed.
> > >
> >
> > Fixed.
> >
> > > 06.
> > > ```
> > > +# Get subscription names
> > > +$result = $node_u->safe_psql(
> > > +       'postgres', qq(
> > > +       SELECT subname FROM pg_subscription WHERE subname ~ '^pg_createsubscriber_'
> > > +));
> > > +my @subnames1 = split("\n", $result);
> > > +
> > > +# Wait subscriber to catch up
> > > +$node_u->wait_for_subscription_sync($node_p, $subnames1[0]);
> > > +$node_u->wait_for_subscription_sync($node_p, $subnames1[1]);
> > > ```
> > >
> > > wait_for_subscription_sync() is used to wait until the initial synchronization is
> > > done, so not suitable here. wait_for_catchup() is more appropriate.
> > >
> >
> > Fixed.
> >
> > > 07.
> > > Also, the similar part exists on the pre-existing part (wait_for_subscription_sync
> > > was used there, but I feel this may not be correct). How about unifing them like
> > > attached?
> > >
> >
> > Fixed.
> >
> > > 08.
> > > ```
> > > +# Verify logical replication works for all databases
> > > +# Insert rows on node P
> > > +$node_p->safe_psql('postgres', "INSERT INTO tbl1 VALUES('first row')");
> > > ```
> > >
> > > This is confusing because 'fourth row' is inserted to $db1 just after it. How
> > > about the 'row in database postgres' or something?
> > >
> >
> > Fixed.
> >
> > The attached patches contain the suggested changes.
> >
>
> I have reviewed the v19-0001 patch I have a comment:
>
> In pg_createsubscriber.sgml, this line seems little odd:
>
> +       obtained from <option>-P</option> option. If the database name is not
> +       specified in either the <option>-d</option> option, or the
> +       <option>-P</option> option, and <option>-a</option>
> +       an error will be reported.
>
> Should we change the sentence to something like:
>
> +       obtained from <option>-P</option> option. If the database name is not
> +       specified in either the <option>-d</option> option, or the
> +       <option>-P</option> option, and <option>-a</option> option is not
> +       specified, an error will be reported.
>

Fixed.

The attached patches contain the suggested changes.

Thanks and regards,
Shubham Khanna.

Attachment
On Mon, Mar 24, 2025 at 6:08 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Fri, Mar 21, 2025 at 6:59 PM Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
> >
> > On Thu, Mar 20, 2025 at 4:53 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > > On Thu, Mar 20, 2025 at 10:25 AM Shubham Khanna
> > > <khannashubham1197@gmail.com> wrote:
> > >
> > > >
> > > > The attached patches contain the suggested changes.
> > > >
> > >
> > > I have started reviewing the patches again. Here are some review comments
> > >
> > >     <variablelist>
> > > +    <varlistentry>
> > > +     <term><option>-a</option></term>
> > > +     <term><option>--all</option></term>
> > > +     <listitem>
> > > +      <para>
> > > +       For all source server non-template databases create subscriptions for
> > > +       databases with the same names on the target server.
> > >
> > > In this construct "all" goes with "source server" but it should go
> > > with "non-template database". How about, "For every non-template
> > > database on the source server, create one subscription on the target
> > > server in the database with the same name."?
> > >
> >
> > Fixed.
> >
> > > +       Subscription names, publication names, and replication slot names are
> > > +       automatically generated. This option cannot be used together with
> > > +       <option>--database</option>, <option>--publication</option>,
> > > +       <option>--replication-slot</option> or <option>--subscription</option>.
> > >
> > > ... used along with ...
> > >
> > > also comma before or .
> > >
> > > We should also mention that generated names of replication slots,
> > > publications and subscriptions are used when using this option.
> > >
> > > +      </para>
> > > +     </listitem>
> > > +    </varlistentry>
> > > +
> > >      <varlistentry>
> > >       <term><option>-d <replaceable
> > > class="parameter">dbname</replaceable></option></term>
> > >       <term><option>--database=<replaceable
> > > class="parameter">dbname</replaceable></option></term>
> >
> > Fixed.
> >
> > > @@ -94,9 +130,10 @@ PostgreSQL documentation
> > >        <para>
> > >         The name of the database in which to create a subscription.  Multiple
> > >         databases can be selected by writing multiple <option>-d</option>
> > > -       switches. If <option>-d</option> option is not provided, the database
> > > -       name will be obtained from <option>-P</option> option. If the database
> > > -       name is not specified in either the <option>-d</option> option or
> > > +       switches. This option cannot be used together with
> > > <option>--all</option>.
> > > +       If <option>-d</option> option is not provided, the database name will be
> > > +       obtained from <option>-P</option> option. If the database name is not
> > > +       specified in either the <option>-d</option> option or
> > >         <option>-P</option> option, an error will be reported.
> > >
> > > We have to mention -a as well here. Something like "If the database
> > > name is not specified in either the <option>-d</option> option or
> > > <option>-P</option> option, and <option>-a</option> is not provided,
> > > an error will be reported."
> > >
> >
> > Fixed.
> >
> > > +# run pg_createsubscriber with '--database' and '--all' without '--dry-run'
> > > +# and verify the failure
> > > +command_fails_like(
> > > + [
> > > + 'pg_createsubscriber',
> > > + '--verbose',
> > > + '--pgdata' => $node_s->data_dir,
> > > + '--publisher-server' => $node_p->connstr($db1),
> > > + '--socketdir' => $node_s->host,
> > > + '--subscriber-port' => $node_s->port,
> > > + '--database' => $db1,
> > > + '--all',
> > > + ],
> > > + qr/--database cannot be used with --all/,
> > > + 'fail if --database is used with --all');
> > >
> > >
> > > Why does only this test not use --dry-run, but all other such tests
> > > use --dry-run? Either they should all use it or not use it.
> > >
> > > +
> > > +# run pg_createsubscriber with '--publication' and '--all' and verify
> > > +# the failure
> > > +command_fails_like(
> > > + [
> > > + 'pg_createsubscriber',
> > > + '--verbose',
> > > + '--dry-run',
> > > + '--pgdata' => $node_s->data_dir,
> > > + '--publisher-server' => $node_p->connstr($db1),
> > > + '--socketdir' => $node_s->host,
> > > + '--subscriber-port' => $node_s->port,
> > > + '--all',
> > > + '--publication' => 'pub1',
> > > + ],
> > > + qr/--publication cannot be used with --all/,
> > > + 'fail if --publication is used with --all');
> > > +
> > > +# run pg_createsubscriber with '--replication-slot' and '--all' and
> > > +# verify the failure
> > > +command_fails_like(
> > > + [
> > > + 'pg_createsubscriber',
> > > + '--verbose',
> > > + '--dry-run',
> > > + '--pgdata' => $node_s->data_dir,
> > > + '--publisher-server' => $node_p->connstr($db1),
> > > + '--socketdir' => $node_s->host,
> > > + '--subscriber-port' => $node_s->port,
> > > + '--replication-slot' => 'replslot1',
> > > + '--all',
> > > + ],
> > > + qr/--replication-slot cannot be used with --all/,
> > > + 'fail if --replication-slot is used with --all');
> > > +
> > > +# run pg_createsubscriber with '--subscription' and '--all' and
> > > +# verify the failure
> > > +command_fails_like(
> > > + [
> > > + 'pg_createsubscriber',
> > > + '--verbose',
> > > + '--dry-run',
> > > + '--pgdata' => $node_s->data_dir,
> > > + '--publisher-server' => $node_p->connstr($db1),
> > > + '--socketdir' => $node_s->host,
> > > + '--subscriber-port' => $node_s->port,
> > > + '--all',
> > > + '--subscription' => 'sub1',
> > > + ],
> > > + qr/--subscription cannot be used with --all/,
> > > + 'fail if --subscription is used with --all');
> > > +
> > >
> > > Just to cover all combinations we need to test both scenarios. where
> > > --all comes before an incompatible option and vice versa.
> > >
> > >  # Run pg_createsubscriber on node S.  --verbose is used twice
> > >  # to show more information.
> > >  # In passing, also test the --enable-two-phase option
> > > @@ -459,10 +526,93 @@ my $sysid_s = $node_s->safe_psql('postgres',
> > >   'SELECT system_identifier FROM pg_control_system()');
> > >  ok($sysid_p != $sysid_s, 'system identifier was changed');
> > >
> > > +# On node P create test tables
> > > +$node_p->safe_psql('postgres', 'CREATE TABLE tbl1 (a text)');
> > > +$node_p->safe_psql($db1, 'CREATE TABLE tbl2 (a text)');
> > > +$node_p->safe_psql($db1, "INSERT INTO tbl2 VALUES('first row')");
> > > +$node_p->safe_psql($db2, 'CREATE TABLE tbl3 (a text)');
> > >
> > > $db1 and $db2 already have tables. We can avoid creating new ones here.
> > >
> >
> > Fixed.
> >
> > > +
> > > +# Wait subscriber to catch up
> > > +$node_u->wait_for_subscription_sync($node_p, $subnames1[0]);
> > > +$node_u->wait_for_subscription_sync($node_p, $subnames1[1]);
> > > +$node_u->wait_for_subscription_sync($node_p, $subnames1[2]);
> > > +
> > >
> > > This function just makes sure that initial sync is done. But it
> > > doesn't wait for replication to catch up with current state of the
> > > publisher. It's the latter which would make sure that the last INSERTS
> > > are visible. We should be using wait_for_slot_catchup() on the
> > > publisher.
> > >
> > > +# Check result in database 'postgres' of node U
> > > +$result = $node_u->safe_psql('postgres', 'SELECT * FROM tbl1');
> > > +is($result, qq(first row), "logical replication works in database postgres");
> > > +
> > > +# Check result in database $db1 of node U
> > > +$result = $node_u->safe_psql($db1, 'SELECT * FROM tbl2');
> > > +is( $result, qq(first row
> > > +second row),
> > > + "logical replication works in database $db1");
> > > +
> > > +# Check result in database $db2 of node U
> > > +$result = $node_u->safe_psql($db2, 'SELECT * FROM tbl3');
> > > +is($result, qq(first row), "logical replication works in database $db2");
> > > +
> > >
> > > These tests may not always pass if we use wait_for_slot_catchup above.
> > >
> >
> > During the recent testing, I observed that the tests were failing when
> > using wait_for_slot_catchup(). To address this, I reverted to using
> > wait_for_subscription_sync(), which was employed previously and has
> > proven to be more reliable in ensuring test stability.
> > Please let me know if there are any additional adjustments you would
> > suggest or if you would like me to investigate further into
> > wait_for_slot_catchup().
>
> wait_for_subscription_sync() calls wait_for_catchup() which actually
> waits for a given WAL sender to pass given LSN. So it's working fine.
> But wait_for_subscription_sync() is used to make sure that the initial
> data sync is completed (at least in the cases that I looked at) not
> that the regular replication has caught up. And it's doing more work
> that required. I see other tests using wait_for_catchup() for this
> purpose. Should we use that with appropriate arguments?
>
> --

The v20-0003 patch attached at [1] contains the suggested changes.

[1] - https://www.postgresql.org/message-id/CAHv8RjJ4AQCYRanC-9rX9xjZnzH0LWgFyS6Ve-1-gHNG-i5%3DAQ%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



Dear Shubham,

> The attached patches contain the suggested changes.

Thanks for updating the patch. I reviewed only 0001 because they would be committed separately.
Few comments:

01.
```
+       For every non-template database on the source server, create one
+       subscription on the target server in the database with the same name.
```

It is quite confusing for me; We do not have to describe users that this command
checks databases of the source server. How about something like:
"Create one subscription per all non-template databases on the target server."

02.
```
+# run pg_createsubscriber with '--database' and '--all' without '--dry-run'
+# and verify the failure
+command_fails_like(
+    [
+        'pg_createsubscriber',
+        '--verbose',
+        '--pgdata' => $node_s->data_dir,
+        '--publisher-server' => $node_p->connstr($db1),
+        '--socketdir' => $node_s->host,
+        '--subscriber-port' => $node_s->port,
+        '--database' => $db1,
+        '--all',
+    ],
+    qr/--database cannot be used with --all/,
+    'fail if --database is used with --all');
+
+# run pg_createsubscriber with '--publication' and '--all' and verify
+# the failure
+command_fails_like(
+    [
+        'pg_createsubscriber',
+        '--verbose',
+        '--dry-run',
+        '--pgdata' => $node_s->data_dir,
+        '--publisher-server' => $node_p->connstr($db1),
+        '--socketdir' => $node_s->host,
+        '--subscriber-port' => $node_s->port,
+        '--all',
+        '--publication' => 'pub1',
+    ],
+    qr/--publication cannot be used with --all/,
+    'fail if --publication is used with --all');
```

You seemed to move most of validation checks to 0002. Can you tell me a reason
why they are remained?

03.
```
+# Verify that the required logical replication objects are created. The
+# expected count 3 refers to postgres, $db1 and $db2 databases.
```

Hmm, but since we did a dry-run, any objects are not created, right?

Best regards,
Hayato Kuroda
FUJITSU LIMITED


On Tue, Mar 25, 2025 at 3:22 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Shubham,
>
> > The attached patches contain the suggested changes.
>
> Thanks for updating the patch. I reviewed only 0001 because they would be committed separately.
> Few comments:
>
> 01.
> ```
> +       For every non-template database on the source server, create one
> +       subscription on the target server in the database with the same name.
> ```
>
> It is quite confusing for me; We do not have to describe users that this command
> checks databases of the source server. How about something like:
> "Create one subscription per all non-template databases on the target server."
>

Fixed.

> 02.
> ```
> +# run pg_createsubscriber with '--database' and '--all' without '--dry-run'
> +# and verify the failure
> +command_fails_like(
> +       [
> +               'pg_createsubscriber',
> +               '--verbose',
> +               '--pgdata' => $node_s->data_dir,
> +               '--publisher-server' => $node_p->connstr($db1),
> +               '--socketdir' => $node_s->host,
> +               '--subscriber-port' => $node_s->port,
> +               '--database' => $db1,
> +               '--all',
> +       ],
> +       qr/--database cannot be used with --all/,
> +       'fail if --database is used with --all');
> +
> +# run pg_createsubscriber with '--publication' and '--all' and verify
> +# the failure
> +command_fails_like(
> +       [
> +               'pg_createsubscriber',
> +               '--verbose',
> +               '--dry-run',
> +               '--pgdata' => $node_s->data_dir,
> +               '--publisher-server' => $node_p->connstr($db1),
> +               '--socketdir' => $node_s->host,
> +               '--subscriber-port' => $node_s->port,
> +               '--all',
> +               '--publication' => 'pub1',
> +       ],
> +       qr/--publication cannot be used with --all/,
> +       'fail if --publication is used with --all');
> ```
>
> You seemed to move most of validation checks to 0002. Can you tell me a reason
> why they are remained?
>

As per Amit's suggestions at [1], I moved most of the validation test
cases to the separate 0003 patch. However, these particular checks
were retained in the main patch because they ensure the fundamental
validations for incompatible options (--database and --publication
with --all) are covered early on.

> 03.
> ```
> +# Verify that the required logical replication objects are created. The
> +# expected count 3 refers to postgres, $db1 and $db2 databases.
> ```
>
> Hmm, but since we did a dry-run, any objects are not created, right?
>

Fixed.

The attached patches contain the suggested changes.
[1] - https://www.postgresql.org/message-id/CAA4eK1KUDEO0t6i16_CcEpg33sgcgEddHcdVC_q8j4tVUb5FWw%40mail.gmail.com

Thanks and regards,
Shubham Khanna.



On Tue, Mar 25, 2025 at 4:28 PM Shubham Khanna
<khannashubham1197@gmail.com> wrote:

>
> The attached patches contain the suggested changes.
> [1] - https://www.postgresql.org/message-id/CAA4eK1KUDEO0t6i16_CcEpg33sgcgEddHcdVC_q8j4tVUb5FWw%40mail.gmail.com

Forgot to attach patches?

--
Best Wishes,
Ashutosh Bapat



On Tue, Mar 25, 2025 at 4:31 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Tue, Mar 25, 2025 at 4:28 PM Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
>
> >
> > The attached patches contain the suggested changes.
> > [1] - https://www.postgresql.org/message-id/CAA4eK1KUDEO0t6i16_CcEpg33sgcgEddHcdVC_q8j4tVUb5FWw%40mail.gmail.com
>
> Forgot to attach patches?
>
> --

Apologies for the oversight. I have attached the patches now. Please
find them included here.

Thanks and regards,
Shubham Khanna.

Attachment
On Tue, Mar 25, 2025 at 3:22 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Shubham,
>
> > The attached patches contain the suggested changes.
>
> Thanks for updating the patch. I reviewed only 0001 because they would be committed separately.
> Few comments:
>
> 01.
> ```
> +       For every non-template database on the source server, create one
> +       subscription on the target server in the database with the same name.
> ```
>
> It is quite confusing for me; We do not have to describe users that this command
> checks databases of the source server. How about something like:
> "Create one subscription per all non-template databases on the target server."
>

The new description doesn't mention the link between the source and
target database. And I think it's incorrect. Not all databases on the
target server will receive a subscription. Only those which have the
same name as a database on the source server. Am I correct? I prefer
the previous wording, even if it's a bit complex, over a simpler but
incorrect description.

--
Best Wishes,
Ashutosh Bapat



Comments on 0001
@@ -87,6 +87,21 @@ PostgreSQL documentation
command-line arguments:
<variablelist>
+ <varlistentry>
+ <term><option>-a</option></term>
+ <term><option>--all</option></term>
+ <listitem>
+ <para>
+ Create one subscription per all non-template databases on the target
+ server. Automatically generated names for subscriptions, publications,

Already provided comment on this part.

+/*
+ * If --all is specified, fetch a list of all user-created databases from the
+ * source server.
+ */

The prologue of this function shouldn't mention --all and --database
option as it doesn't deal with those option itself. I would just say
"Fetch a list of all not-template databases from the source server.

/* Any non-option arguments? */
if (optind < argc)
{
@@ -2202,14 +2280,20 @@ main(int argc, char **argv)
pg_log_info("validating subscriber connection string");
sub_base_conninfo = get_sub_conninfo(&opt);
- if (opt.database_names.head == NULL)
+ /*
+ * Fetch all databases from the source (publisher) if --all is specified.

Add the relevant parts from fetch_source_databases() prologue here.
That is, just add, "This is treated as if the user specified multiple
--database options, one for each source database.".

+ */
+ if (opt.all_dbs)
+ fetch_source_databases(&opt);
+

extra line, not needed

This looks mostly ready except the test changes. I believe when
committing, we are going to squash all three into a single commit. Is
that correct?

On Tue, Mar 25, 2025 at 4:37 PM Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
> On Tue, Mar 25, 2025 at 4:31 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Tue, Mar 25, 2025 at 4:28 PM Shubham Khanna
> > <khannashubham1197@gmail.com> wrote:
> >
> > >
> > > The attached patches contain the suggested changes.
> > > [1] - https://www.postgresql.org/message-id/CAA4eK1KUDEO0t6i16_CcEpg33sgcgEddHcdVC_q8j4tVUb5FWw%40mail.gmail.com
> >
> > Forgot to attach patches?
> >
> > --
>
> Apologies for the oversight. I have attached the patches now. Please
> find them included here.
>
> Thanks and regards,
> Shubham Khanna.



--
Best Wishes,
Ashutosh Bapat



On Tue, Mar 25, 2025 at 5:08 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> This looks mostly ready except the test changes. I believe when
> committing, we are going to squash all three into a single commit. Is
> that correct?
>

I would not prefer to commit 0003 as it is primarily because of test
+# run pg_createsubscriber with '--all' option without '--dry-run'. I
am not so sure whether it is really worth adding test cycles for this
option except for dry-run mode tests but we can discuss after
committing the core patch.

--
With Regards,
Amit Kapila.



On Tue, Mar 25, 2025 at 5:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Mar 25, 2025 at 5:08 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > This looks mostly ready except the test changes. I believe when
> > committing, we are going to squash all three into a single commit. Is
> > that correct?
> >
>
> I would not prefer to commit 0003 as it is primarily because of test
> +# run pg_createsubscriber with '--all' option without '--dry-run'. I
> am not so sure whether it is really worth adding test cycles for this
> option except for dry-run mode tests but we can discuss after
> committing the core patch.

I am worried that without that test, we won't catch bugs in creating
slots, publications and subscriptions and thus causing problems in a
setup with --all. But as long as we address that concern I am ok to
defer it after committing the core patch.

--
Best Wishes,
Ashutosh Bapat



On Thu, Mar 20, 2025 at 5:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> *  <refsynopsisdiv>
> +  <cmdsynopsis>
> +   <command>pg_createsubscriber</command>
> +   <arg rep="repeat"><replaceable>option</replaceable></arg>
> +   <group choice="plain">
> +    <group choice="req">
> +     <arg choice="plain"><option>-a</option></arg>
> +     <arg choice="plain"><option>--all</option></arg>
> +    </group>
> +    <group choice="req">
> +     <arg choice="plain"><option>-D</option> </arg>
> +     <arg choice="plain"><option>--pgdata</option></arg>
> +    </group>
> +    <replaceable>datadir</replaceable>
> +    <group choice="req">
> +     <arg choice="plain"><option>-P</option></arg>
> +     <arg choice="plain"><option>--publisher-server</option></arg>
> +    </group>
> +    <replaceable>connstr</replaceable>
>
> Most of this is unrelated to this patch. I suggest making a top-up
> patch, we can commit it separately.

Isn't this documenting -a/--all option in synopsis. Why do you think
it's unrelated?

--
Best Wishes,
Ashutosh Bapat



Dear Ashutosh,

> The new description doesn't mention the link between the source and
> target database.

Yes, I intentionally removed.

> And I think it's incorrect. Not all databases on the
> target server will receive a subscription. Only those which have the
> same name as a database on the source server. Am I correct?

I assumed the point that the target is the streaming standby and all changes
between them would be eventually resolved. IIUC any differences between instances
would be resolved or the command would raise an ERROR. Thus, users do not have to
consider the differences of nodes so that we can simplify the description.

Analysis
=======
If some databases are missing on the source, it would not be listed as target.
pg_createsubscriber can wait until all changes are replicated before the promotion
so these databases would be removed even on target. OK, it could work.
If some databases exist only on the source, it would be listed. This command may
fail if database written in dbinfo[0].subconninfo does not exist on the target.
Otherwise they would be eventually replicated. Looks OK.

> I prefer
> the previous wording, even if it's a bit complex, over a simpler but
> incorrect description.

Definitely, and I know description in v20 is correct. Let's keep current
one if my idea above is wrong.

Best regards,
Hayato Kuroda
FUJITSU LIMITED



On Tue, Mar 25, 2025, at 8:07 AM, Shubham Khanna wrote:
Apologies for the oversight. I have attached the patches now. Please
find them included here.

I started looking at this patch. When I started reading the commit message, I
didn't understand why the options that provide names to objects are
incompatible with --all option. I agree that --all and --database should be
incompatible but the others shouldn't. We already have a check saying that the
number of specified objects cannot be different from the number of databases.
Why don't rely on it instead of forbidding these options?

    /* Number of object names must match number of databases */
    if (num_pubs > 0 && num_pubs != num_dbs)
    {
        pg_log_error("wrong number of publication names specified");
        pg_log_error_detail("The number of specified publication names (%d) must match the number of specified database names (%d).",
                            num_pubs, num_dbs);
        exit(1);
    }

The following documentation is inaccurate since it doesn't say you should be
allowed to connect to the database too.

+      <para>
+       Create one subscription per all non-template databases on the target
+       server. Automatically generated names for subscriptions, publications,

My suggestion is:

Create one subscription per database on the target server. Exceptions are
template databases and databases that don't allow connections.

You are mixing short (-a) and long option (--all) on the same paragraph. It
might confuse the reader.

+       switches. This option cannot be used together with <option>--all</option>.
+       If <option>-d</option> option is not provided, the database name will be
+       obtained from <option>-P</option> option. If the database name is not
+       specified in either the <option>-d</option> option, or the
+       <option>-P</option> option, and <option>-a</option> option is not
+       specified, an error will be reported.

Since there is only short options, you should also use it.

+   bool        all_dbs;        /* --all option was specified */
    SimpleStringList objecttypes_to_remove; /* list of object types to remove */

This comment doesn't follow the same pattern as the other members. It is using
a sentence. Maybe '--all option' but it would be different from the last added
option: enable-two-phase.

I'm already thinking about translation so it sounds better if you rephrase this
description. Even if it is not precise (but that's what it is expected since if
you cannot connect to a database, it won't be possible to setup a logical
replication for it.)

+   printf(_("  -a, --all                       create subscriptions for all non-template source databases\n"));

Suggestion:

  create subscriptions for all databases except template databases

The following comment is not accurate. The postgres database is not created by
user and will be fetched.

+/*
+ * If --all is specified, fetch a list of all user-created databases from the
+ * source server. Internally, this is treated as if the user specified multiple
+ * --database options, one for each source database.
+ */

+static void
+fetch_source_databases(struct CreateSubscriberOptions *opt)

There is no 'source' terminology in the other function names. It uses
publisher, subscriber, primary and standby. There is also other functions using
'get' so I would use get_publisher_databases as function name.

It is just a matter of style but since the columns you are using are booleans,
it sounds natural to not specify the equality operator. You should also test
datconnlimit to avoid invalid databases. I would add ORDER BY for
predictability.

+   res = PQexec(conn, "SELECT datname FROM pg_database WHERE datistemplate = false AND datallowconn = true");

Something like:

SELECT datname FROM pg_database WHERE datistemplate AND datallowconn AND datconnlimit <> -2 ORDER BY 1

What happens if you don't specify the dbname in -P option?

+   /* Establish a connection to the source server */
+   conn = connect_database(opt->pub_conninfo_str, true);

It defaults to user name. If you are using another superuser (such as admin)
the connection might fail if there is no database named 'admin'. vacuumdb that
has a similar --all option, uses another option (--maintenance-db) to discover
which databases should be vacuumed. I'm not suggesting to add the
--maintenance-db option. I wouldn't like to hardcode a specific database
(template1, for example) if there is no dbname in the connection string.
Instead, suggest the user to specify dbname into -P option. An error message
should be provided saying the exact reason: no dbname was specified.

I don't think both comments add anything. You already explained before the
function body.

+   /* Process the query result */
+   for (int i = 0; i < PQntuples(res); i++)
+   {
+       const char *dbname = PQgetvalue(res, i, 0);
+
+       simple_string_list_append(&opt->database_names, dbname);
+
+       /* Increment num_dbs to reflect multiple --database options */
+       num_dbs++;
+   }

I wouldn't add an error here.

+   /* Error if no databases were found on the source server */
+   if (num_dbs == 0)
+   {
+       pg_log_error("no suitable databases found on the source server");
+       pg_log_error_hint("Ensure that there are non-template and connectable databases on the source server.");
+       PQclear(res);
+       disconnect_database(conn, true);
+   }

It already has a central place to provide no-database-provided errors. Use it.

    if (opt.database_names.head == NULL)
    {
         ...
    }

The following code is not accurate. If I specify --all, --database and
--subscription, it will report only --database. The user will remove it and run
again. At this time, --subscription will be report. My expectation is to have
all reports at once.

+   /* Validate that --all is not used with incompatible options */
+   if (opt.all_dbs)
+   {
+       char       *bad_switch = NULL;
+
+       if (num_dbs > 0)
+           bad_switch = "--database";
+       else if (num_pubs > 0)
+           bad_switch = "--publication";
+       else if (num_replslots > 0)
+           bad_switch = "--replication-slot";
+       else if (num_subs > 0)
+           bad_switch = "--subscription";
+
+       if (bad_switch)
+       {
+           pg_log_error("%s cannot be used with --all", bad_switch);
+           pg_log_error_hint("Try \"%s --help\" for more information.", progname);
+           exit(1);
+       }
+   }

According to my suggestion above, There won't be 'else if' and the new function
has an independent condition flow.

-   if (opt.database_names.head == NULL)
+   /*
+    * Fetch all databases from the source (publisher) if --all is specified.
+    */
+   if (opt.all_dbs)
+       fetch_source_databases(&opt);
+
+   else if (opt.database_names.head == NULL)

I checked the applications that provide multiple synopsis using the following command.

grep '<cmdsynopsis' doc/src/sgml/ref/*.sgml | awk '{print $1}' | sort | uniq -c

There are just 3 applications that have multiple cmdsynopsis. pgbench has 2
distinct tasks (load and run) so it makes sense to have 2 synopsis. pg_ctl has
multiple tasks and also deserves multiple synopsis. The complexity introduced
into vacuumdb (per table, per schema) seems to justify multiple synopsis too.
However, the same logic doesn't apply here. IMO 0002 shouldn't be applied
because the additional option (--all) doesn't justify multiple synopsis for
syntax clarification.

I have the same opinion as Amit. I wouldn't apply 0003.


--
Euler Taveira

On Wed, 26 Mar 2025 at 02:05, Euler Taveira <euler@eulerto.com> wrote:
>
> On Tue, Mar 25, 2025, at 8:07 AM, Shubham Khanna wrote:
>
> The following code is not accurate. If I specify --all, --database and
> --subscription, it will report only --database. The user will remove it and run
> again. At this time, --subscription will be report. My expectation is to have
> all reports at once.

I felt pg_dump reports conflicts incrementally rather than listing all
incompatible options at once as shown below:
### Specified, --data-only, --schema-only, --statistics-only,
--no-data and --clean options. together
Example1:  ./pg_dump --data-only --schema-only --statistics-only
--no-data --clean -d postgres -f dump.txt
pg_dump: error: options -s/--schema-only and -a/--data-only cannot be
used together

### After removing --schema-only option, another error arises
Example2:  ./pg_dump --data-only  --statistics-only --no-data --clean
-d postgres -f dump.txt
pg_dump: error: options -a/--data-only and --statistics-only cannot be
used together

### After removing --no-data option, another error arises
Example3:  ./pg_dump --data-only   --no-data --clean -d postgres -f dump.txt
pg_dump: error: options -a/--data-only and --no-data cannot be used together

### After removing --clean option,  another error arises
Example4: ./pg_dump --data-only --clean -d postgres -f dump.txt
pg_dump: error: options -c/--clean and -a/--data-only cannot be used together

I believe the current patch follows this approach and also simplifies
the code handling, making it easier to maintain.

Regards,
Vignesh



On Wed, Mar 26, 2025 at 2:05 AM Euler Taveira <euler@eulerto.com> wrote:
>
> On Tue, Mar 25, 2025, at 8:07 AM, Shubham Khanna wrote:
>
> Apologies for the oversight. I have attached the patches now. Please
> find them included here.
>
>
> I started looking at this patch. When I started reading the commit message, I
> didn't understand why the options that provide names to objects are
> incompatible with --all option. I agree that --all and --database should be
> incompatible but the others shouldn't. We already have a check saying that the
> number of specified objects cannot be different from the number of databases.
> Why don't rely on it instead of forbidding these options?
>

We must ensure to match the order of objects specified with the
database as well (The doc says: The order of the multiple publication
name switches must match the order of database switches.). It is easy
for users to do that when explicitly specifying databases with -d
switches but not with --all case. I can see a way to extend the
current functionality to allow just fetching databases from the
publisher and displaying them to the user, then we can expect the user
to specify the names of other objects in the same order but not
otherwise.

>
> What happens if you don't specify the dbname in -P option?
>
> +   /* Establish a connection to the source server */
> +   conn = connect_database(opt->pub_conninfo_str, true);
>
> It defaults to user name. If you are using another superuser (such as admin)
> the connection might fail if there is no database named 'admin'. vacuumdb that
> has a similar --all option, uses another option (--maintenance-db) to discover
> which databases should be vacuumed. I'm not suggesting to add the
> --maintenance-db option. I wouldn't like to hardcode a specific database
> (template1, for example) if there is no dbname in the connection string.
> Instead, suggest the user to specify dbname into -P option. An error message
> should be provided saying the exact reason: no dbname was specified.
>

But why shouldn't we use the same specification as vacuumdb, which is:
"If not specified, the postgres database will be used, or if that does
not exist, template1 will be used."?

>
> I checked the applications that provide multiple synopsis using the following command.
>
> grep '<cmdsynopsis' doc/src/sgml/ref/*.sgml | awk '{print $1}' | sort | uniq -c
>
> There are just 3 applications that have multiple cmdsynopsis. pgbench has 2
> distinct tasks (load and run) so it makes sense to have 2 synopsis. pg_ctl has
> multiple tasks and also deserves multiple synopsis. The complexity introduced
> into vacuumdb (per table, per schema) seems to justify multiple synopsis too.
> However, the same logic doesn't apply here. IMO 0002 shouldn't be applied
> because the additional option (--all) doesn't justify multiple synopsis for
> syntax clarification.
>

Yeah, also, vacuumdb doesn't have a separate line for --all in
synopsis. So, I agree with you about not adding '--all' option in the
synopsis.

--
With Regards,
Amit Kapila.



On Tue, Mar 25, 2025 at 5:30 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Thu, Mar 20, 2025 at 5:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > *  <refsynopsisdiv>
> > +  <cmdsynopsis>
> > +   <command>pg_createsubscriber</command>
> > +   <arg rep="repeat"><replaceable>option</replaceable></arg>
> > +   <group choice="plain">
> > +    <group choice="req">
> > +     <arg choice="plain"><option>-a</option></arg>
> > +     <arg choice="plain"><option>--all</option></arg>
> > +    </group>
> > +    <group choice="req">
> > +     <arg choice="plain"><option>-D</option> </arg>
> > +     <arg choice="plain"><option>--pgdata</option></arg>
> > +    </group>
> > +    <replaceable>datadir</replaceable>
> > +    <group choice="req">
> > +     <arg choice="plain"><option>-P</option></arg>
> > +     <arg choice="plain"><option>--publisher-server</option></arg>
> > +    </group>
> > +    <replaceable>connstr</replaceable>
> >
> > Most of this is unrelated to this patch. I suggest making a top-up
> > patch, we can commit it separately.
>
> Isn't this documenting -a/--all option in synopsis. Why do you think
> it's unrelated?
>

I was not clear at that stage whether to include it along with the
core patch, but today, I looked at it while responding to Euler and
found that it is not required. We can still discuss whether to include
it, but the main patch can be committed even without this.

--
With Regards,
Amit Kapila.



On Tue, Mar 25, 2025 at 5:24 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Tue, Mar 25, 2025 at 5:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Mar 25, 2025 at 5:08 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > > This looks mostly ready except the test changes. I believe when
> > > committing, we are going to squash all three into a single commit. Is
> > > that correct?
> > >
> >
> > I would not prefer to commit 0003 as it is primarily because of test
> > +# run pg_createsubscriber with '--all' option without '--dry-run'. I
> > am not so sure whether it is really worth adding test cycles for this
> > option except for dry-run mode tests but we can discuss after
> > committing the core patch.
>
> I am worried that without that test, we won't catch bugs in creating
> slots, publications and subscriptions and thus causing problems in a
> setup with --all.
>

The -all option internally maps to -d switches, so the current tests
with that option should suffice for the need you are expecting.

--
With Regards,
Amit Kapila.



On Wed, Mar 26, 2025 at 10:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 26, 2025 at 2:05 AM Euler Taveira <euler@eulerto.com> wrote:
> >
> > On Tue, Mar 25, 2025, at 8:07 AM, Shubham Khanna wrote:
> >
> > Apologies for the oversight. I have attached the patches now. Please
> > find them included here.
> >
> >
> > I started looking at this patch. When I started reading the commit message, I
> > didn't understand why the options that provide names to objects are
> > incompatible with --all option. I agree that --all and --database should be
> > incompatible but the others shouldn't. We already have a check saying that the
> > number of specified objects cannot be different from the number of databases.
> > Why don't rely on it instead of forbidding these options?
> >
>
> We must ensure to match the order of objects specified with the
> database as well (The doc says: The order of the multiple publication
> name switches must match the order of database switches.). It is easy
> for users to do that when explicitly specifying databases with -d
> switches but not with --all case. I can see a way to extend the
> current functionality to allow just fetching databases from the
> publisher and displaying them to the user, then we can expect the user
> to specify the names of other objects in the same order but not
> otherwise.
>
> >
> > What happens if you don't specify the dbname in -P option?
> >
> > +   /* Establish a connection to the source server */
> > +   conn = connect_database(opt->pub_conninfo_str, true);
> >
> > It defaults to user name. If you are using another superuser (such as admin)
> > the connection might fail if there is no database named 'admin'. vacuumdb that
> > has a similar --all option, uses another option (--maintenance-db) to discover
> > which databases should be vacuumed. I'm not suggesting to add the
> > --maintenance-db option. I wouldn't like to hardcode a specific database
> > (template1, for example) if there is no dbname in the connection string.
> > Instead, suggest the user to specify dbname into -P option. An error message
> > should be provided saying the exact reason: no dbname was specified.
> >
>
> But why shouldn't we use the same specification as vacuumdb, which is:
> "If not specified, the postgres database will be used, or if that does
> not exist, template1 will be used."?
>

I agree that ensuring the correct order of objects (like publications)
matching with databases is crucial, especially when explicitly
specifying databases using -d switches.
To address this, I have created a 0002 patch that aligns object
creation order with the corresponding databases.

> >
> > I checked the applications that provide multiple synopsis using the following command.
> >
> > grep '<cmdsynopsis' doc/src/sgml/ref/*.sgml | awk '{print $1}' | sort | uniq -c
> >
> > There are just 3 applications that have multiple cmdsynopsis. pgbench has 2
> > distinct tasks (load and run) so it makes sense to have 2 synopsis. pg_ctl has
> > multiple tasks and also deserves multiple synopsis. The complexity introduced
> > into vacuumdb (per table, per schema) seems to justify multiple synopsis too.
> > However, the same logic doesn't apply here. IMO 0002 shouldn't be applied
> > because the additional option (--all) doesn't justify multiple synopsis for
> > syntax clarification.
> >
>
> Yeah, also, vacuumdb doesn't have a separate line for --all in
> synopsis. So, I agree with you about not adding '--all' option in the
> synopsis.
>
> --

The attached patches contain the suggested changes. They also address
the comments provided by Ashutosh (at [1]) and Euler (at [2]).

[1] - https://www.postgresql.org/message-id/CAExHW5uvp6LWfgcysohDaOaNhqAbmuc%3D9BwWke%3D6KPRZ%2BwVOkA%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/e32c358b-95b5-426c-9baa-281812821588%40app.fastmail.com

Thanks and regards,
Shubham Khanna.

Attachment
Dear Shubham,

Thanks for updating the patch!

> I agree that ensuring the correct order of objects (like publications)
> matching with databases is crucial, especially when explicitly
> specifying databases using -d switches.
> To address this, I have created a 0002 patch that aligns object
> creation order with the corresponding databases.

ISTM 0002 does completely different stuff. It handles the case when dbname is not
specified in -P. Commit message is also wrong.

Anyway, comments for 0001/0002:

01. connect_database
```
@@ -536,8 +541,9 @@ connect_database(const char *conninfo, bool exit_on_error)
        conn = PQconnectdb(conninfo);
        if (PQstatus(conn) != CONNECTION_OK)
        {
-               pg_log_error("connection to database failed: %s",
-                                        PQerrorMessage(conn));
+               if (exit_on_error)
+                       pg_log_error("connection to database failed: %s",
+                                                PQerrorMessage(conn));
                PQfinish(conn);
```

Any error messages should not be suppressed. I imagine you've added this because
this command may try to connect to the postgres/template1, but this change affects
all other parts. I feel this change is not needed.

02. main()
```
+       if (opt.all_dbs)
+       {
+               bool            dbnamespecified = (dbname_conninfo != NULL);
+
+               get_publisher_databases(&opt, dbnamespecified);
+       }
        if (opt.database_names.head == NULL)
```

Need a blank between them. Ashutosh pointed out [1] because "else-if" was used in v21.
Now it becomes "if" again, the change must be reverted.

03. main()
```
-                * If --database option is not provided, try to obtain the dbname from
-                * the publisher conninfo. If dbname parameter is not available, error
-                * out.
+                * If neither --database nor --all option is provided, try to obtain
+                * the dbname from the publisher conninfo. If dbname parameter is not
+                * available, error out.
```

This comment must be updated because we can reach here even when -a is specified.
Personally, since the pg_log_info() describes the situation, it is enough;

Try to obtain the dbname from the publisher conninfo. If dbname parameter is not
available, error out.

04.
```
+# run pg_createsubscriber with '--all' option
+my ($stdout, $stderr) = run_command(
+       [
+               'pg_createsubscriber',
+               '--verbose',
+               '--dry-run',
+               '--recovery-timeout' => $PostgreSQL::Test::Utils::timeout_default,
+               '--pgdata' => $node_s->data_dir,
+               '--publisher-server' => $node_p->connstr($db1),
+               '--socketdir' => $node_s->host,
+               '--subscriber-port' => $node_s->port,
+               '--all',
+       ],
+       'run pg_createsubscriber with --all');
```

We should test the case when -P does not contain dbname. IIUC, it is enough to use
`node_p->connstr` instead of `node_p->connstr($db1)`.

[1]: https://www.postgresql.org/message-id/CAExHW5uvp6LWfgcysohDaOaNhqAbmuc%3D9BwWke%3D6KPRZ%2BwVOkA%40mail.gmail.com

Best regards,
Hayato Kuroda
FUJITSU LIMITED


On Wed, Mar 26, 2025 at 4:02 PM Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>

Let's combine 0001 and 0002.

A few minor comments:
*
+       If database name is not specified in publisher-server, it will try to
+       connect with postgres/template1 database to fetch the databases from
+       primary.

Can we change the above to: "If the database name is not specified in
publisher-server, the postgres database will be used, or if that does
not exist, template1 will be used."

*
+       Create one subscription on the target server for each non-template
+       database on the source server that allows connections, excluding
+       template databases or databases with connection restrictions.

The following text proposed by Euler seems better: "Create one
subscription per database on the target server. Exceptions are
template databases and databases that don't allow connections."

Please use this one at the above and other places where required.

--
With Regards,
Amit Kapila.



On Thu, Mar 27, 2025 at 12:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 26, 2025 at 4:02 PM Shubham Khanna
> <khannashubham1197@gmail.com> wrote:
> >
>
> Let's combine 0001 and 0002.
>

Combined 0001 and 0002.

> A few minor comments:
> *
> +       If database name is not specified in publisher-server, it will try to
> +       connect with postgres/template1 database to fetch the databases from
> +       primary.
>
> Can we change the above to: "If the database name is not specified in
> publisher-server, the postgres database will be used, or if that does
> not exist, template1 will be used."
>

Fixed.

> *
> +       Create one subscription on the target server for each non-template
> +       database on the source server that allows connections, excluding
> +       template databases or databases with connection restrictions.
>
> The following text proposed by Euler seems better: "Create one
> subscription per database on the target server. Exceptions are
> template databases and databases that don't allow connections."
>
> Please use this one at the above and other places where required.
>

Fixed.

The attached patches contain the suggested changes. They also address
the comments provided by Kuroda-san (at [1]).

[1] -
https://www.postgresql.org/message-id/OSCPR01MB149660B6D00665F47896CA812F5A12%40OSCPR01MB14966.jpnprd01.prod.outlook.com

Thanks and regards,
Shubham Khanna.

Attachment
On Thu, Mar 27, 2025 at 1:35 PM Shubham Khanna
<khannashubham1197@gmail.com> wrote:
>
> The attached patches contain the suggested changes. They also address
> the comments provided by Kuroda-san (at [1]).
>

Pushed after some cosmetic changes.

--
With Regards,
Amit Kapila.