Thread: Re: Adding a '--two-phase' option to 'pg_createsubscriber' utility.

Re: Adding a '--two-phase' option to 'pg_createsubscriber' utility.

From
Peter Smith
Date:
Hi Shubham,

Here are some review comments for the patch v1-0001.

Note -- I think Kuroda-san's idea to use a new switch like
'--enable-two-phase' would eliminate lots of my review comments below
about the inconsistencies with CREATE SUBSCRIBER, and the current
confusion about defaults and argument values etc.

======
Commit message

1.
By default, two-phase commit is enabled if the option is provided without
arguments. Users can explicitly set it to 'on' or 'off' using '--two-phase=on'
or '--two-phase=off'.

~

But you don't say what is the default if the option is omitted entirely.


~~~

2.
Notably, the replication for prepared transactions functions regardless of the
initial two-phase setting on the replication slot. However, the user cannot
change the setting after the subscription is created unless a future command,
such as 'ALTER SUBSCRIPTION ... SET (two-phase = on)', is supported.
This provides flexibility for future enhancements.

~

Typo in ALTER example with the option name /two-phase/two_phase/

~~~

3.
Documentation has been updated to reflect the new option, and test cases have
been added to validate various scenarios, including proper validation of the
'--two-phase' flag and its combinations with other options.

/flag/option/ ??


======
doc/src/sgml/ref/pg_createsubscriber.sgml

4.
+    <varlistentry>
+     <term><option>-T</option></term>
+     <term><option>--two_phase</option></term>
+     <listitem>
+      <para>
+       Enables or disables two-phase commit for the subscription.
+       When the option is provided without a value, it defaults to
+       <literal>on</literal>. Specify <literal>on</literal> to enable or
+       <literal>off</literal> to disable.
+       Two-phase commit ensures atomicity in logical replication for prepared
+       transactions. By default, this option is enabled unless explicitly set
+       to <literal>off</literal>.
+      </para>
+     </listitem>
+    </varlistentry>
+

4a.
Typo -- the option you made is 'two-phase', not 'two_phase'

~

4b.
When you say "By default, this option is enabled unless explicitly set
to off." I take that as meaning it is default *enabled* even when the
option is entirely omitted.

But, that would be contrary to the behaviour of a normal CREATE
SUBSCRIPTION 'two_phase' option, so I wonder why should
pg_createsubscriber have a different default. Is this intentional? IMO
if no switch is specified then I thought it should be the same as the
CREATE SUBSCRIPTION default (i.e. false).

~

4c.
This "-T" option should be moved to be adjacent (below) "-t" to keep
everything consistently in A-Z order.

======
src/bin/pg_basebackup/pg_createsubscriber.c

5.
  bool made_replslot; /* replication slot was created */
  bool made_publication; /* publication was created */
+ bool two_phase; /* two-phase option was created */

I am not sure what "option was created" even means. Cut/paste error?

Also, perhaps this field belongs with the 1st group of fields in this
struct, instead of with those made_xxx fields.

~~~

store_pub_sub_info:

6.
+ /* Store two-phase option */
+ dbinfo[i].two_phase = opt->two_phase;
+

The comment says the same as what the code is doing which seems
unhelpful/redundant. And if you rearrange the struct fields as
previously suggested, this assignment should also move.

~~~

7.
  pg_log_debug("subscriber(%d): subscription: %s ; connection string: %s", i,
  dbinfo[i].subname ? dbinfo[i].subname : "(auto)",
  dbinfo[i].subconninfo);
+ pg_log_debug("publisher(%d): two-phase: %s", i,
+ dbinfo[i].two_phase ? "true" : "false");

"two_phase" is a subscription option. So shouldn't this added
pg_log_debug info be part of the preceding pg_log_debug about the
subscription?

~~~

main:

8.
  {"recovery-timeout", required_argument, NULL, 't'},
+ {"two_phase", optional_argument, NULL, 'T'},
  {"subscriber-username", required_argument, NULL, 'U'},

AFAIK this option was supposed to be 'two-phase' (dash instead of
underscore  is consistent with the other pg_createsubscriber options)

~~~

9.
  opt.sub_username = NULL;
+ opt.two_phase = true;

As previously mentioned this default is not the same as the CREATE
SUBSCRIPTION default for 'two_phase', and I find that inconsistency to
be confusing.

~~~

10.
+ case 'T':
+ if (optarg != NULL)
+ {
+ if (strcmp(optarg, "on") == 0)
+ opt.two_phase = true;
+ else if (strcmp(optarg, "off") == 0)
+ opt.two_phase = false;
+ else
+ pg_fatal("invalid value for --two-phase: must be 'on' or 'off'");
+ }
+ else
+ opt.two_phase = true; /* Default to true if no argument
+ * is provided */
+ break;

I wonder if users familiar with the CREATE SUBSCRIPTION 'two_phase'
might find it strange that the only values accepted here are 'on' and
'off' but now all the other boolean variants.

======
.../t/040_pg_createsubscriber.pl

11.
+# Run pg_createsubscriber on a promoted server with two_phase=on
+command_ok(
+ [
+ 'pg_createsubscriber', '--verbose',
+ '--recovery-timeout', "$PostgreSQL::Test::Utils::timeout_default",
+ '--pgdata', $node_b->data_dir,
+ '--publisher-server', $node_a->connstr($db10),
+ '--subscriber-port', $node_b->port,
+ '--database', $db10,
+ '--two_phase=on'
+ ],
+ 'created subscription with two-phase commit enabled');

Hmm. I expect this should have been specified as '--two-phase=on'
(dash instead of underscore for consistency with all other switches of
pg_createsubscriber) but previous typos (e.g. #8 above) have
compounded the confusion.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Adding a '--two-phase' option to 'pg_createsubscriber' utility.

From
Shubham Khanna
Date:
On Tue, Dec 10, 2024 at 8:05 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shubham,
>
> Here are some review comments for the patch v1-0001.
>
> Note -- I think Kuroda-san's idea to use a new switch like
> '--enable-two-phase' would eliminate lots of my review comments below
> about the inconsistencies with CREATE SUBSCRIBER, and the current
> confusion about defaults and argument values etc.
>
> ======
> Commit message
>
> 1.
> By default, two-phase commit is enabled if the option is provided without
> arguments. Users can explicitly set it to 'on' or 'off' using '--two-phase=on'
> or '--two-phase=off'.
>
> ~
>
> But you don't say what is the default if the option is omitted entirely.
>
>
> ~~~
>
> 2.
> Notably, the replication for prepared transactions functions regardless of the
> initial two-phase setting on the replication slot. However, the user cannot
> change the setting after the subscription is created unless a future command,
> such as 'ALTER SUBSCRIPTION ... SET (two-phase = on)', is supported.
> This provides flexibility for future enhancements.
>
> ~
>
> Typo in ALTER example with the option name /two-phase/two_phase/
>
> ~~~
>
> 3.
> Documentation has been updated to reflect the new option, and test cases have
> been added to validate various scenarios, including proper validation of the
> '--two-phase' flag and its combinations with other options.
>
> /flag/option/ ??
>

Modified the commit message according to the suggested changes.

> ======
> doc/src/sgml/ref/pg_createsubscriber.sgml
>
> 4.
> +    <varlistentry>
> +     <term><option>-T</option></term>
> +     <term><option>--two_phase</option></term>
> +     <listitem>
> +      <para>
> +       Enables or disables two-phase commit for the subscription.
> +       When the option is provided without a value, it defaults to
> +       <literal>on</literal>. Specify <literal>on</literal> to enable or
> +       <literal>off</literal> to disable.
> +       Two-phase commit ensures atomicity in logical replication for prepared
> +       transactions. By default, this option is enabled unless explicitly set
> +       to <literal>off</literal>.
> +      </para>
> +     </listitem>
> +    </varlistentry>
> +
>
> 4a.
> Typo -- the option you made is 'two-phase', not 'two_phase'
>
> ~

Fixed.

> 4b.
> When you say "By default, this option is enabled unless explicitly set
> to off." I take that as meaning it is default *enabled* even when the
> option is entirely omitted.
>
> But, that would be contrary to the behaviour of a normal CREATE
> SUBSCRIPTION 'two_phase' option, so I wonder why should
> pg_createsubscriber have a different default. Is this intentional? IMO
> if no switch is specified then I thought it should be the same as the
> CREATE SUBSCRIPTION default (i.e. false).
>
> ~

Modified the documentation on the basis of the latest changes added to
the patch.

> 4c.
> This "-T" option should be moved to be adjacent (below) "-t" to keep
> everything consistently in A-Z order.
>

Fixed.

> ======
> src/bin/pg_basebackup/pg_createsubscriber.c
>
> 5.
>   bool made_replslot; /* replication slot was created */
>   bool made_publication; /* publication was created */
> + bool two_phase; /* two-phase option was created */
>
> I am not sure what "option was created" even means. Cut/paste error?
>
> Also, perhaps this field belongs with the 1st group of fields in this
> struct, instead of with those made_xxx fields.
>
> ~~~
>
> store_pub_sub_info:
>
> 6.
> + /* Store two-phase option */
> + dbinfo[i].two_phase = opt->two_phase;
> +
>
> The comment says the same as what the code is doing which seems
> unhelpful/redundant. And if you rearrange the struct fields as
> previously suggested, this assignment should also move.
>

Fixed.

>
> 7.
>   pg_log_debug("subscriber(%d): subscription: %s ; connection string: %s", i,
>   dbinfo[i].subname ? dbinfo[i].subname : "(auto)",
>   dbinfo[i].subconninfo);
> + pg_log_debug("publisher(%d): two-phase: %s", i,
> + dbinfo[i].two_phase ? "true" : "false");
>
> "two_phase" is a subscription option. So shouldn't this added
> pg_log_debug info be part of the preceding pg_log_debug about the
> subscription?
>

Fixed.

>
> main:
>
> 8.
>   {"recovery-timeout", required_argument, NULL, 't'},
> + {"two_phase", optional_argument, NULL, 'T'},
>   {"subscriber-username", required_argument, NULL, 'U'},
>
> AFAIK this option was supposed to be 'two-phase' (dash instead of
> underscore  is consistent with the other pg_createsubscriber options)
>

Fixed.

>
> 9.
>   opt.sub_username = NULL;
> + opt.two_phase = true;
>
> As previously mentioned this default is not the same as the CREATE
> SUBSCRIPTION default for 'two_phase', and I find that inconsistency to
> be confusing.
>

Changed the default option to false as suggested.

>
> 10.
> + case 'T':
> + if (optarg != NULL)
> + {
> + if (strcmp(optarg, "on") == 0)
> + opt.two_phase = true;
> + else if (strcmp(optarg, "off") == 0)
> + opt.two_phase = false;
> + else
> + pg_fatal("invalid value for --two-phase: must be 'on' or 'off'");
> + }
> + else
> + opt.two_phase = true; /* Default to true if no argument
> + * is provided */
> + break;
>
> I wonder if users familiar with the CREATE SUBSCRIPTION 'two_phase'
> might find it strange that the only values accepted here are 'on' and
> 'off' but now all the other boolean variants.
>

Fixed.

> ======
> .../t/040_pg_createsubscriber.pl
>
> 11.
> +# Run pg_createsubscriber on a promoted server with two_phase=on
> +command_ok(
> + [
> + 'pg_createsubscriber', '--verbose',
> + '--recovery-timeout', "$PostgreSQL::Test::Utils::timeout_default",
> + '--pgdata', $node_b->data_dir,
> + '--publisher-server', $node_a->connstr($db10),
> + '--subscriber-port', $node_b->port,
> + '--database', $db10,
> + '--two_phase=on'
> + ],
> + 'created subscription with two-phase commit enabled');
>
> Hmm. I expect this should have been specified as '--two-phase=on'
> (dash instead of underscore for consistency with all other switches of
> pg_createsubscriber) but previous typos (e.g. #8 above) have
> compounded the confusion.
>

Fixed.

The v2 version patch attached at [1] has the changes for the same.
[1] - https://www.postgresql.org/message-id/CAHv8RjLcdmz%3D_RMwveuDdr8i7r%3D09TAwtOnFmXeaia_v2RmnYA%40mail.gmail.com

Thanks and Regards,
Shubham Khanna.