Thread: Re: Adding a '--two-phase' option to 'pg_createsubscriber' utility.
On Mon, 9 Dec 2024 at 16:25, Shubham Khanna <khannashubham1197@gmail.com> wrote: > > Hi all, > > I am writing to propose the addition of the two_phase option in > pg_createsubscriber. As discussed in [1], supporting this feature > during the development of pg_createsubscriber was planned for this > version. Yes this will be useful. > Currently, pg_createsubscriber creates subscriptions with the > two_phase option set to false. Enabling the two_phase option after a > subscription has been created is not straightforward, as it requires > the subscription to be disabled first. This patch aims to address this > limitation by incorporating the two_phase option into > pg_createsubscriber which will help create subscription with two_phase > option and make use of the advantages of creating subscription with > two_phase option. > The attached patch has the changes for the same. Few comments: 1) You can keep it with no argument similar to how dry-run is handled: @@ -1872,6 +1881,7 @@ main(int argc, char **argv) {"publisher-server", required_argument, NULL, 'P'}, {"socketdir", required_argument, NULL, 's'}, {"recovery-timeout", required_argument, NULL, 't'}, + {"two_phase", optional_argument, NULL, 'T'}, {"subscriber-username", required_argument, NULL, 'U'}, {"verbose", no_argument, NULL, 'v'}, {"version", no_argument, NULL, 'V'}, 2) After making it to no argument option, if this option is set, just set the value, strcmp's are not required: + case 'T': + if (optarg != NULL) + { + if (strcmp(optarg, "on") == 0) + opt.two_phase = true; + else if (strcmp(optarg, "off") == 0) + opt.two_phase = false; + else + pg_fatal("invalid value for --two-phase: must be 'on' or 'off'"); + } + else + opt.two_phase = true; /* Default to true if no argument + * is provided */ + break; 3) You can add a link to create subscription documentation page of two_phase option here: + Enables or disables two-phase commit for the subscription. + When the option is provided without a value, it defaults to + <literal>on</literal>. Specify <literal>on</literal> to enable or + <literal>off</literal> to disable. + Two-phase commit ensures atomicity in logical replication for prepared + transactions. By default, this option is enabled unless explicitly set + to <literal>off</literal>. 4) Instead of adding new tests, can we include the prepare transaction and prepare transaction verification to the existing tests itself? +# Set up node A as primary +my $node_a = PostgreSQL::Test::Cluster->new('node_a'); +my $aconnstr = $node_a->connstr; +$node_a->init(allows_streaming => 'logical'); +$node_a->append_conf( + 'postgresql.conf', qq[ + autovacuum = off + wal_level = logical + max_wal_senders = 10 + max_worker_processes = 8 + max_prepared_transactions = 10 +]); + +$node_a->start; Regards, Vignesh
On Tue, 10 Dec 2024 at 17:17, Shubham Khanna <khannashubham1197@gmail.com> wrote: > > On Mon, Dec 9, 2024 at 7:43 PM vignesh C <vignesh21@gmail.com> wrote: > > The attached Patch contains the required changes. Few comments: 1) This is not correct, currently enabling two_phase option using alter subscription is supported: Notably, the replication for prepared transactions functions regardless of the initial two-phase setting on the replication slot. However, the user cannot change the setting after the subscription is created unless a future command, such as 'ALTER SUBSCRIPTION ... SET (two_phase = on)', is supported. This provides flexibility for future enhancements. 2) You can enable-two-phase on a non dry-run mode test case, as the verification will not be possible in dry-run mode: # pg_createsubscriber can run without --databases option @@ -352,7 +355,7 @@ command_ok( $node_p->connstr($db1), '--socketdir', $node_s->host, '--subscriber-port', $node_s->port, '--replication-slot', - 'replslot1' + 'replslot1', '--enable-two-phase' 3) I felt this line can be removed "Two-phase commit ensures atomicity in logical replication for prepared transactions." as this information is available at prepare transaction and create subscription page documentation: + <literal>off</literal>. + Two-phase commit ensures atomicity in logical replication for prepared + transactions. See the + <link linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link> + documentation for more details. 4) This change is not required as it is not needed for the patch: dbinfo[i].made_replslot = false; dbinfo[i].made_publication = false; + /* Fill subscriber attributes */ conninfo = concat_conninfo_dbname(sub_base_conninfo, cell->val); 5) Similarly here too: @@ -341,6 +343,7 @@ command_ok( $node_s->start; is($node_s->safe_psql('postgres', 'SELECT pg_catalog.pg_is_in_recovery()'), 't', 'standby is in recovery'); + $node_s->stop; Regards, Vignesh
On Wed, Dec 11, 2024 at 10:59 AM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, 10 Dec 2024 at 17:17, Shubham Khanna > <khannashubham1197@gmail.com> wrote: > > > > On Mon, Dec 9, 2024 at 7:43 PM vignesh C <vignesh21@gmail.com> wrote: > > > > The attached Patch contains the required changes. > > Few comments: > 1) This is not correct, currently enabling two_phase option using > alter subscription is supported: > Notably, the replication for prepared transactions functions regardless of the > initial two-phase setting on the replication slot. However, the user cannot > change the setting after the subscription is created unless a future command, > such as 'ALTER SUBSCRIPTION ... SET (two_phase = on)', is supported. > This provides flexibility for future enhancements. > > 2) You can enable-two-phase on a non dry-run mode test case, as the > verification will not be possible in dry-run mode: > # pg_createsubscriber can run without --databases option > @@ -352,7 +355,7 @@ command_ok( > $node_p->connstr($db1), '--socketdir', > $node_s->host, '--subscriber-port', > $node_s->port, '--replication-slot', > - 'replslot1' > + 'replslot1', '--enable-two-phase' > > 3) I felt this line can be removed "Two-phase commit ensures atomicity > in logical replication for prepared transactions." as this information > is available at prepare transaction and create subscription page > documentation: > + <literal>off</literal>. > + Two-phase commit ensures atomicity in logical replication for prepared > + transactions. See the > + <link linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link> > + documentation for more details. > > 4) This change is not required as it is not needed for the patch: > dbinfo[i].made_replslot = false; > dbinfo[i].made_publication = false; > + > /* Fill subscriber attributes */ > conninfo = concat_conninfo_dbname(sub_base_conninfo, cell->val); > > 5) Similarly here too: > @@ -341,6 +343,7 @@ command_ok( > $node_s->start; > is($node_s->safe_psql('postgres', 'SELECT pg_catalog.pg_is_in_recovery()'), > 't', 'standby is in recovery'); > + > $node_s->stop; > I have fixed the given comments. The v3 version patch attached at [1] has the changes for the same. [1] - https://www.postgresql.org/message-id/CAHv8RjJ8NuHLQUhkqE-fy5k%2B3nZUdX9QngrLaa7iS0TEJNicow%40mail.gmail.com Thanks and Regards, Shubham Khanna.
Hi Shubham, Here are some review comments for the patch v4-0001. ====== GENERAL. 1. After reading Vignesh's last review and then the pg_createsubscriber documentation I see there can be multiple databases simultaneously specified (by writing multiple -d switches) and in that case each one gets its own: --publication --replication-slot --subscription OTOH, this new '--enable-two-phase' switch is just a blanket two_phase enablement across all subscriptions. There is no way for the user to say if they want it enabled for some subscriptions (on some databases) but not for others. I suppose this was intentional and OK (right?), but this point needs to be clarified in the docs. ====== doc/src/sgml/ref/pg_createsubscriber.sgml 2. + <para> + Enables <link linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link> + commit for the subscription. The default is <literal>false</literal>. + </para> Following on from the previous review comment. Since this switch is applied to all database subscriptions I think the help needs to say that. Something like. "If there are multiple subscriptions specified, this option applies to all of them." ~~~ 3. In the "Prerequisites" sections of the docs, it gives requirements for various GUC settings on the source server and the target server. So, should there be another note here advising to configure the 'max_prepared_transactions' GUC when the '--enable-two-phase' is specified? ~~~ 4. "Warnings" section includes the following: pg_createsubscriber sets up logical replication with two-phase commit disabled. This means that any prepared transactions will be replicated at the time of COMMIT PREPARED, without advance preparation. Once setup is complete, you can manually drop and re-create the subscription(s) with the two_phase option enabled. ~ The above text is from the "Warnings" section, but it seems stale information that needs to be updated due to the introduction of this new '--enable-two-phase' option. ====== src/bin/pg_basebackup/pg_createsubscriber.c usage: 5. printf(_(" -t, --recovery-timeout=SECS seconds to wait for recovery to end\n")); + printf(_(" -T, --enable-two-phase enable two-phase commit for the subscription\n")); Given the previous review comments (#1/#2 etc), I was wondering if it might be better to say more like "enable two-phase commit for all subscriptions". ====== .../t/040_pg_createsubscriber.pl 6. +is($count_prepared_s, qq(1), 'Prepared transaction replicated to subscriber'); Should there also have been an earlier check (way back before the PREPARE) just to make sure this count was initially 0? ====== Kind Regards, Peter Smith. Fujitsu Australia
RE: Adding a '--two-phase' option to 'pg_createsubscriber' utility.
From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Shubham, > Thank you for pointing this out and for suggesting the changes. I > agree with your approach. > Also, I found a mistake in getopt_long and fixed it in this version of > the patch. > The attached patch contains the suggested changes. Thanks for updating the patch. I think the patch looks mostly OK. Regarding the test code - I think we should directly refer the pg_subscription catalog, and confirm that subtwophase is 'p'. IIUC, we can wait until all subscriptions. Subtwophasestate are 'e', by using poll_query_until() and [1]. [1]: SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT IN ('e'); Best regards, Hayato Kuroda FUJITSU LIMITED
On Thu, 12 Dec 2024 at 08:14, Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Shubham, > > > Thank you for pointing this out and for suggesting the changes. I > > agree with your approach. > > Also, I found a mistake in getopt_long and fixed it in this version of > > the patch. > > The attached patch contains the suggested changes. > > Thanks for updating the patch. I think the patch looks mostly OK. > > Regarding the test code - I think we should directly refer the pg_subscription catalog, > and confirm that subtwophase is 'p'. IIUC, we can wait until > all subscriptions. Subtwophasestate are 'e', by using poll_query_until() and [1]. > > [1]: SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT IN ('e'); Yes, that approach is better. Also this is not required after checking using pg_subscription: +# Prepare a transaction on the publisher +$node_p->safe_psql( + $db1, qq[ + BEGIN; + INSERT INTO tbl1 SELECT generate_series(1, 10); + PREPARE TRANSACTION 'test_prepare'; +]); + # Start subscriber $node_s->start; +# Verify that the prepared transaction is replicated to the subscriber +my $count_prepared_s = + $node_s->safe_psql($db1, "SELECT count(*) FROM pg_prepared_xacts;"); + +is($count_prepared_s, qq(1), 'Prepared transaction replicated to subscriber'); Regards, Vignesh
On Thu, Dec 12, 2024 at 8:14 AM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Shubham, > > > Thank you for pointing this out and for suggesting the changes. I > > agree with your approach. > > Also, I found a mistake in getopt_long and fixed it in this version of > > the patch. > > The attached patch contains the suggested changes. > > Thanks for updating the patch. I think the patch looks mostly OK. > > Regarding the test code - I think we should directly refer the pg_subscription catalog, > and confirm that subtwophase is 'p'. IIUC, we can wait until > all subscriptions. Subtwophasestate are 'e', by using poll_query_until() and [1]. > > [1]: SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT IN ('e'); > I have fixed the given comment. The v5 version patch attached at [1] has the changes for the same. [1] - https://www.postgresql.org/message-id/CAHv8Rj%2Bhd2MTNRs4AsK6%3DhRqvV6JC9g2tTAJwGjrNfXg6vhD8g%40mail.gmail.com Thanks and Regards, Shubham Khanna.
On Thu, Dec 12, 2024 at 9:34 AM vignesh C <vignesh21@gmail.com> wrote: > > On Thu, 12 Dec 2024 at 08:14, Hayato Kuroda (Fujitsu) > <kuroda.hayato@fujitsu.com> wrote: > > > > Dear Shubham, > > > > > Thank you for pointing this out and for suggesting the changes. I > > > agree with your approach. > > > Also, I found a mistake in getopt_long and fixed it in this version of > > > the patch. > > > The attached patch contains the suggested changes. > > > > Thanks for updating the patch. I think the patch looks mostly OK. > > > > Regarding the test code - I think we should directly refer the pg_subscription catalog, > > and confirm that subtwophase is 'p'. IIUC, we can wait until > > all subscriptions. Subtwophasestate are 'e', by using poll_query_until() and [1]. > > > > [1]: SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT IN ('e'); > > Yes, that approach is better. Also this is not required after checking > using pg_subscription: > +# Prepare a transaction on the publisher > +$node_p->safe_psql( > + $db1, qq[ > + BEGIN; > + INSERT INTO tbl1 SELECT generate_series(1, 10); > + PREPARE TRANSACTION 'test_prepare'; > +]); > + > # Start subscriber > $node_s->start; > > +# Verify that the prepared transaction is replicated to the subscriber > +my $count_prepared_s = > + $node_s->safe_psql($db1, "SELECT count(*) FROM pg_prepared_xacts;"); > + > +is($count_prepared_s, qq(1), 'Prepared transaction replicated to subscriber'); > I have fixed the given comment. The v5 version patch attached at [1] has the changes for the same. [1] - https://www.postgresql.org/message-id/CAHv8Rj%2Bhd2MTNRs4AsK6%3DhRqvV6JC9g2tTAJwGjrNfXg6vhD8g%40mail.gmail.com Thanks and Regards, Shubham Khanna.
Hi Shubham, Here are my review comments for the patch v5-0001. ====== doc/src/sgml/ref/pg_createsubscriber.sgml 1. - must accept local connections. + must accept local connections. If you are planning to use the + --enable-two-phase switch then you will also need to set the + <xref linkend="guc-max-prepared-transactions"/> appropriately. Should use sgml <option> markup for "--enable-two-phase". ~~~ 2. - <application>pg_createsubscriber</application> sets up logical - replication with two-phase commit disabled. This means that any - prepared transactions will be replicated at the time - of <command>COMMIT PREPARED</command>, without advance preparation. - Once setup is complete, you can manually drop and re-create the - subscription(s) with + If --enable-two-phase switch is not specified, the + <application>pg_createsubscriber</application> sets up logical replication + with two-phase commit disabled. This means that any prepared transactions + will be replicated at the time of <command>COMMIT PREPARED</command>, + without advance preparation. Once setup is complete, you can manually drop + and re-create the subscription(s) with Should use sgml <option> markup for "--enable-two-phase". ====== .../t/040_pg_createsubscriber.pl 3. +# Verify that the subtwophase is 'p' in the pg_subscription catalog +my $poll_query_until = $node_s->safe_psql('postgres', + "SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT IN ('e');" +); + +is($poll_query_until, qq(t), + 'Timed out while waiting for subscriber to enable twophase'); 3a. Hmm. Does this code match the comment? The comment says verify subtwophase is 'p' (aka "pending") but AFAICT the code is actually waiting until every subtwophase is 'e' (aka "enabled"). ~ 3b. Also, if you are going to name char-codes (like 'p') in comments, it might be helpful to include the equivalent words, saving readers from having to search the documentation to find the meaning. ====== Kind Regards, Peter Smith. Fujitsu Australia
Hi Shubham. Here are my review comments for v6-0001. ====== 1. +# Verify that the subtwophase is enabled ('e') in the pg_subscription catalog +$node_s->poll_query_until('postgres', + "SELECT count(1) = 2 FROM pg_subscription WHERE subtwophasestate = 'e';") + or die "Timed out while waiting for subscriber to enable twophase"; + This form of the SQL is probably OK but it's a bit tricky; Probably it should have been explained in the comment about where that count "2" has come from. ~~ I think it was OK as before (v5) to be querying until nothing was NOT 'e'. In other words, until everything was enabled 'e'. SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT IN ('e'); ~~ OTOH, to save execution time we really would be satisfied with both 'p' and 'e' states here. (we don't strictly need to wait for the transition from 'p' to 'e' to occur). So, SQL like the one below might be the best: # Verify that all subtwophase states are pending or enabled, # e.g. there are no subscriptions where subtwophase is disabled ('d'). SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate IN ('d') ====== Kind Regards, Peter Smith. Fujitsu Australia
On Fri, 13 Dec 2024 at 13:01, Shubham Khanna <khannashubham1197@gmail.com> wrote: > > On Fri, Dec 13, 2024 at 12:20 PM Peter Smith <smithpb2250@gmail.com> wrote: > > > > Hi Shubham. > > > > Here are my review comments for v6-0001. > > > > ====== > > 1. > > +# Verify that the subtwophase is enabled ('e') in the pg_subscription catalog > > +$node_s->poll_query_until('postgres', > > + "SELECT count(1) = 2 FROM pg_subscription WHERE subtwophasestate = 'e';") > > + or die "Timed out while waiting for subscriber to enable twophase"; > > + > > > > This form of the SQL is probably OK but it's a bit tricky; Probably it > > should have been explained in the comment about where that count "2" > > has come from. > > > > ~~ > > > > I think it was OK as before (v5) to be querying until nothing was NOT > > 'e'. In other words, until everything was enabled 'e'. > > SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT IN ('e'); > > > > ~~ > > > > OTOH, to save execution time we really would be satisfied with both > > 'p' and 'e' states here. (we don't strictly need to wait for the > > transition from 'p' to 'e' to occur). > > > > So, SQL like the one below might be the best: > > > > # Verify that all subtwophase states are pending or enabled, > > # e.g. there are no subscriptions where subtwophase is disabled ('d'). > > SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate IN ('d') > > > > I have fixed the given comment. Since prepared transactions are not > being used anymore, I have removed it from the test file. > The attached patch contains the suggested changes. Few comments: 1) Since we are providing --enable-two-phase option now and user can create subscriptions with two-phase option this warning can be removed now: <para> - <application>pg_createsubscriber</application> sets up logical - replication with two-phase commit disabled. This means that any - prepared transactions will be replicated at the time - of <command>COMMIT PREPARED</command>, without advance preparation. - Once setup is complete, you can manually drop and re-create the - subscription(s) with + If <option>--enable-two-phase</option> is not specified, the + <application>pg_createsubscriber</application> sets up logical replication + with two-phase commit disabled. This means that any prepared transactions + will be replicated at the time of <command>COMMIT PREPARED</command>, + without advance preparation. Once setup is complete, you can manually drop + and re-create the subscription(s) with the <link linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link> - option enabled. 2) Since we are not going to wait till the subscriptions are enabled, we can use safe_psql instead of poll_query_until, something like: is($node_s->safe_psql('postgres', "SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate = 'd'"), 't', 'subscriptions are created with the two-phase option enabled'); instead of: +$node_s->poll_query_until('postgres', + "SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate IN ('d');" +); 3) This can be removed from the commit message: Documentation has been updated to reflect the new option, and test cases have been added to validate various scenarios, including proper validation of the '--enable-two-phase' option and its combinations with other options. 4) Should" two-phase option" be "enable-two-phase option" here: const char *sub_username; /* subscriber username */ + bool two_phase; /* two-phase option */ SimpleStringList database_names; /* list of database names */ Regards, Vignesh
Hi Shubham. The patch v8-0001 looks good to me. FYI, there are a few pending patches [1][2][3] that might have some tests/docs overlap with this one, so be on the lookout because if those get pushed first your patch may require rebasing. ====== [1] https://www.postgresql.org/message-id/flat/CAHut%2BPsVMbDEkbGZxpb1vVPB8uKmhFgD3ow6PXoSs5ktzBXLbQ%40mail.gmail.com#df7b92b25077554439d9bb931103dd8a [2] https://www.postgresql.org/message-id/flat/CAHut%2BPtnA4DB_pcv4TDr4NjUSM1%3DP2N_cuZx5DX09k7LVmaqUA%40mail.gmail.com [3] https://www.postgresql.org/message-id/flat/CAHut%2BPv%2B96CykSY6-uLKZWaa6to9x1DurmyJh8Jmu1_1P7hp4Q%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia