Re: Slow catchup of 2PC (twophase) transactions on replica in LR - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Date
Msg-id CAHut+PuOFZvynF9YQrJA0zZnd5MB4Xh43WXZzh0nu5QGML4v7Q@mail.gmail.com
Whole thread Raw
In response to RE: Slow catchup of 2PC (twophase) transactions on replica in LR  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses RE: Slow catchup of 2PC (twophase) transactions on replica in LR
List pgsql-hackers
Hi, Here are some review comments for v8-0002.

======
Commit message

1.
Regarding the off->on case, the logical replication already has a mechanism for
it, so there is no need to do anything special for the on->off case; however,
we must connect to the publisher and expressly change the parameter. The
operation cannot be rolled back, and altering the parameter from "on" to "off"
within a transaction is prohibited.

~

I think the difference between "off"-to"on" and "on"-to"off" needs to
be explained in more detail. Specifically "already has a mechanism for
it" seems very vague.

======
src/backend/commands/subscriptioncmds.c

2.
  /*
- * The changed two_phase option of the slot can't be rolled
- * back.
+ * Since the altering two_phase option of subscriptions
+ * also leads to the change of slot option, this command
+ * cannot be rolled back. So prevent we are in the
+ * transaction block.
  */
- PreventInTransactionBlock(isTopLevel, "ALTER SUBSCRIPTION ... SET
(two_phase)");
+ if (!opts.twophase)
+ PreventInTransactionBlock(isTopLevel,
+

2a.
There is a typo: "So prevent we are".

SUGGESTION (minor adjustment and typo fix)
Since altering the two_phase option of subscriptions also leads to
changing the slot option, this command cannot be rolled back. So
prevent this if we are in a transaction block.

~

2b.
But, in my previous review [v7-0002#3] I asked if the comment could
explain why this check is only needed for two_phase "on"-to-"off" but
not for "off"-to-"on". That explanation/reason is still not yet given
in the latest comment.

~~~

3.
  /*
- * Try to acquire the connection necessary for altering slot.
+ * Check the need to alter the replication slot. Failover and two_phase
+ * options are controlled by both the publisher (as a slot option) and the
+ * subscriber (as a subscription option).
+ */
+ update_failover = replaces[Anum_pg_subscription_subfailover - 1];
+ update_two_phase = (replaces[Anum_pg_subscription_subtwophasestate - 1] &&
+ !opts.twophase);


(This is similar to the previous comment). In my previous review
[v7-0002#3a] I asked why update_two_phase is TRUE only if 'two-phase'
is being updated "on"-to-"off", but not when it is being updated
"off"-to-"on". That explanation/reason is still not yet given in the
latest comment.

======
src/backend/replication/libpqwalreceiver/libpqwalreceiver.c

4.
- appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s ( FAILOVER %s,
TWO_PHASE %s )",
- quote_identifier(slotname),
- failover ? "true" : "false",
- two_phase ? "true" : "false");
+ appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s ( ",
+ quote_identifier(slotname));
+
+ if (failover)
+ appendStringInfo(&cmd, "FAILOVER %s",
+ *failover ? "true" : "false");
+
+ if (failover && two_phase)
+ appendStringInfo(&cmd, ", ");
+
+ if (two_phase)
+ appendStringInfo(&cmd, "TWO_PHASE %s",
+ *two_phase ? "true" : "false");
+
+ appendStringInfoString(&cmd, ");");

It will be better if that last line includes the extra space like I
had suggested in [v7-0002#4a] so the result will have the same spacing
as in the original code. e.g.

+ appendStringInfoString(&cmd, " );");

======
src/test/subscription/t/099_twophase_added.pl

5.
+# Check the case that prepared transactions exist on the publisher node.
+#
+# Since the two_phase is "off", then normally, this PREPARE will do nothing
+# until the COMMIT PREPARED, but in this test, we toggle the two_phase to "on"
+# again before the COMMIT PREPARED happens.

This is a major test part so IMO this comment should have
##################### like it had before, to distinguish it from all
the sub-step comments.

======

My v7-0002 review -
https://www.postgresql.org/message-id/CAHut%2BPtu_w_UCGR-5DbenA%2By7wRiA8QPi_ZP%3DCCJ3SGdTn-%3D%3Dg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia.



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Convert sepgsql tests to TAP
Next
From: Peter Smith
Date:
Subject: Re: Slow catchup of 2PC (twophase) transactions on replica in LR