RE: Slow catchup of 2PC (twophase) transactions on replica in LR - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: Slow catchup of 2PC (twophase) transactions on replica in LR
Date
Msg-id OSBPR01MB25522052F9F3E3AAD3BA2A8CF5ED2@OSBPR01MB2552.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Slow catchup of 2PC (twophase) transactions on replica in LR  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: Slow catchup of 2PC (twophase) transactions on replica in LR
List pgsql-hackers
Dear Peter,

Thanks for reviewing! Here are new patches.

 > 
> //////////
> patch v10-0002
> //////////
> 
> ======
> Commit message
> 
> 2.1.
> Regarding the false->true case, the backend process alters the subtwophase to
> LOGICALREP_TWOPHASE_STATE_PENDING once. After the subscription is
> enabled, a new
> logical replication worker requests to change the two_phase option of its slot
> from pending to true after the initial data synchronization is done. The code
> path is the same as the case in which two_phase is initially set to true, so
> there is no need to do something remarkable. However, for the true->false case,
> the backend must connect to the publisher and expressly change the parameter
> because the apply worker does not alter the option to false. The
> operation cannot
> be rolled back, and altering the parameter from "true" to "false" within a
> transaction is prohibited.
> 
> ~
> 
> BEFORE
> The operation cannot be rolled back, and altering the parameter from
> "true" to "false" within a transaction is prohibited.
> 
> SUGGESTION
> Because this operation cannot be rolled back, altering the two_phase
> parameter from "true" to "false" within a transaction is prohibited.

Fixed.

> 
> ======
> doc/src/sgml/ref/alter_subscription.sgml
> 
> 2.2.
>     <command>ALTER SUBSCRIPTION ... SET (failover = on|off)</command>
> and
> -   <command>ALTER SUBSCRIPTION ... SET (two_phase =
> on|off)</command>
> +   <command>ALTER SUBSCRIPTION ... SET (two_phase = off)</command>
> 
> I wasn't sure why you chose to keep on|off here instead of true|false,
> since in subsequence patch 0003 you changed it true/false everywhere
> as discussed in previous reviews.
> 
> OTOH if you only did this to be consistent with the "failover=on|off"
> then that is OK; but in that case I might raise a separate hackers
> thread to propose those should also be changed to true|false for
> consistency with the parameer listed on the CREATE SUBSCRIPTION page.
> What do you think?

Yeah, I did not change here, because other parameters were notated as
on/off. I found you started the forked thread [1] so I will revise the patch
after it was accepted.

> 
> ======
> src/backend/commands/subscriptioncmds.c
> 
> 2.3.
>   /*
> - * The changed two_phase option of the slot can't be rolled
> - * back.
> + * Altering the parameter from "true" to "false" within a
> + * transaction is prohibited. Since the apply worker does
> + * not alter the slot option to false, the backend must
> + * connect to the publisher and expressly change the
> + * parameter.
> + *
> + * There is no need to do something remarkable regarding
> + * the "false" to "true" case; the backend process alters
> + * subtwophase to LOGICALREP_TWOPHASE_STATE_PENDING once.
> + * After the subscription is enabled, a new logical
> + * replication worker requests to change the two_phase
> + * option of its slot when the initial data synchronization
> + * is done. The code path is the same as the case in which
> + * two_phase is initially set to true.
>   */
> 
> BEFORE
> ...worker requests to change the two_phase option of its slot when...
> 
> SUGGESTION
> ...worker requests to change the two_phase option of its slot from
> pending to true when...

Fixed.

> 
> ======
> src/test/subscription/t/099_twophase_added.pl
> 
> 2.4.
> +#####################
> +# Check the case that prepared transactions exist on the publisher node.
> +#
> +# Since the two_phase is "off", then normally, this PREPARE will do nothing
> +# until the COMMIT PREPARED, but in this test, we toggle the two_phase to
> +# "true" again before the COMMIT PREPARED happens.
> 
> /Since the two_phase is "off"/Since the two_phase is "false"/

Fixed.

> 
> //////////
> patch v10-0003
> //////////
> 
> ======
> src/backend/commands/subscriptioncmds.c
> 
> 3.1. AlterSubscription
> 
> + * If two_phase was enabled, there is a possibility that
> + * transactions have already been PREPARE'd. They must be
> + * checked and rolled back.
>   */
>   if (!opts.twophase)
> 
> I think it will less ambiguous if you modify this to say "If two_phase
> was previously enabled"

Fixed.

> 
> ~~~
> 
> 3.2.
> if (!opts.twophase)
> {
> List *prepared_xacts;
> 
> /*
> * Altering the parameter from "true" to "false" within
> * a transaction is prohibited. Since the apply worker
> * does not alter the slot option to false, the backend
> * must connect to the publisher and expressly change
> * the parameter.
> *
> * There is no need to do something remarkable
> * regarding the "false" to "true" case; the backend
> * process alters subtwophase to
> * LOGICALREP_TWOPHASE_STATE_PENDING once. After the
> * subscription is enabled, a new logical replication
> * worker requests to change the two_phase option of
> * its slot when the initial data synchronization is
> * done. The code path is the same as the case in which
> * two_phase is initially set to true.
> */
> if (!opts.twophase)
> PreventInTransactionBlock(isTopLevel,
> "ALTER SUBSCRIPTION ... SET (two_phase = false)");
> 
> /*
> * To prevent prepared transactions from being
> * isolated, they must manually be aborted.
> */
> if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED &&
> (prepared_xacts = GetGidListBySubid(subid)) != NIL)
> {
> /* Abort all listed transactions */
> foreach_ptr(char, gid, prepared_xacts)
> FinishPreparedTransaction(gid, false);
> 
> list_free_deep(prepared_xacts);
> }
> }
> 
> /* Change system catalog acoordingly */
> values[Anum_pg_subscription_subtwophasestate - 1] =
> CharGetDatum(opts.twophase ?
> LOGICALREP_TWOPHASE_STATE_PENDING :
> LOGICALREP_TWOPHASE_STATE_DISABLED);
> replaces[Anum_pg_subscription_subtwophasestate - 1] = true;
> }
> 
> ~
> 
> Why is "if (!opts.twophase)" being checked at the top and then
> immediately being checed again here:
> + if (!opts.twophase)
> + PreventInTransactionBlock(isTopLevel,
> + "ALTER SUBSCRIPTION ... SET (two_phase = false)");

Oh, this was caused by wrong git operations.

> And then again here:
> CharGetDatum(opts.twophase ?
> LOGICALREP_TWOPHASE_STATE_PENDING :
> LOGICALREP_TWOPHASE_STATE_DISABLED);
> 
> There is no need to re-check a flag that was already checked, so
> clearly some of this logic/code is either wrong or redundant.

Right. I added a new variable to store the value to be changed. Thouth?

 > 
> ======
> src/test/subscription/t/099_twophase_added.pl
> 
> (Let's change these on|off to true|false to match what you did already
> in patch 0002).
> 
> 3.3.
> +#####################
> +# Check the case that prepared transactions exist on the subscriber node
> +#
> +# If the two_phase is altering from "on" to "off" and there are prepared
> +# transactions on the subscriber, they must be aborted. This test checks it.
> 
> 
> /off/false/
> 
> /on/true/

Fixed.

> 
> ~~~
> 
> 3.4.
> +# Verify the prepared transaction has been replicated to the subscriber because
> +# two_phase is set to "on".
> 
> /on/true/

Fixed.

> 
> ~~~
> 
> 3.5.
> +# Toggle the two_phase to "off" before the COMMIT PREPARED
> +$node_subscriber->safe_psql(
> +    'postgres', "
> +    ALTER SUBSCRIPTION regress_sub DISABLE;
> +    ALTER SUBSCRIPTION regress_sub SET (two_phase = off);
> +    ALTER SUBSCRIPTION regress_sub ENABLE;");
> 
> /off/false/
> 
> /two_phase = off/two_phase = false/

Fixed.

> 
> ~~~
> 
> 3.6.
> +# Verify any prepared transactions are aborted because two_phase is changed
> to
> +# "off".
> 
> /off/false/

Fixed.

> 
> //////////
> patch v10-0004
> //////////
> 
> ======
> 4.g1. GENERAL - document rendering fails
> 
> FYI - The document failed to build after I apply patch 0003. Did you try it?
> 
> In my environment it reported some unbalanced tags:
> 
> ref/create_subscription.sgml:448: parser error : Opening and ending
> tag mismatch: link line 436 and para
>          </para>
>                 ^
> ref/create_subscription.sgml:449: parser error : Opening and ending
> tag mismatch: para line 435 and listitem
>         </listitem>
> 
> etc.

Oh, I forgot to run `make check`. Sorry. It seemed that I missed to close <link> tag.

> 
> ======
> doc/src/sgml/ref/alter_subscription.sgml
> 
> 4.1.
>       <para>
>        The <literal>two_phase</literal> parameter can only be altered when
> the
> -      subscription is disabled. When altering the parameter from
> <literal>true</literal>
> -      to <literal>false</literal>, the backend process checks for any
> incomplete
> -      prepared transactions done by the logical replication worker (from when
> -      <literal>two_phase</literal> parameter was still
> <literal>true</literal>)
> -      and, if any are found, those are aborted.
> +      subscription is disabled. Altering the parameter from
> <literal>true</literal>
> +      to <literal>false</literal> will give an error when when there are
> +      prepared transactions done by the logical replication worker. If you want
> +      to alter the parameter forcibly in this case,
> +      <link
> linkend="sql-createsubscription-params-with-force-alter"><literal>force_alter
> </literal></link>
> +      option must be set to <literal>true</literal> at the same time.
>       </para>
> 
> TYPO: "when when"

Removed.

> Why is necessary to say "at the same time"?

Not needed. Fixed.

> 
> ======
> doc/src/sgml/ref/create_subscription.sgml
> 
> 4.2.
> +       <varlistentry id="sql-createsubscription-params-with-force-alter">
> +        <term><literal>force_alter</literal>
> (<type>boolean</type>)</term>
> +        <listitem>
> +         <para>
> +          Specifies if the <link
> linkend="sql-altersubscription"><command>ALTER
> SUBSCRIPTION</command>
> +          can be forced to proceed instead of giving an error. There is
> +          currently only one scenario where this parameter has any effect:
> When
> +          altering <literal>two_phase</literal> option from
> <literal>true</literal>
> +          to <literal>false</literal> it is possible for there to be incomplete
> +          prepared transactions done by the logical replication worker (from
> +          when <literal>two_phase</literal> parameter was still
> <literal>true</literal>).
> +          If <literal>force_alter</literal> is <literal>false</literal>, then
> +          this will give an error; if <literal>force_alter</literal> is
> +          <literal>true</literal>, then the incomplete prepared transactions
> +          are aborted and the alter will proceed.
> +          The default is <literal>false</literal>.
> +         </para>
> +        </listitem>
> +       </varlistentry>
> 
> IMO this will be better broken into multiple paragraphs.
> 
> 1. Specifies...
> 2. There is...
> 3. The default is...

Separated.

> 
> ======
> src/test/subscription/t/099_twophase_added.pl
> 
> (Let's change all the on|off to true|false like you already did in patch 0002.
> 
> 4.3.
> +# Try altering the two_phase option to "off." The command will fail since there
> +# is a prepared transaction and the 'force_alter' option is not specified as
> +# true.
> +my $stdout;
> +my $stderr;
> 
> /off./false/

Fixed.

[1]: https://www.postgresql.org/message-id/CAHut%2BPs-RqrggaJU5w85BbeQzw9CLmmLgADVJoJ%3Dxx_4D5CWvw%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 

Attachment

pgsql-hackers by date:

Previous
From: jian he
Date:
Subject: Re: in parentesis is not usual on DOCs
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Slow catchup of 2PC (twophase) transactions on replica in LR