Thread: Slow catchup of 2PC (twophase) transactions on replica in LR

Slow catchup of 2PC (twophase) transactions on replica in LR

From
Давыдов Виталий
Date:

Dear All,

I'd like to present and talk about a problem when 2PC transactions are applied quite slowly on a replica during logical replication. There is a master and a replica with established logical replication from the master to the replica with twophase = true. With some load level on the master, the replica starts to lag behind the master, and the lag will be increasing. We have to significantly decrease the load on the master to allow replica to complete the catchup. Such problem may create significant difficulties in the production. The problem appears at least on REL_16_STABLE branch.

To reproduce the problem:

  • Setup logical replication from master to replica with subscription parameter twophase =  true.
  • Create some intermediate load on the master (use pgbench with custom sql with prepare+commit)
  • Optionally switch off the replica for some time (keep load on master).
  • Switch on the replica and wait until it reaches the master.

The replica will never reach the master with even some low load on the master. If to remove the load, the replica will reach the master for much greater time, than expected. I tried the same for regular transactions, but such problem doesn't appear even with a decent load.

I think, the main proplem of 2PC catchup bad performance - the lack of asynchronous commit support for 2PC. For regular transactions asynchronous commit is used on the replica by default (subscrition sycnronous_commit = off). It allows the replication worker process on the replica to avoid fsync (XLogFLush) and to utilize 100% CPU (the background wal writer or checkpointer will do fsync). I agree, 2PC are mostly used in multimaster configurations with two or more nodes which are performed synchronously, but when the node in catchup (node is not online in a multimaster cluster), asynchronous commit have to be used to speedup the catchup.

There is another thing that affects on the disbalance of the master and replica performance. When the master executes requestes from multiple clients, there is a fsync optimization takes place in XLogFlush. It allows to decrease the number of fsync in case when a number of parallel backends write to the WAL simultaneously. The replica applies received transactions in one thread sequentially, such optimization is not applied.

I see some possible solutions:

  • Implement asyncronous commit for 2PC transactions.
  • Do some hacking with enableFsync when it is possible.

I think, asynchronous commit support for 2PC transactions should significantly increase replica performance and help to solve this problem. I tried to implement it (like for usual transactions) but I've found another problem: 2PC state is stored in WAL on prepare, on commit we have to read 2PC state from WAL but the read is delayed until WAL is flushed by the background wal writer (read LSN should be less than flush LSN). Storing 2PC state in a shared memory (as it proposed earlier) may help.


I used the following query to monitor the catchup progress on the master:

SELECT sent_lsn, pg_current_wal_lsn() FROM pg_stat_replication;

I used the following script for pgbench to the master:

SELECT md5(random()::text) as mygid \gset
BEGIN;
DELETE FROM test WHERE v = pg_backend_pid();
INSERT INTO test(v) SELECT pg_backend_pid();
PREPARE TRANSACTION $$:mygid$$;
COMMIT PREPARED $$:mygid$$;

 

What do you think?

 

With best regards,

Vitaly Davydov

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Thu, Feb 22, 2024 at 6:59 PM Давыдов Виталий
<v.davydov@postgrespro.ru> wrote:
>
> I'd like to present and talk about a problem when 2PC transactions are applied quite slowly on a replica during
logicalreplication. There is a master and a replica with established logical replication from the master to the replica
withtwophase = true. With some load level on the master, the replica starts to lag behind the master, and the lag will
beincreasing. We have to significantly decrease the load on the master to allow replica to complete the catchup. Such
problemmay create significant difficulties in the production. The problem appears at least on REL_16_STABLE branch. 
>
> To reproduce the problem:
>
> Setup logical replication from master to replica with subscription parameter twophase =  true.
> Create some intermediate load on the master (use pgbench with custom sql with prepare+commit)
> Optionally switch off the replica for some time (keep load on master).
> Switch on the replica and wait until it reaches the master.
>
> The replica will never reach the master with even some low load on the master. If to remove the load, the replica
willreach the master for much greater time, than expected. I tried the same for regular transactions, but such problem
doesn'tappear even with a decent load. 
>
> I think, the main proplem of 2PC catchup bad performance - the lack of asynchronous commit support for 2PC. For
regulartransactions asynchronous commit is used on the replica by default (subscrition sycnronous_commit = off). It
allowsthe replication worker process on the replica to avoid fsync (XLogFLush) and to utilize 100% CPU (the background
walwriter or checkpointer will do fsync). I agree, 2PC are mostly used in multimaster configurations with two or more
nodeswhich are performed synchronously, but when the node in catchup (node is not online in a multimaster cluster),
asynchronouscommit have to be used to speedup the catchup. 
>

I don't see we do anything specific for 2PC transactions to make them
behave differently than regular transactions with respect to
synchronous_commit setting. What makes you think so? Can you pin point
the code you are referring to?

> There is another thing that affects on the disbalance of the master and replica performance. When the master executes
requestesfrom multiple clients, there is a fsync optimization takes place in XLogFlush. It allows to decrease the
numberof fsync in case when a number of parallel backends write to the WAL simultaneously. The replica applies received
transactionsin one thread sequentially, such optimization is not applied. 
>

Right, I think for this we need to implement parallel apply.

> I see some possible solutions:
>
> Implement asyncronous commit for 2PC transactions.
> Do some hacking with enableFsync when it is possible.
>

Can you be a bit more specific about what exactly you have in mind to
achieve the above solutions?

--
With Regards,
Amit Kapila.



Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Ajin Cherian
Date:


On Fri, Feb 23, 2024 at 12:29 AM Давыдов Виталий <v.davydov@postgrespro.ru> wrote:

Dear All,

I'd like to present and talk about a problem when 2PC transactions are applied quite slowly on a replica during logical replication. There is a master and a replica with established logical replication from the master to the replica with twophase = true. With some load level on the master, the replica starts to lag behind the master, and the lag will be increasing. We have to significantly decrease the load on the master to allow replica to complete the catchup. Such problem may create significant difficulties in the production. The problem appears at least on REL_16_STABLE branch.

To reproduce the problem:

  • Setup logical replication from master to replica with subscription parameter twophase =  true.
  • Create some intermediate load on the master (use pgbench with custom sql with prepare+commit)
  • Optionally switch off the replica for some time (keep load on master).
  • Switch on the replica and wait until it reaches the master.

The replica will never reach the master with even some low load on the master. If to remove the load, the replica will reach the master for much greater time, than expected. I tried the same for regular transactions, but such problem doesn't appear even with a decent load.


I tried this setup and I do see that the logical subscriber does reach the master in a short time. I'm not sure what I'm missing. I stopped the logical subscriber in between while pgbench was running and then started it again and ran the following:
postgres=# SELECT sent_lsn, pg_current_wal_lsn() FROM pg_stat_replication;
 sent_lsn  | pg_current_wal_lsn
-----------+--------------------
 0/6793FA0 | 0/6793FA0 <=== caught up
(1 row)

My pgbench command:
pgbench postgres -p 6972 -c 2 -j 3 -f /home/ajin/test.sql -T 200 -P 5

my custom sql file:
cat test.sql
SELECT md5(random()::text) as mygid \gset
BEGIN;
DELETE FROM test WHERE v = pg_backend_pid();
INSERT INTO test(v) SELECT pg_backend_pid();
PREPARE TRANSACTION $$:mygid$$;
COMMIT PREPARED $$:mygid$$;

regards,
Ajin Cherian
Fujitsu Australia

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Давыдов Виталий
Date:
Hi Ajin,

Thank you for your feedback. Could you please try to increase the number of clients (-c pgbench option) up to 20 or more? It seems, I forgot to specify it.

With best regards,
Vitaly Davydov
 
On Fri, Feb 23, 2024 at 12:29 AM Давыдов Виталий <v.davydov@postgrespro.ru> wrote:

Dear All,

I'd like to present and talk about a problem when 2PC transactions are applied quite slowly on a replica during logical replication. There is a master and a replica with established logical replication from the master to the replica with twophase = true. With some load level on the master, the replica starts to lag behind the master, and the lag will be increasing. We have to significantly decrease the load on the master to allow replica to complete the catchup. Such problem may create significant difficulties in the production. The problem appears at least on REL_16_STABLE branch.

To reproduce the problem:

  • Setup logical replication from master to replica with subscription parameter twophase =  true.
  • Create some intermediate load on the master (use pgbench with custom sql with prepare+commit)
  • Optionally switch off the replica for some time (keep load on master).
  • Switch on the replica and wait until it reaches the master.

The replica will never reach the master with even some low load on the master. If to remove the load, the replica will reach the master for much greater time, than expected. I tried the same for regular transactions, but such problem doesn't appear even with a decent load.
 

 
I tried this setup and I do see that the logical subscriber does reach the master in a short time. I'm not sure what I'm missing. I stopped the logical subscriber in between while pgbench was running and then started it again and ran the following:
postgres=# SELECT sent_lsn, pg_current_wal_lsn() FROM pg_stat_replication;
 sent_lsn  | pg_current_wal_lsn
-----------+--------------------
 0/6793FA0 | 0/6793FA0 <=== caught up
(1 row)
 
My pgbench command:
pgbench postgres -p 6972 -c 2 -j 3 -f /home/ajin/test.sql -T 200 -P 5
 
my custom sql file:
cat test.sql
SELECT md5(random()::text) as mygid \gset
BEGIN;
DELETE FROM test WHERE v = pg_backend_pid();
INSERT INTO test(v) SELECT pg_backend_pid();
PREPARE TRANSACTION $$:mygid$$;
COMMIT PREPARED $$:mygid$$;
 
regards,
Ajin Cherian
Fujitsu Australia
 


 

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Давыдов Виталий
Date:
Hi Amit,

Amit Kapila <amit.kapila16@gmail.com> wrote:

I don't see we do anything specific for 2PC transactions to make them behave differently than regular transactions with respect to synchronous_commit setting. What makes you think so? Can you pin point the code you are referring to?

Yes, sure. The function RecordTransactionCommitPrepared is called on prepared transaction commit (twophase.c). It calls XLogFlush unconditionally. The function RecordTransactionCommit (for regular transactions, xact.c) calls XLogFlush if synchronous_commit > OFF, otherwise it calls XLogSetAsyncXactLSN.

There is some comment in RecordTransactionCommitPrepared (by Bruce Momjian) that shows that async commit is not supported yet:
/*
* We don't currently try to sleep before flush here ... nor is there any
* support for async commit of a prepared xact (the very idea is probably
* a contradiction)
*/
/* Flush XLOG to disk */
XLogFlush(recptr);

Right, I think for this we need to implement parallel apply.

Yes, parallel apply is a good point. But, I believe, it will not work if asynchronous commit is not supported. You have only one receiver process which should dispatch incoming messages to parallel workers. I guess, you will never reach such rate of parallel execution on replica as on the master with multiple backends.
 

Can you be a bit more specific about what exactly you have in mind to achieve the above solutions?

My proposal is to implement async commit for 2PC transactions as it is for regular transactions. It should significantly speedup the catchup process. Then, think how to apply in parallel, which is much diffcult to do. The current problem is to get 2PC state from the WAL on commit prepared. At this moment, the WAL is not flushed yet, commit function waits until WAL with 2PC state is to be flushed. I just tried to do it in my sandbox and found such a problem. Inability to get 2PC state from unflushed WAL stops me right now. I think about possible solutions.

The idea with enableFsync is not a suitable solution, in general, I think. I just pointed it as an alternate idea. You just do enableFsync = false before prepare or commit prepared and do enableFsync = true after these functions. In this case, 2PC records will not be fsync-ed, but FlushPtr will be increased. Thus, 2PC state can be read from WAL on commit prepared without waiting. To make it work correctly, I guess, we have to do some additional work to keep more wal on the master and filter some duplicate transactions on the replica, if replica restarts during catchup.
​​​​​​
With best regards,
​​​​​​Vitaly Davydov

 

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Fri, Feb 23, 2024 at 10:41 PM Давыдов Виталий
<v.davydov@postgrespro.ru> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I don't see we do anything specific for 2PC transactions to make them behave differently than regular transactions
withrespect to synchronous_commit setting. What makes you think so? Can you pin point the code you are referring to? 
>
> Yes, sure. The function RecordTransactionCommitPrepared is called on prepared transaction commit (twophase.c). It
callsXLogFlush unconditionally. The function RecordTransactionCommit (for regular transactions, xact.c) calls XLogFlush
ifsynchronous_commit > OFF, otherwise it calls XLogSetAsyncXactLSN. 
>
> There is some comment in RecordTransactionCommitPrepared (by Bruce Momjian) that shows that async commit is not
supportedyet: 
> /*
> * We don't currently try to sleep before flush here ... nor is there any
> * support for async commit of a prepared xact (the very idea is probably
> * a contradiction)
> */
> /* Flush XLOG to disk */
> XLogFlush(recptr);
>

It seems this comment is added in the commit 4a78cdeb where we added
async commit support. I think the reason is probably that when the WAL
record for prepared is already flushed then what will be the idea of
async commit here?

> Right, I think for this we need to implement parallel apply.
>
> Yes, parallel apply is a good point. But, I believe, it will not work if asynchronous commit is not supported. You
haveonly one receiver process which should dispatch incoming messages to parallel workers. I guess, you will never
reachsuch rate of parallel execution on replica as on the master with multiple backends. 
>
>
> Can you be a bit more specific about what exactly you have in mind to achieve the above solutions?
>
> My proposal is to implement async commit for 2PC transactions as it is for regular transactions. It should
significantlyspeedup the catchup process. Then, think how to apply in parallel, which is much diffcult to do. The
currentproblem is to get 2PC state from the WAL on commit prepared. At this moment, the WAL is not flushed yet, commit
functionwaits until WAL with 2PC state is to be flushed. I just tried to do it in my sandbox and found such a problem.
Inabilityto get 2PC state from unflushed WAL stops me right now. I think about possible solutions. 
>

At commit prepared, it seems we read prepare's WAL record, right? If
so, it is not clear to me do you see a problem with a flush of
commit_prepared or reading WAL for prepared or both of these.

--
With Regards,
Amit Kapila.



Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Давыдов Виталий
Date:
Hi Amit,

Thank you for your interest in the discussion!

On Monday, February 26, 2024 16:24 MSK, Amit Kapila <amit.kapila16@gmail.com> wrote:
 

I think the reason is probably that when the WAL record for prepared is already flushed then what will be the idea of async commit here?

I think, the idea of async commit should be applied for both transactions: PREPARE and COMMIT PREPARED, which are actually two separate local transactions. For both these transactions we may call XLogSetAsyncXactLSN on commit instead of XLogFlush when async commit is enabled. When I use async commit, I mean to apply async commit to local transactions, not to a twophase (prepared) transaction itself.
 

At commit prepared, it seems we read prepare's WAL record, right? If so, it is not clear to me do you see a problem with a flush of commit_prepared or reading WAL for prepared or both of these.

The problem with reading WAL is due to async commit of PREPARE TRANSACTION which saves 2PC in the WAL. At the moment of COMMIT PREPARED the WAL with PREPARE TRANSACTION 2PC state may not be XLogFlush-ed yet. So, PREPARE TRANSACTION should wait until its 2PC state is flushed.

I did some experiments with saving 2PC state in the local memory of logical replication worker and, I think, it worked and demonstrated much better performance. Logical replication worker utilized up to 100% CPU. I'm just concerned about possible problems with async commit for twophase transactions.

To be more specific, I've attached a patch to support async commit for twophase. It is not the final patch but it is presented only for discussion purposes. There were some attempts to save 2PC in memory in past but it was rejected. Now, there might be the second round to discuss it.

With best regards,
Vitaly

 
Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Tue, Feb 27, 2024 at 4:49 PM Давыдов Виталий
<v.davydov@postgrespro.ru> wrote:
>
> Thank you for your interest in the discussion!
>
> On Monday, February 26, 2024 16:24 MSK, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>
> I think the reason is probably that when the WAL record for prepared is already flushed then what will be the idea of
asynccommit here? 
>
> I think, the idea of async commit should be applied for both transactions: PREPARE and COMMIT PREPARED, which are
actuallytwo separate local transactions. For both these transactions we may call XLogSetAsyncXactLSN on commit instead
ofXLogFlush when async commit is enabled. When I use async commit, I mean to apply async commit to local transactions,
notto a twophase (prepared) transaction itself. 
>
>
> At commit prepared, it seems we read prepare's WAL record, right? If so, it is not clear to me do you see a problem
witha flush of commit_prepared or reading WAL for prepared or both of these. 
>
> The problem with reading WAL is due to async commit of PREPARE TRANSACTION which saves 2PC in the WAL. At the moment
ofCOMMIT PREPARED the WAL with PREPARE TRANSACTION 2PC state may not be XLogFlush-ed yet. 
>

As we do XLogFlush() at the time of prepare then why it is not
available? OR are you talking about this state after your idea/patch
where you are trying to make both Prepare and Commit_prepared records
async?

 So, PREPARE TRANSACTION should wait until its 2PC state is flushed.
>
> I did some experiments with saving 2PC state in the local memory of logical replication worker and, I think, it
workedand demonstrated much better performance. Logical replication worker utilized up to 100% CPU. I'm just concerned
aboutpossible problems with async commit for twophase transactions. 
>
> To be more specific, I've attached a patch to support async commit for twophase. It is not the final patch but it is
presentedonly for discussion purposes. There were some attempts to save 2PC in memory in past but it was rejected. 
>

It would be good if you could link those threads.

--
With Regards,
Amit Kapila.



Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Давыдов Виталий
Date:
Hi Amit,

On Tuesday, February 27, 2024 16:00 MSK, Amit Kapila <amit.kapila16@gmail.com> wrote:

As we do XLogFlush() at the time of prepare then why it is not available? OR are you talking about this state after your idea/patch where you are trying to make both Prepare and Commit_prepared records async?

Right, I'm talking about my patch where async commit is implemented. There is no such problem with reading 2PC from not flushed WAL in the vanilla because XLogFlush is called unconditionally, as you've described. But an attempt to add some async stuff leads to the problem of reading not flushed WAL. It is why I store 2pc state in the local memory in my patch.

It would be good if you could link those threads.

Sure, I will find and add some links to the discussions from past.

Thank you!

With best regards,
Vitaly
 
On Tue, Feb 27, 2024 at 4:49 PM Давыдов Виталий
<v.davydov@postgrespro.ru> wrote:
>
> Thank you for your interest in the discussion!
>
> On Monday, February 26, 2024 16:24 MSK, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>
> I think the reason is probably that when the WAL record for prepared is already flushed then what will be the idea of async commit here?
>
> I think, the idea of async commit should be applied for both transactions: PREPARE and COMMIT PREPARED, which are actually two separate local transactions. For both these transactions we may call XLogSetAsyncXactLSN on commit instead of XLogFlush when async commit is enabled. When I use async commit, I mean to apply async commit to local transactions, not to a twophase (prepared) transaction itself.
>
>
> At commit prepared, it seems we read prepare's WAL record, right? If so, it is not clear to me do you see a problem with a flush of commit_prepared or reading WAL for prepared or both of these.
>
> The problem with reading WAL is due to async commit of PREPARE TRANSACTION which saves 2PC in the WAL. At the moment of COMMIT PREPARED the WAL with PREPARE TRANSACTION 2PC state may not be XLogFlush-ed yet.
>

As we do XLogFlush() at the time of prepare then why it is not
available? OR are you talking about this state after your idea/patch
where you are trying to make both Prepare and Commit_prepared records
async?

So, PREPARE TRANSACTION should wait until its 2PC state is flushed.
>
> I did some experiments with saving 2PC state in the local memory of logical replication worker and, I think, it worked and demonstrated much better performance. Logical replication worker utilized up to 100% CPU. I'm just concerned about possible problems with async commit for twophase transactions.
>
> To be more specific, I've attached a patch to support async commit for twophase. It is not the final patch but it is presented only for discussion purposes. There were some attempts to save 2PC in memory in past but it was rejected.
>

It would be good if you could link those threads.

--
With Regards,
Amit Kapila.

 


 

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Давыдов Виталий
Date:
Dear All,

Consider, please, my patch for async commit for twophase transactions. It can be applicable when catchup performance is not enought with publication parameter twophase = on.

The key changes are:
  • Use XLogSetAsyncXactLSN instead of XLogFlush as it is for usual transactions.
  • In case of async commit only, save 2PC state in the pg_twophase file (but not fsync it) in addition to saving in the WAL. The file is used as an alternative to storing 2pc state in the memory.
  • On recovery, reject pg_twophase files with future xids.
Probably, 2PC async commit should be enabled by a GUC (not implemented in the patch).

With best regards,
Vitaly


 
Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Heikki Linnakangas
Date:
On 29/02/2024 19:34, Давыдов Виталий wrote:
> Dear All,
> 
> Consider, please, my patch for async commit for twophase transactions. 
> It can be applicable when catchup performance is not enought with 
> publication parameter twophase = on.
> 
> The key changes are:
> 
>   * Use XLogSetAsyncXactLSN instead of XLogFlush as it is for usual
>     transactions.
>   * In case of async commit only, save 2PC state in the pg_twophase file
>     (but not fsync it) in addition to saving in the WAL. The file is
>     used as an alternative to storing 2pc state in the memory.
>   * On recovery, reject pg_twophase files with future xids.
> 
> Probably, 2PC async commit should be enabled by a GUC (not implemented 
> in the patch).

In a nutshell, this changes PREPARE TRANSACTION so that if 
synchronous_commit is 'off', the PREPARE TRANSACTION is not fsync'd to 
disk. So if you crash after the PREPARE TRANSACTION has returned, the 
transaction might be lost. I think that's completely unacceptable.

If you're ok to lose the prepared state of twophase transactions on 
crash, why don't you create the subscription with 'two_phase=off' to 
begin with?

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Давыдов Виталий
Date:
Hi Heikki,

Thank you for the reply.

On Tuesday, March 05, 2024 12:05 MSK, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
 
In a nutshell, this changes PREPARE TRANSACTION so that if
synchronous_commit is 'off', the PREPARE TRANSACTION is not fsync'd to
disk. So if you crash after the PREPARE TRANSACTION has returned, the
transaction might be lost. I think that's completely unacceptable.​​​​​

You are right, the prepared transaction might be lost after crash. The same may happen with regular transactions that are not fsync-ed on replica in logical replication by default. The subscription parameter synchronous_commit is OFF by default. I'm not sure, is there some auto recovery for regular transactions? I think, the main difference between these two cases - how to manually recover when some PREPARE TRANSACTION or COMMIT PREPARED are lost. For regular transactions, some updates or deletes in tables on replica may be enough to fix the problem. For twophase transactions, it may be harder to fix it by hands, but it is possible, I believe. If you create a custom solution that is based on twophase transactions (like multimaster) such auto recovery may happen automatically. Another solution is to ignore errors on commit prepared if the corresponding prepared tx is missing. I don't know other risks that may happen with async commit of twophase transactions.
 
If you're ok to lose the prepared state of twophase transactions on
crash, why don't you create the subscription with 'two_phase=off' to
begin with?
In usual work, the subscription has two_phase = on. I have to change this option at catchup stage only, but this parameter can not be altered. There was a patch proposal in past to implement altering of two_phase option, but it was rejected. I think, the recreation of the subscription with two_phase = off will not work.

I believe, async commit for twophase transactions on catchup will significantly improve the catchup performance. It is worth to think about such feature.

P.S. We might introduce a GUC option to allow async commit for twophase transactions. By default, sync commit will be applied for twophase transactions, as it is now.

With best regards,
Vitaly Davydov

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Tue, Mar 5, 2024 at 7:59 PM Давыдов Виталий <v.davydov@postgrespro.ru> wrote:
>
> Thank you for the reply.
>
> On Tuesday, March 05, 2024 12:05 MSK, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
>
> In a nutshell, this changes PREPARE TRANSACTION so that if
> synchronous_commit is 'off', the PREPARE TRANSACTION is not fsync'd to
> disk. So if you crash after the PREPARE TRANSACTION has returned, the
> transaction might be lost. I think that's completely unacceptable.
>
>
> You are right, the prepared transaction might be lost after crash. The same may happen with regular transactions that
arenot fsync-ed on replica in logical replication by default. The subscription parameter synchronous_commit is OFF by
default.I'm not sure, is there some auto recovery for regular transactions? 
>

Unless the commit WAL is not flushed, we wouldn't have updated the
replication origin's LSN and neither the walsender would increase the
confirmed_flush_lsn for the corresponding slot till the commit is
flushed on subscriber. So, if the subscriber crashed before flushing
the commit record, server should send the same transaction again. The
same should be true for prepared transaction stuff as well.

--
With Regards,
Amit Kapila.



Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Ajin Cherian
Date:


On Wed, Mar 6, 2024 at 1:29 AM Давыдов Виталий <v.davydov@postgrespro.ru> wrote:
In usual work, the subscription has two_phase = on. I have to change this option at catchup stage only, but this parameter can not be altered. There was a patch proposal in past to implement altering of two_phase option, but it was rejected. I think, the recreation of the subscription with two_phase = off will not work.



The altering of two_phase was restricted because if there was a previously prepared transaction on the subscriber when the two_phase was on, and then it was turned off, the apply worker on the subscriber would re-apply the transaction a second time and this might result in an inconsistent replica.
Here's a patch that allows toggling two_phase option provided that there are no pending uncommitted prepared transactions on the subscriber for that subscription.

Thanks to Kuroda-san for working on the patch.

regards,
Ajin Cherian
Fujitsu Australia
Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Thu, Apr 4, 2024 at 10:53 AM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Wed, Mar 6, 2024 at 1:29 AM Давыдов Виталий <v.davydov@postgrespro.ru> wrote:
>>
>> In usual work, the subscription has two_phase = on. I have to change this option at catchup stage only, but this
parametercan not be altered. There was a patch proposal in past to implement altering of two_phase option, but it was
rejected.I think, the recreation of the subscription with two_phase = off will not work. 
>>
>>
>
> The altering of two_phase was restricted because if there was a previously prepared transaction on the subscriber
whenthe two_phase was on, and then it was turned off, the apply worker on the subscriber would re-apply the transaction
asecond time and this might result in an inconsistent replica. 
> Here's a patch that allows toggling two_phase option provided that there are no pending uncommitted prepared
transactionson the subscriber for that subscription. 
>

I think this would probably be better than the current situation but
can we think of a solution to allow toggling the value of two_phase
even when prepared transactions are present? Can you please summarize
the reason for the problems in doing that and the solutions, if any?

--
With Regards,
Amit Kapila.



Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Ajin Cherian
Date:


On Thu, Apr 4, 2024 at 4:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I think this would probably be better than the current situation but
can we think of a solution to allow toggling the value of two_phase
even when prepared transactions are present? Can you please summarize
the reason for the problems in doing that and the solutions, if any?

--
With Regards,
Amit Kapila.

Updated the patch, as it wasn't addressing updating of two-phase in the remote slot.

 Currently the main issue that needs to be handled is the handling of pending prepared transactions while the two_phase is altered. I see 3 issues with the current approach.

1. Uncommitted prepared transactions when toggling two_phase from true to false
  When two_phase was true, prepared transactions were decoded at PREPARE time and send to the subscriber, which is then prepared on the subscriber with a new gid. Once the two_phase is toggled to false, then the COMMIT PREPARED on the publisher is converted to commit and the entire transaction is decoded and sent to the subscriber. This will   leave the previously prepared transaction pending.

2. Uncommitted prepared transactions when toggling two_phase form false to true
  When two_phase was false, prepared transactions were ignored and not decoded at PREPARE time on the publisher. Once the two_phase is toggled to true, the apply worker and the walsender are restarted and a replication is restarted from a new "start_decoding_at" LSN. Now, this new "start_decoding_at" could be past the LSN of the PREPARE record and if so, the PREPARE record is skipped and not send to the subscriber. Look at comments in DecodeTXNNeedSkip() for detail.  Later when the user issues COMMIT PREPARED, this is decoded and sent to the subscriber. but there is no prepared transaction on the subscriber, and this fails because the  corresponding gid of the transaction couldn't be found.

3. While altering the two_phase of the subscription, it is required to also alter the two_phase field of the slot on the primary. The subscription cannot remotely alter the two_phase option of the slot when the subscription is  enabled, as the slot is owned by the walsender on the publisher side. 

Possible solutions for the 3 problems:

1. While toggling two_phase from true to false, we could probably get list of prepared transactions for this subscriber id and rollback/abort the prepared transactions. This will allow the transactions to be re-applied like a normal transaction when the commit comes. Alternatively, if this isn't appropriate doing it in the ALTER SUBSCRIPTION context, we could store the xids of all prepared transactions of this subscription in a list and when the corresponding xid is being committed by the apply worker, prior to commit, we make sure the previously prepared transaction is rolled back. But this would add the overhead of checking this list every time a transaction is committed by the apply worker.

2. No solution yet.

3. We could mandate that the altering of two_phase state only be done after disabling the subscription, just like how it is handled for failover option. Let me know your thoughts.

regards,
Ajin Cherian
Fujitsu Australia

Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Fri, Apr 5, 2024 at 4:59 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Thu, Apr 4, 2024 at 4:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>>
>> I think this would probably be better than the current situation but
>> can we think of a solution to allow toggling the value of two_phase
>> even when prepared transactions are present? Can you please summarize
>> the reason for the problems in doing that and the solutions, if any?
>>
>
>
> Updated the patch, as it wasn't addressing updating of two-phase in the remote slot.
>

Vitaly, does the minimal solution provided by the proposed patch
(Allow to alter two_phase option of a subscriber provided no
uncommitted
prepared transactions are pending on that subscription.) address your use case?

>  Currently the main issue that needs to be handled is the handling of pending prepared transactions while the
two_phaseis altered. I see 3 issues with the current approach. 
>
> 1. Uncommitted prepared transactions when toggling two_phase from true to false
>   When two_phase was true, prepared transactions were decoded at PREPARE time and send to the subscriber, which is
thenprepared on the subscriber with a new gid. Once the two_phase is toggled to false, then the COMMIT PREPARED on the
publisheris converted to commit and the entire transaction is decoded and sent to the subscriber. This will   leave the
previouslyprepared transaction pending. 
>
> 2. Uncommitted prepared transactions when toggling two_phase form false to true
>   When two_phase was false, prepared transactions were ignored and not decoded at PREPARE time on the publisher. Once
thetwo_phase is toggled to true, the apply worker and the walsender are restarted and a replication is restarted from a
new"start_decoding_at" LSN. Now, this new "start_decoding_at" could be past the LSN of the PREPARE record and if so,
thePREPARE record is skipped and not send to the subscriber. Look at comments in DecodeTXNNeedSkip() for detail.  Later
whenthe user issues COMMIT PREPARED, this is decoded and sent to the subscriber. but there is no prepared transaction
onthe subscriber, and this fails because the  corresponding gid of the transaction couldn't be found. 
>
> 3. While altering the two_phase of the subscription, it is required to also alter the two_phase field of the slot on
theprimary. The subscription cannot remotely alter the two_phase option of the slot when the subscription is  enabled,
asthe slot is owned by the walsender on the publisher side. 
>

Thanks for summarizing the reasons for not allowing altering the
two_pc property for a subscription.

> Possible solutions for the 3 problems:
>
> 1. While toggling two_phase from true to false, we could probably get a list of prepared transactions for this
subscriberid and rollback/abort the prepared transactions. This will allow the transactions to be re-applied like a
normaltransaction when the commit comes. Alternatively, if this isn't appropriate doing it in the ALTER SUBSCRIPTION
context,we could store the xids of all prepared transactions of this subscription in a list and when the corresponding
xidis being committed by the apply worker, prior to commit, we make sure the previously prepared transaction is rolled
back.But this would add the overhead of checking this list every time a transaction is committed by the apply worker. 
>

In the second solution, if you check at the time of commit whether
there exists a prior prepared transaction then won't we end up
applying the changes twice? I think we can first try to achieve it at
the time of Alter Subscription because the other solution can add
overhead at each commit?

> 2. No solution yet.
>

One naive idea is that on the publisher we can remember whether the
prepare has been sent and if so then only send commit_prepared,
otherwise send the entire transaction. On the subscriber-side, we
somehow, need to ensure before applying the first change whether the
corresponding transaction is already prepared and if so then skip the
changes and just perform the commit prepared. One drawback of this
approach is that after restart, the prepare flag wouldn't be saved in
the memory and we end up sending the entire transaction again. One way
to avoid this overhead is that the publisher before sending the entire
transaction checks with subscriber whether it has a prepared
transaction corresponding to the current commit. I understand that
this is not a good idea even if it works but I don't have any better
ideas. What do you think?

> 3. We could mandate that the altering of two_phase state only be done after disabling the subscription, just like how
itis handled for failover option. 
>

makes sense.

--
With Regards,
Amit Kapila.



Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Давыдов Виталий
Date:
Hi Amit, Ajin, All

Thank you for the patch and the responses. I apologize for my delayed answer due to some curcumstances.

On Wednesday, April 10, 2024 14:18 MSK, Amit Kapila <amit.kapila16@gmail.com> wrote:

Vitaly, does the minimal solution provided by the proposed patch (Allow to alter two_phase option of a subscriber provided no uncommitted prepared transactions are pending on that subscription.) address your use case?

In general, the idea behind the patch seems to be suitable for my case. Furthermore, the case of two_phase switch from false to true with uncommitted pending prepared transactions probably never happens in my case. The switch from false to true means that the replica completes the catchup from the master and switches to the normal mode when it participates in the multi-node configuration. There should be no uncommitted pending prepared transactions at the moment of the switch to the normal mode.

I'm going to try this patch. Give me please some time to investigate the patch. I will come with some feedback a little bit later.

Thank you for your help!

With best regards,
Vitaly Davydov


 

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit,

> One naive idea is that on the publisher we can remember whether the
> prepare has been sent and if so then only send commit_prepared,
> otherwise send the entire transaction. On the subscriber-side, we
> somehow, need to ensure before applying the first change whether the
> corresponding transaction is already prepared and if so then skip the
> changes and just perform the commit prepared. One drawback of this
> approach is that after restart, the prepare flag wouldn't be saved in
> the memory and we end up sending the entire transaction again. One way
> to avoid this overhead is that the publisher before sending the entire
> transaction checks with subscriber whether it has a prepared
> transaction corresponding to the current commit. I understand that
> this is not a good idea even if it works but I don't have any better
> ideas. What do you think?

Alternative idea is that worker pass a list of prepared transaction as new
option in START_REPLICATION. This can reduce the number of inter-node
communications, but sometimes the list may be huge.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit,

> Vitaly, does the minimal solution provided by the proposed patch
> (Allow to alter two_phase option of a subscriber provided no
> uncommitted
> prepared transactions are pending on that subscription.) address your use case?

I think we do not have to handle cases which there are prepared transactions on
publisher/subscriber, as the first step. It leads additional complexity and we
do not have smarter solutions, especially for problem 2.
IIUC it meets the Vitaly's condition, right?

> > 1. While toggling two_phase from true to false, we could probably get a list of
> prepared transactions for this subscriber id and rollback/abort the prepared
> transactions. This will allow the transactions to be re-applied like a normal
> transaction when the commit comes. Alternatively, if this isn't appropriate doing it
> in the ALTER SUBSCRIPTION context, we could store the xids of all prepared
> transactions of this subscription in a list and when the corresponding xid is being
> committed by the apply worker, prior to commit, we make sure the previously
> prepared transaction is rolled back. But this would add the overhead of checking
> this list every time a transaction is committed by the apply worker.
> >
> 
> In the second solution, if you check at the time of commit whether
> there exists a prior prepared transaction then won't we end up
> applying the changes twice? I think we can first try to achieve it at
> the time of Alter Subscription because the other solution can add
> overhead at each commit?

Yeah, at least the second solution might be problematic. I prototyped
the first one and worked well. However, to make the feature more consistent,
it is prohibit to exist prepared transactions on subscriber for now.
We can ease based on the requirement.

> > 2. No solution yet.
> >
> 
> One naive idea is that on the publisher we can remember whether the
> prepare has been sent and if so then only send commit_prepared,
> otherwise send the entire transaction. On the subscriber-side, we
> somehow, need to ensure before applying the first change whether the
> corresponding transaction is already prepared and if so then skip the
> changes and just perform the commit prepared. One drawback of this
> approach is that after restart, the prepare flag wouldn't be saved in
> the memory and we end up sending the entire transaction again. One way
> to avoid this overhead is that the publisher before sending the entire
> transaction checks with subscriber whether it has a prepared
> transaction corresponding to the current commit. I understand that
> this is not a good idea even if it works but I don't have any better
> ideas. What do you think?

I considered but not sure it is good to add such mechanism. Your idea requires
additional wait-loop, which might lead bugs and unexpected behavior. And it may
degrade the performance based on the network environment.
As for the another solution (worker sends a list of prepared transactions), it
is also not so good because list of prepared transactions may be huge.

Based on above, I think we can reject the case for now.

FYI - We also considered the idea which walsender waits until all prepared transactions
are resolved before decoding and sending changes, but it did not work well
- the restarted walsender sent only COMMIT PREPARED record for transactions which
have been prepared before disabling the subscription. This happened because
1) if the two_phase option of slots is false, the confirmed_flush can be ahead of
   PREPARE record, and
2) after the altering and restarting, start_decoding_at becomes same as
   confirmed_flush and records behind this won't be decoded.

> > 3. We could mandate that the altering of two_phase state only be done after
> disabling the subscription, just like how it is handled for failover option.
> >
> 
> makes sense.

OK, this spec was added.

According to above, I updated the patch with Ajin.
0001 - extends ALTER SUBSCRIPTION statement. A tab-completion was added.
0002 - mandates the subscription has been disabled. Since no need to change 
       AtEOXact_ApplyLauncher(), the change is reverted.
       If no objections, this can be included to 0001.
0003 - checks whether there are transactions prepared by the worker. If found,
       rejects the ALTER SUBSCRIPTION command.
0004 - checks whether there are transactions prepared on publisher. The backend
       connects to the publisher and confirms it. If found, rejects the ALTER
       SUBSCRIPTION command.
0005 - adds TAP test for it.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Mon, Apr 15, 2024 at 1:28 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> > Vitaly, does the minimal solution provided by the proposed patch
> > (Allow to alter two_phase option of a subscriber provided no
> > uncommitted
> > prepared transactions are pending on that subscription.) address your use case?
>
> I think we do not have to handle cases which there are prepared transactions on
> publisher/subscriber, as the first step. It leads additional complexity and we
> do not have smarter solutions, especially for problem 2.
> IIUC it meets the Vitaly's condition, right?
>
> > > 1. While toggling two_phase from true to false, we could probably get a list of
> > prepared transactions for this subscriber id and rollback/abort the prepared
> > transactions. This will allow the transactions to be re-applied like a normal
> > transaction when the commit comes. Alternatively, if this isn't appropriate doing it
> > in the ALTER SUBSCRIPTION context, we could store the xids of all prepared
> > transactions of this subscription in a list and when the corresponding xid is being
> > committed by the apply worker, prior to commit, we make sure the previously
> > prepared transaction is rolled back. But this would add the overhead of checking
> > this list every time a transaction is committed by the apply worker.
> > >
> >
> > In the second solution, if you check at the time of commit whether
> > there exists a prior prepared transaction then won't we end up
> > applying the changes twice? I think we can first try to achieve it at
> > the time of Alter Subscription because the other solution can add
> > overhead at each commit?
>
> Yeah, at least the second solution might be problematic. I prototyped
> the first one and worked well. However, to make the feature more consistent,
> it is prohibit to exist prepared transactions on subscriber for now.
> We can ease based on the requirement.
>
> > > 2. No solution yet.
> > >
> >
> > One naive idea is that on the publisher we can remember whether the
> > prepare has been sent and if so then only send commit_prepared,
> > otherwise send the entire transaction. On the subscriber-side, we
> > somehow, need to ensure before applying the first change whether the
> > corresponding transaction is already prepared and if so then skip the
> > changes and just perform the commit prepared. One drawback of this
> > approach is that after restart, the prepare flag wouldn't be saved in
> > the memory and we end up sending the entire transaction again. One way
> > to avoid this overhead is that the publisher before sending the entire
> > transaction checks with subscriber whether it has a prepared
> > transaction corresponding to the current commit. I understand that
> > this is not a good idea even if it works but I don't have any better
> > ideas. What do you think?
>
> I considered but not sure it is good to add such mechanism. Your idea requires
> additional wait-loop, which might lead bugs and unexpected behavior. And it may
> degrade the performance based on the network environment.
> As for the another solution (worker sends a list of prepared transactions), it
> is also not so good because list of prepared transactions may be huge.
>
> Based on above, I think we can reject the case for now.
>
> FYI - We also considered the idea which walsender waits until all prepared transactions
> are resolved before decoding and sending changes, but it did not work well
> - the restarted walsender sent only COMMIT PREPARED record for transactions which
> have been prepared before disabling the subscription. This happened because
> 1) if the two_phase option of slots is false, the confirmed_flush can be ahead of
>    PREPARE record, and
> 2) after the altering and restarting, start_decoding_at becomes same as
>    confirmed_flush and records behind this won't be decoded.
>

I don't understand the exact problem you are facing. IIUC, if the
commit is after start_decoding_at point and prepare was before it, we
expect to send the entire transaction followed by a commit record. The
restart_lsn should be before the start of such a transaction and we
should have recorded the changes in the reorder buffer.

--
With Regards,
Amit Kapila.



Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Давыдов Виталий
Date:
Dear All,

On Wednesday, April 10, 2024 17:16 MSK, Давыдов Виталий <v.davydov@postgrespro.ru> wrote:
 
Hi Amit, Ajin, All

Thank you for the patch and the responses. I apologize for my delayed answer due to some curcumstances.

On Wednesday, April 10, 2024 14:18 MSK, Amit Kapila <amit.kapila16@gmail.com> wrote:

Vitaly, does the minimal solution provided by the proposed patch (Allow to alter two_phase option of a subscriber provided no uncommitted prepared transactions are pending on that subscription.) address your use case?

In general, the idea behind the patch seems to be suitable for my case. Furthermore, the case of two_phase switch from false to true with uncommitted pending prepared transactions probably never happens in my case. The switch from false to true means that the replica completes the catchup from the master and switches to the normal mode when it participates in the multi-node configuration. There should be no uncommitted pending prepared transactions at the moment of the switch to the normal mode.

I'm going to try this patch. Give me please some time to investigate the patch. I will come with some feedback a little bit later.

I looked at the patch and realized that I can't try it easily in the near future because the solution I'm working on is based on PG16 or earlier. This patch is not easily applicable to the older releases. I have to port my solution to the master, which is not done yet. I apologize for that - so much work should be done before applying the patch. BTW, I tested the idea with async 2PC commit on my side and it seems to work fine in my case. Anyway, I agree, the idea with altering of subscription seems the best one but much harder to implement.

To summarise my case of a synchronous multimaster where twophase is used to implement global transactions:
  • Replica may have prepared but not committed transactions when I toggle subscription twophase from true to false. In this case, all prepared transactions may be aborted on the replica before altering the subscription.
  • Replica will not have prepared transactions when subscription is toggled from false to true. In this scenario, the replica completes the catchup (with twophase=off) and becomes the part of the multi-nodal cluster and is ready to accept new 2PC transactions. All the new pending transactions will wait until replica responds. But it may work differently for some other solutions. In general, it would be great to allow toggling for all scenarious.
Just interested, does anyone tried to reproduce the problem with slow catchup of twophase transactions (pgbench should be used with big number of clients)? I haven't seen any messages from anyone other that me that the problem takes place.​​​​

Thank you for your help!

With best regards,
Vitaly










 

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit,

> > FYI - We also considered the idea which walsender waits until all prepared
> transactions
> > are resolved before decoding and sending changes, but it did not work well
> > - the restarted walsender sent only COMMIT PREPARED record for
> transactions which
> > have been prepared before disabling the subscription. This happened because
> > 1) if the two_phase option of slots is false, the confirmed_flush can be ahead of
> >    PREPARE record, and
> > 2) after the altering and restarting, start_decoding_at becomes same as
> >    confirmed_flush and records behind this won't be decoded.
> >
> 
> I don't understand the exact problem you are facing. IIUC, if the
> commit is after start_decoding_at point and prepare was before it, we
> expect to send the entire transaction followed by a commit record. The
> restart_lsn should be before the start of such a transaction and we
> should have recorded the changes in the reorder buffer.

This behavior is right for two_phase = false case. But if the parameter is
altered between PREPARE and COMMIT PREPARED, there is a possibility that only
COMMIT PREPARED is sent. As the first place, the executed workload is below.

1. created a subscription with (two_phase = false)
2. prepared a transaction on publisher
3. disabled the subscription once
4. altered the subscription to two_phase = true
5. enabled the subscription again
6. did COMMIT PREPARED on the publisher

-> Apply worker would raise an ERROR while applying COMMIT PREPARED record:
ERROR:  prepared transaction with identifier "pg_gid_XXX_YYY" does not exist

Below part describes why the ERROR occurred.

======

### Regarding 1) the confirmed_flush can be ahead of PREPARE record,

If two_phase is off, as you might know, confirmed_flush can be ahead of PREPARE
record by keepalive mechanism.

Walsender sometimes sends a keepalive message in WalSndKeepalive(). Here the LSN
is written, which is lastly decoded record. Since the PREPARE record is skipped
(just handled by ReorderBufferProcessXid()), sometimes the written LSN in the
message can be ahead of PREPARE record. If the WAL records are aligned like below,
the LSN can point CHECKPOINT_ONLINE.

...
INSERT
PREPARE txn1
CHECKPOINT_ONLINE
...

On worker side, when it receives the keepalive, it compares the LSN in the
message and lastly received LSN, and advance last_received. Then, the worker replies
to the walsender, and at that time it replies that last_recevied record has been
flushed on the subscriber. See send_feedback().
 
On publisher, when the walsender receives the message from subscriber, it reads
the message and advance the confirmed_flush to the written value. If the walsender
sends LSN which locates ahead PREPARE, the confirmed flush is updated as well.

### Regarding 2) after the altering, records behind the confirmed_flush are not decoded

Then, at decoding phase. The snapshot builder determines the point where decoding
is resumed, as start_decoding_at. After the restart, the value is same as
confirmed_flush of the slot. Since the confiremed_fluish is ahead of PREPARE,
the start_decoding_at becomes ahead as well, so whole of prepared transactions
are not decoded.

======

Attached zip file contains the PoC and used script. You can refer what I really did.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Tue, Apr 16, 2024 at 7:48 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> > > FYI - We also considered the idea which walsender waits until all prepared
> > transactions
> > > are resolved before decoding and sending changes, but it did not work well
> > > - the restarted walsender sent only COMMIT PREPARED record for
> > transactions which
> > > have been prepared before disabling the subscription. This happened because
> > > 1) if the two_phase option of slots is false, the confirmed_flush can be ahead of
> > >    PREPARE record, and
> > > 2) after the altering and restarting, start_decoding_at becomes same as
> > >    confirmed_flush and records behind this won't be decoded.
> > >
> >
> > I don't understand the exact problem you are facing. IIUC, if the
> > commit is after start_decoding_at point and prepare was before it, we
> > expect to send the entire transaction followed by a commit record. The
> > restart_lsn should be before the start of such a transaction and we
> > should have recorded the changes in the reorder buffer.
>
> This behavior is right for two_phase = false case. But if the parameter is
> altered between PREPARE and COMMIT PREPARED, there is a possibility that only
> COMMIT PREPARED is sent.
>

Can you please once consider the idea shared by me at [1] (One naive
idea is that on the publisher .....) to solve this problem?

[1] -
https://www.postgresql.org/message-id/CAA4eK1K1fSkeK%3Dkc26G5cq87vQG4%3D1qs_b%2Bno4%2Bep654SeBy1w%40mail.gmail.com

--
With Regards,
Amit Kapila.



Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Ajin Cherian
Date:


On Tue, Apr 16, 2024 at 1:31 AM Давыдов Виталий <v.davydov@postgrespro.ru> wrote:
Dear All,
Just interested, does anyone tried to reproduce the problem with slow catchup of twophase transactions (pgbench should be used with big number of clients)? I haven't seen any messages from anyone other that me that the problem takes place.


 
 Yes, I was able to reproduce the slow catchup of twophase transactions with pgbench with 20 clients.

regards,
Ajin Cherian
Fujitsu Australia

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Ajin Cherian
Date:


On Tue, Apr 16, 2024 at 4:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

>

Can you please once consider the idea shared by me at [1] (One naive
idea is that on the publisher .....) to solve this problem?

[1] - https://www.postgresql.org/message-id/CAA4eK1K1fSkeK%3Dkc26G5cq87vQG4%3D1qs_b%2Bno4%2Bep654SeBy1w%40mail.gmail.com



Expanding on Amit's idea, we found out that there is already a mechanism in code to fully decode prepared transactions prior to a defined LSN where two_phase is enabled using the "two_phase_at" LSN in the slot. Look at ReorderBufferFinishPrepared() on how this is done. This code was not working as expected in our patch because
we were setting two_phase on the slot to true as soon as the alter command was received. This was not the correct way, initially when two_phase is enabled, the two_phase changes to pending state and two_phase option on the slot should only be set to true when two_phase moves from pending to enabled. This will happen once the replication is restarted with two_phase option. Look at code in  CreateDecodingContext() on how "two_phase_at" is set in the slot when done this way. So we changed the code to not remotely alter two_phase when toggling from false to true. With this change, now even if there are pending transactions on the publisher when toggling two_phase from false to true, these pending transactions will be fully decoded and sent once the commit prepared is decoded as the pending prepared transactions are prior to the "two_phase_at" LSN. With this patch, now we are able to handle both pending prepared transactions when altering two_phase from true to false as well as false to true.

Attaching the patch for your review and comments. Big thanks to Kuroda-san for also working on the patch.

regards,
Ajin Cherian
Fujitsu Australia.
Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Ajin Cherian
Date:


On Thu, Apr 18, 2024 at 4:26 PM Ajin Cherian <itsajin@gmail.com> wrote:

Attaching the patch for your review and comments. Big thanks to Kuroda-san for also working on the patch.


Looking at this a bit more, maybe rolling back all prepared transactions on the subscriber when toggling two_phase from true to false might not be desirable for the customer. Maybe we should have an option for customers to control whether transactions should be rolled back or not. Maybe transactions should only be rolled back if a "force" option is also set.
What do people think?

regards,
Ajin Cherian
Fujitsu Australia

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Vitaly,

> I looked at the patch and realized that I can't try it easily in the near future
> because the solution I'm working on is based on PG16 or earlier. This patch is
> not easily applicable to the older releases. I have to port my solution to the
> master, which is not done yet.

We also tried to port our patch for PG16, but the largest barrier was that a
replication command ALTER_SLOT is not supported. Since the slot option two_phase
can't be modified, it is difficult to skip decoding PREPARE command even when
altering the option from true to false.
IIUC, Adding a new feature (e.g., replication command) for minor updates is generally
prohibited

We must consider another approach for backpatching, but we do not have solutions
for now.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/ 

 

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear hackers,

> Looking at this a bit more, maybe rolling back all prepared transactions on the
> subscriber when toggling two_phase from true to false might not be desirable
> for the customer. Maybe we should have an option for customers to control
> whether transactions should be rolled back or not. Maybe transactions should
> only be rolled back if a "force" option is also set. What do people think?

And here is a patch for adds new option "force_alter" (better name is very welcome).
It could be used only when altering two_phase option. Let me share examples.

Assuming that there are logical replication system with two_phase = on, and
there are prepared transactions:

```
subscriber=# SELECT * FROM pg_prepared_xacts ;
 transaction |       gid        |           prepared            |  owner   | database 
-------------+------------------+-------------------------------+----------+----------
         741 | pg_gid_16390_741 | 2024-04-22 08:02:34.727913+00 | postgres | postgres
         742 | pg_gid_16390_742 | 2024-04-22 08:02:34.729486+00 | postgres | postgres
(2 rows)
```

At that time, altering two_phase to false alone will be failed:

```
subscriber=# ALTER SUBSCRIPTION sub DISABLE ;
ALTER SUBSCRIPTION
subscriber=# ALTER SUBSCRIPTION sub SET (two_phase = off);
ERROR:  cannot alter two_phase = false when there are prepared transactions
```

It succeeds if force_alter is also expressly set. Prepared transactions will be
aborted at that time.

```
subscriber=# ALTER SUBSCRIPTION sub SET (two_phase = off, force_alter = on);
ALTER SUBSCRIPTION
subscriber=# SELECT * FROM pg_prepared_xacts ;
 transaction | gid | prepared | owner | database 
-------------+-----+----------+-------+----------
(0 rows)
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/ 


Attachment

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
> Dear Vitaly,
> 
> > I looked at the patch and realized that I can't try it easily in the near future
> > because the solution I'm working on is based on PG16 or earlier. This patch is
> > not easily applicable to the older releases. I have to port my solution to the
> > master, which is not done yet.
> 
> We also tried to port our patch for PG16, but the largest barrier was that a
> replication command ALTER_SLOT is not supported. Since the slot option
> two_phase
> can't be modified, it is difficult to skip decoding PREPARE command even when
> altering the option from true to false.
> IIUC, Adding a new feature (e.g., replication command) for minor updates is
> generally
> prohibited
> 
> We must consider another approach for backpatching, but we do not have
> solutions
> for now.

Attached patch set is a ported version for PG16, which breaks ABI. This can be used
for testing purpose, but it won't be pushed to REL_16_STABLE.
At least, this patchset can pass my github CI.

Can you apply and check whether your issue is solved?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 
 

Attachment

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Vitaly Davydov"
Date:
Dear Hayato,

On Monday, April 22, 2024 15:54 MSK, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote:
 
> Dear Vitaly,
>
> > I looked at the patch and realized that I can't try it easily in the near future
> > because the solution I'm working on is based on PG16 or earlier. This patch is
> > not easily applicable to the older releases. I have to port my solution to the
> > master, which is not done yet.
>
> We also tried to port our patch for PG16, but the largest barrier was that a
> replication command ALTER_SLOT is not supported. Since the slot option
> two_phase
> can't be modified, it is difficult to skip decoding PREPARE command even when
> altering the option from true to false.
> IIUC, Adding a new feature (e.g., replication command) for minor updates is
> generally
> prohibited
>
...

Attached patch set is a ported version for PG16, which breaks ABI. This can be used
for testing purpose, but it won't be pushed to REL_16_STABLE.
At least, this patchset can pass my github CI.

Can you apply and check whether your issue is solved?​​​​​​
It is fantastic. Thank you for your help! I will definitely try your patch. I need some time to test and incorporate it. I also plan to port my stuff to the master branch to simplify testing of patches.

With best regards,
​​​​​Vitaly Davydov

 

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear hackers,

Per recent commit (b29cbd3da), our patch needed to be rebased.
Here is an updated version.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 
 

Attachment
Hi, Here are some review comments for patch v6-0001

======
Commit message

1.
This patch allows user to alter two_phase option

/allows user/allows the user/

/to alter two_phase option/to alter the 'two_phase' option/

======
doc/src/sgml/ref/alter_subscription.sgml

2.
<literal>two_phase</literal> can be altered only for disabled subscription.

SUGGEST
The <literal>two_phase</literal> parameter can only be altered when
the subscription is disabled.

======
src/backend/access/transam/twophase.c

3. checkGid
+
+/*
+ * checkGid
+ */
+static bool
+checkGid(char *gid, Oid subid)
+{
+ int ret;
+ Oid subid_written,
+ xid;
+
+ ret = sscanf(gid, "pg_gid_%u_%u", &subid_written, &xid);
+
+ if (ret != 2 || subid != subid_written)
+ return false;
+
+ return true;
+}

3a.
The function comment should give more explanation of what it does. I
think this function is the counterpart of the TwoPhaseTransactionGid()
function of worker.c so the comment can say that too.

~

3b.
Indeed, perhaps the function name should be similar to
TwoPhaseTransactionGid. e.g. call it like
IsTwoPhaseTransactionGidForSubid?

~

3c.
Probably 'xid' should be TransactionId instead of Oid.

~

3d.
Why not have a single return?

SUGGESTION
return (ret == 2 && subid = subid_written);

~

3e.
I am wondering if the existing TwoPhaseTransactionGid function
currently in worker.c should be moved here because IMO these 2
functions belong together and twophase.c seems the right place to put
them.

~~~

4.
+LookupGXactBySubid(Oid subid)
+{
+ bool found = false;
+
+ LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+ for (int i = 0; i < TwoPhaseState->numPrepXacts; i++)
+ {
+ GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+
+ /* Ignore not-yet-valid GIDs. */
+ if (gxact->valid && checkGid(gxact->gid, subid))
+ {
+ found = true;
+ break;
+ }
+ }
+ LWLockRelease(TwoPhaseStateLock);
+ return found;
+}

AFAIK the gxact also has the 'xid' available, so won't it be better to
pass BOTH the 'xid' and 'subid' to the checkGid so you can do a full
comparison instead of comparing only the subid part of the gid?

======
src/backend/commands/subscriptioncmds.c

5. AlterSubscription

+ /* XXX */
+ if (IsSet(opts.specified_opts, SUBOPT_TWOPHASE_COMMIT))
+ {

The "XXX" comment looks like it is meant to say something more...

~~~

6.
+ /*
+ * two_phase can be only changed for disabled
+ * subscriptions
+ */
+ if (form->subenabled)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for enabled subscription",
+ "two_phase")));

6a.
Should this have a more comprehensive comment giving the reason like
the 'failover' option has?

~~~

6b.
Maybe this should include a "translator" comment to say don't
translate the option name.

~~~

7.
+ /* Check whether the number of prepared transactions */
+ if (!opts.twophase &&
+ form->subtwophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED &&
+ LookupGXactBySubid(subid))
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot disable two_phase when uncommitted prepared
transactions present")));
+

7a.
The first comment seems to be an incomplete sentence. I think it
should say something a bit like:
two_phase cannot be disabled if there are any uncommitted prepared
transactions present.

~

7b.
Also, if ereport occurs what is the user supposed to do about it?
Shouldn't the ereport include some errhint with appropriate advice?

~~~

8.
+ /*
+ * The changed failover option of the slot can't be rolled
+ * back.
+ */
+ PreventInTransactionBlock(isTopLevel, "ALTER SUBSCRIPTION ... SET
(two_phase)");
+
+ /* Change system catalog acoordingly */
+ values[Anum_pg_subscription_subtwophasestate - 1] =
+ CharGetDatum(opts.twophase ?
+ LOGICALREP_TWOPHASE_STATE_PENDING :
+ LOGICALREP_TWOPHASE_STATE_DISABLED);
+ replaces[Anum_pg_subscription_subtwophasestate - 1] = true;
+ }

Typo I think: /failover option/two_phase option/

======
.../libpqwalreceiver/libpqwalreceiver.c

9.
 static void
 libpqrcv_alter_slot(WalReceiverConn *conn, const char *slotname,
- bool failover)
+ bool two_phase, bool failover)

Same comment as mentioned elsewhere (#15), IMO the new 'two_phase'
parameter should be last.

======
src/backend/replication/logical/launcher.c

10.
+/*
+ * Stop all the subscription workers.
+ */
+void
+logicalrep_workers_stop(Oid subid)
+{
+ List    *subworkers;
+ ListCell   *lc;
+
+ LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
+ subworkers = logicalrep_workers_find(subid, false);
+ LWLockRelease(LogicalRepWorkerLock);
+ foreach(lc, subworkers)
+ {
+ LogicalRepWorker *w = (LogicalRepWorker *) lfirst(lc);
+
+ logicalrep_worker_stop(w->subid, w->relid);
+ }
+ list_free(subworkers);
+}

I was confused by the logicalrep_workers_find(subid, false). IIUC the
'false' means everything (instead of 'only_running') but then I don't
know why we want to "stop" anything that is NOT running. OTOH I see
that this code was extracted from where it was previously inlined in
subscriptioncmds.c, so maybe the 'false' is necessary for another
reason? At least maybe some explanatory comment is needed for why you
are passing this flag as false?

======
src/backend/replication/logical/worker.c

11.
- /* two-phase should not be altered */
+ /* two-phase should not be altered while the worker exists */
  Assert(newsub->twophasestate == MySubscription->twophasestate);
/should not/cannot/

======
src/backend/replication/slot.c

12.
 void
-ReplicationSlotAlter(const char *name, bool failover)
+ReplicationSlotAlter(const char *name, bool two_phase, bool failover)

Same comment as mentioned elsewhere (#15), IMO the new 'two_phase'
parameter should be last.

~~~

13.
+ if (MyReplicationSlot->data.two_phase != two_phase)
+ {
+ SpinLockAcquire(&MyReplicationSlot->mutex);
+ MyReplicationSlot->data.two_phase = two_phase;
+ SpinLockRelease(&MyReplicationSlot->mutex);
+
+ update_slot = true;
+ }
+
+
  if (MyReplicationSlot->data.failover != failover)
  {
  SpinLockAcquire(&MyReplicationSlot->mutex);
  MyReplicationSlot->data.failover = failover;
  SpinLockRelease(&MyReplicationSlot->mutex);

+ update_slot = true;
+ }

13a.
Doesn't it make more sense for the whole check/set to be "atomic",
i.e. put the mutex also around the check?

SUGGEST
SpinLockAcquire(&MyReplicationSlot->mutex);
if (MyReplicationSlot->data.two_phase != two_phase)
{
  MyReplicationSlot->data.two_phase = two_phase;
  update_slot = true;
}
SpinLockRelease(&MyReplicationSlot->mutex);

~

Also, (if you agree with the above) why not include both checks
(two_phase and failover) within the same mutex instead of
acquiring/releasing it twice:

SUGGEST
SpinLockAcquire(&MyReplicationSlot->mutex);
if (MyReplicationSlot->data.two_phase != two_phase)
{
  MyReplicationSlot->data.two_phase = two_phase;
  update_slot = true;
}
if (MyReplicationSlot->data.failover != failover)
{
  MyReplicationSlot->data.failover = failover;
  update_slot = true;
}
SpinLockAcquire(&MyReplicationSlot->mutex);

~~~

13b.
There are double blank lines after the first if-block.

======
src/backend/replication/walsender.c

14.
 static void
-ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover)
+ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd,
+   bool *two_phase, bool *failover)

Same comment as mentioned elsewhere (#15), IMO the new 'two_phase'
parameter should be last.

======
src/include/replication/walreceiver.h

15.
 typedef void (*walrcv_alter_slot_fn) (WalReceiverConn *conn,
    const char *slotname,
+   bool two_phase,
    bool failover);

Somehow, I feel it is more normal to add the new code (the 'two_phase'
parameter) at the END, instead of into the middle of the existing
parameters. It also keeps it alphabetical which makes it consistent
with other places like the tab-completion code.

This comment about swapping the order (putting new stuff last) will
propagate changes to lots of other related places. I refer to this
comment in a few other places in this post but there are probably more
the same that I missed.

======
src/test/regress/sql/subscription.sql

16.
I know you do this already in the TAP test, but doesn't the test case
to demonstrate that 'two-phase' option can be altered when the
subscription is disabled actually belong here in the regression
instead?

======
src/test/subscription/t/021_twophase.pl

17.
+# Disable the subscription and alter it to two_phase = false,
+# verify that the altered subscription reflects the two_phase option.

/verify/then verify/

~~~

18.
+# Now do a prepare on publisher and make sure that it is not replicated.
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub");
+$node_publisher->safe_psql(
+       'postgres', qq{
+    BEGIN;
+    INSERT INTO tab_copy VALUES (100);
+    PREPARE TRANSACTION 'newgid';
+ });
+

18a.
/on publisher/on the publisher/

18b.
What is that "DROP SUBSCRIPTION tap_sub" doing here? It seems
misplaced under this comment.

~~~

19.
+# Make sure that there is 0 prepared transaction on the subscriber
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM pg_prepared_xacts;");
+is($result, qq(0), 'transaction is prepared on subscriber');

19a.
SUGGESTION
Make sure there are no prepared transactions on the subscriber

~~~

19b.
/'transaction is prepared on subscriber'/'should be no prepared
transactions on subscriber'/

~~~

20.
+# Made sure that the commited transaction is replicated.

/Made sure/Make sure/

/commited/committed/

~~~

21.
+# Make sure that the two-phase is enabled on the subscriber
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT subtwophasestate FROM pg_subscription WHERE subname = 'tap_sub_copy';"
+);
+is($result, qq(e), 'two-phase is disabled');

The 'two-phase is disabled' is the identical message used in the
opposite case earlier, so something is amiss. Maybe this one should
say 'two-phase should be enabled' and the earlier counterpart should
say 'two-phase should be disabled'.

======
Kind Regards,
Peter Smith
Fujitsu Australia



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

Thanks for reviewing! Here are updated patches.
I updated patches only for HEAD.

> ======
> Commit message
> 
> 1.
> This patch allows user to alter two_phase option
> 
> /allows user/allows the user/
> 
> /to alter two_phase option/to alter the 'two_phase' option/

Fixed.

> ======
> doc/src/sgml/ref/alter_subscription.sgml
> 
> 2.
> <literal>two_phase</literal> can be altered only for disabled subscription.
> 
> SUGGEST
> The <literal>two_phase</literal> parameter can only be altered when
> the subscription is disabled.

Fixed.

> ======
> src/backend/access/transam/twophase.c
> 
> 3. checkGid
> +
> +/*
> + * checkGid
> + */
> +static bool
> +checkGid(char *gid, Oid subid)
> +{
> + int ret;
> + Oid subid_written,
> + xid;
> +
> + ret = sscanf(gid, "pg_gid_%u_%u", &subid_written, &xid);
> +
> + if (ret != 2 || subid != subid_written)
> + return false;
> +
> + return true;
> +}
> 
> 3a.
> The function comment should give more explanation of what it does. I
> think this function is the counterpart of the TwoPhaseTransactionGid()
> function of worker.c so the comment can say that too.

Comments were updated.

> 3b.
> Indeed, perhaps the function name should be similar to
> TwoPhaseTransactionGid. e.g. call it like
> IsTwoPhaseTransactionGidForSubid?

Replaced to IsTwoPhaseTransactionGidForSubid().

> 3c.
> Probably 'xid' should be TransactionId instead of Oid.

Right, fixed.

> 3d.
> Why not have a single return?
> 
> SUGGESTION
> return (ret == 2 && subid = subid_written);

Fixed.

> 3e.
> I am wondering if the existing TwoPhaseTransactionGid function
> currently in worker.c should be moved here because IMO these 2
> functions belong together and twophase.c seems the right place to put
> them.

+1, moved.

> ~~~
> 
> 4.
> +LookupGXactBySubid(Oid subid)
> +{
> + bool found = false;
> +
> + LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
> + for (int i = 0; i < TwoPhaseState->numPrepXacts; i++)
> + {
> + GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
> +
> + /* Ignore not-yet-valid GIDs. */
> + if (gxact->valid && checkGid(gxact->gid, subid))
> + {
> + found = true;
> + break;
> + }
> + }
> + LWLockRelease(TwoPhaseStateLock);
> + return found;
> +}
> 
> AFAIK the gxact also has the 'xid' available, so won't it be better to
> pass BOTH the 'xid' and 'subid' to the checkGid so you can do a full
> comparison instead of comparing only the subid part of the gid?

IIUC, the xid written in the gxact means the transaction id on the subscriber,
but formatted GID has xid on the publisher. So the value cannot be used.

> ======
> src/backend/commands/subscriptioncmds.c
> 
> 5. AlterSubscription
> 
> + /* XXX */
> + if (IsSet(opts.specified_opts, SUBOPT_TWOPHASE_COMMIT))
> + {
> 
> The "XXX" comment looks like it is meant to say something more...

This flag was used only for me, removed.

> ~~~
> 
> 6.
> + /*
> + * two_phase can be only changed for disabled
> + * subscriptions
> + */
> + if (form->subenabled)
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot set %s for enabled subscription",
> + "two_phase")));
> 
> 6a.
> Should this have a more comprehensive comment giving the reason like
> the 'failover' option has?

Modified, but it is almost the same as failover's one.

> 6b.
> Maybe this should include a "translator" comment to say don't
> translate the option name.

Hmm, but other parts in AlterSubscription() does not have.
For now, I kept current style.

> ~~~
> 
> 7.
> + /* Check whether the number of prepared transactions */
> + if (!opts.twophase &&
> + form->subtwophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED
> &&
> + LookupGXactBySubid(subid))
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot disable two_phase when uncommitted prepared
> transactions present")));
> +
> 
> 7a.
> The first comment seems to be an incomplete sentence. I think it
> should say something a bit like:
> two_phase cannot be disabled if there are any uncommitted prepared
> transactions present.

Modified, but this part would be replaced by upcoming patches.

> 7b.
> Also, if ereport occurs what is the user supposed to do about it?
> Shouldn't the ereport include some errhint with appropriate advice?

The hint was added, but this part would be replaced by upcoming patches.

> ~~~
> 
> 8.
> + /*
> + * The changed failover option of the slot can't be rolled
> + * back.
> + */
> + PreventInTransactionBlock(isTopLevel, "ALTER SUBSCRIPTION ... SET
> (two_phase)");
> +
> + /* Change system catalog acoordingly */
> + values[Anum_pg_subscription_subtwophasestate - 1] =
> + CharGetDatum(opts.twophase ?
> + LOGICALREP_TWOPHASE_STATE_PENDING :
> + LOGICALREP_TWOPHASE_STATE_DISABLED);
> + replaces[Anum_pg_subscription_subtwophasestate - 1] = true;
> + }
> 
> Typo I think: /failover option/two_phase option/

Right, fixed.

> ======
> .../libpqwalreceiver/libpqwalreceiver.c
> 
> 9.
>  static void
>  libpqrcv_alter_slot(WalReceiverConn *conn, const char *slotname,
> - bool failover)
> + bool two_phase, bool failover)
> 
> Same comment as mentioned elsewhere (#15), IMO the new 'two_phase'
> parameter should be last.

Fixed. Also, some ordering of declarations and if-blocks were also changed.
In later part, I did not reply similar comments but I addressed all of them.

> ======
> src/backend/replication/logical/launcher.c
> 
> 10.
> +/*
> + * Stop all the subscription workers.
> + */
> +void
> +logicalrep_workers_stop(Oid subid)
> +{
> + List    *subworkers;
> + ListCell   *lc;
> +
> + LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
> + subworkers = logicalrep_workers_find(subid, false);
> + LWLockRelease(LogicalRepWorkerLock);
> + foreach(lc, subworkers)
> + {
> + LogicalRepWorker *w = (LogicalRepWorker *) lfirst(lc);
> +
> + logicalrep_worker_stop(w->subid, w->relid);
> + }
> + list_free(subworkers);
> +}
> 
> I was confused by the logicalrep_workers_find(subid, false). IIUC the
> 'false' means everything (instead of 'only_running') but then I don't
> know why we want to "stop" anything that is NOT running. OTOH I see
> that this code was extracted from where it was previously inlined in
> subscriptioncmds.c, so maybe the 'false' is necessary for another
> reason? At least maybe some explanatory comment is needed for why you
> are passing this flag as false?

Sorry, let me give time for more investigation around here. For now,
I added "XXX" mark.
I think it is listed just in case, but there may be a timing issue.

> ======
> src/backend/replication/logical/worker.c
> 
> 11.
> - /* two-phase should not be altered */
> + /* two-phase should not be altered while the worker exists */
>   Assert(newsub->twophasestate == MySubscription->twophasestate);
> /should not/cannot/

Fixed.

> ~~~
> 
> 13.
> + if (MyReplicationSlot->data.two_phase != two_phase)
> + {
> + SpinLockAcquire(&MyReplicationSlot->mutex);
> + MyReplicationSlot->data.two_phase = two_phase;
> + SpinLockRelease(&MyReplicationSlot->mutex);
> +
> + update_slot = true;
> + }
> +
> +
>   if (MyReplicationSlot->data.failover != failover)
>   {
>   SpinLockAcquire(&MyReplicationSlot->mutex);
>   MyReplicationSlot->data.failover = failover;
>   SpinLockRelease(&MyReplicationSlot->mutex);
> 
> + update_slot = true;
> + }
> 
> 13a.
> Doesn't it make more sense for the whole check/set to be "atomic",
> i.e. put the mutex also around the check?
> 
> SUGGEST
> SpinLockAcquire(&MyReplicationSlot->mutex);
> if (MyReplicationSlot->data.two_phase != two_phase)
> {
>   MyReplicationSlot->data.two_phase = two_phase;
>   update_slot = true;
> }
> SpinLockRelease(&MyReplicationSlot->mutex);
> 
> ~
> 
> Also, (if you agree with the above) why not include both checks
> (two_phase and failover) within the same mutex instead of
> acquiring/releasing it twice:
> 
> SUGGEST
> SpinLockAcquire(&MyReplicationSlot->mutex);
> if (MyReplicationSlot->data.two_phase != two_phase)
> {
>   MyReplicationSlot->data.two_phase = two_phase;
>   update_slot = true;
> }
> if (MyReplicationSlot->data.failover != failover)
> {
>   MyReplicationSlot->data.failover = failover;
>   update_slot = true;
> }
> SpinLockAcquire(&MyReplicationSlot->mutex);

Hmm. According to comments atop ReplicationSlot, backends which own the slot do
not have to set mutex for reading attributes. Concurrent backends, which do not
acquire the slot, must set the mutex lock before the read. Based on the manner,
I want to keep current style.

```
* - Individual fields are protected by mutex where only the backend owning
 * the slot is authorized to update the fields from its own slot.  The
 * backend owning the slot does not need to take this lock when reading its
 * own fields, while concurrent backends not owning this slot should take the
 * lock when reading this slot's data.
 */
typedef struct ReplicationSlot
```

> 13b.
> There are double blank lines after the first if-block.

Removed.

> ======
> src/test/regress/sql/subscription.sql
> 
> 16.
> I know you do this already in the TAP test, but doesn't the test case
> to demonstrate that 'two-phase' option can be altered when the
> subscription is disabled actually belong here in the regression
> instead?

Actually it cannot be done at main regression test. Because altering two_phase
requires the connection between pub/sub, but it is not established in subscription.sql
file. Succeeded case for altering failover has not been tested neither, and
I think they have same reason.

> src/test/subscription/t/021_twophase.pl
> 
> 17.
> +# Disable the subscription and alter it to two_phase = false,
> +# verify that the altered subscription reflects the two_phase option.
> 
> /verify/then verify/

Fixed.

> 18.
> +# Now do a prepare on publisher and make sure that it is not replicated.
> +$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub");
> +$node_publisher->safe_psql(
> +       'postgres', qq{
> +    BEGIN;
> +    INSERT INTO tab_copy VALUES (100);
> +    PREPARE TRANSACTION 'newgid';
> + });
> +
> 
> 18a.
> /on publisher/on the publisher/

Fixed.

> 18b.
> What is that "DROP SUBSCRIPTION tap_sub" doing here? It seems
> misplaced under this comment.

The subscription must be dropped because it also prepares a transaction.
Moved before the test case and added comments.

> 19.
> +# Make sure that there is 0 prepared transaction on the subscriber
> +$result = $node_subscriber->safe_psql('postgres',
> + "SELECT count(*) FROM pg_prepared_xacts;");
> +is($result, qq(0), 'transaction is prepared on subscriber');
> 
> 19a.
> SUGGESTION
> Make sure there are no prepared transactions on the subscriber

Fixed.

> 19b.
> /'transaction is prepared on subscriber'/'should be no prepared
> transactions on subscriber'/

Replaced/

> 20.
> +# Made sure that the commited transaction is replicated.
> 
> /Made sure/Make sure/
> 
> /commited/committed/

Fixed.

> 21.
> +# Make sure that the two-phase is enabled on the subscriber
> +$result = $node_subscriber->safe_psql('postgres',
> + "SELECT subtwophasestate FROM pg_subscription WHERE subname =
> 'tap_sub_copy';"
> +);
> +is($result, qq(e), 'two-phase is disabled');
> 
> The 'two-phase is disabled' is the identical message used in the
> opposite case earlier, so something is amiss. Maybe this one should
> say 'two-phase should be enabled' and the earlier counterpart should
> say 'two-phase should be disabled'.

Both of them were fixed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



Attachment
Hi Kuroda-san,

Thanks for addressing most of my v6-0001 review comments.

Below are some minor follow-up comments for v7-0001.

======
src/backend/access/transam/twophase.c

1. IsTwoPhaseTransactionGidForSubid

+/*
+ * IsTwoPhaseTransactionGidForSubid
+ * Check whether the given GID is formed by TwoPhaseTransactionGid.
+ */
+static bool
+IsTwoPhaseTransactionGidForSubid(Oid subid, char *gid)

I think the function comment should mention something about 'subid'.

SUGGESTION
Check whether the given GID (as formed by TwoPhaseTransactionGid) is
for the specified 'subid'.

======
src/backend/commands/subscriptioncmds.c

2. AlterSubscription

+ if (!opts.twophase &&
+ form->subtwophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED &&
+ LookupGXactBySubid(subid))
+ /* Add error message */
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot disable two_phase when uncommitted prepared
transactions present"),
+ errhint("Resolve these transactions and try again")));

The comment "/* Add error message */" seems unnecessary.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Hi, Here are some review comments for v7-0002

======
Commit message

1.
IIUC there is quite a lot of subtlety and details about why the slot
option needs to be changed only when altering "true" to "false", but
not when altering "false" to "true".

It also should explain why PreventInTransactionBlock is only needed
when altering two_phase "true" to "false".

Please include a commit message to describe all those tricky details.

======
src/backend/commands/subscriptioncmds.c

2. AlterSubscription

- PreventInTransactionBlock(isTopLevel, "ALTER SUBSCRIPTION ... SET
(two_phase)");
+ if (!opts.twophase)
+ PreventInTransactionBlock(isTopLevel,
+   "ALTER SUBSCRIPTION ... SET (two_phase = off)");

IMO this needs a comment to explain why PreventInTransactionBlock is
only needed when changing the 'two_phase' option from on to off.

~~~

3. AlterSubscription

/*
* Try to acquire the connection necessary for altering slot.
*
* This has to be at the end because otherwise if there is an error while
* doing the database operations we won't be able to rollback altered
* slot.
*/
if (replaces[Anum_pg_subscription_subfailover - 1] ||
replaces[Anum_pg_subscription_subtwophasestate - 1])
{
bool must_use_password;
char    *err;
WalReceiverConn *wrconn;
bool failover_needs_to_be_updated;
bool two_phase_needs_to_be_updated;

/* Load the library providing us libpq calls. */
load_file("libpqwalreceiver", false);

/* Try to connect to the publisher. */
must_use_password = sub->passwordrequired && !sub->ownersuperuser;
wrconn = walrcv_connect(sub->conninfo, true, true, must_use_password,
sub->name, &err);
if (!wrconn)
ereport(ERROR,
(errcode(ERRCODE_CONNECTION_FAILURE),
errmsg("could not connect to the publisher: %s", err)));

/*
* Consider which slot option must be altered.
*
* We must alter the failover option whenever subfailover is updated.
* Two_phase, however, is altered only when changing true to false.
*/
failover_needs_to_be_updated =
replaces[Anum_pg_subscription_subfailover - 1];
two_phase_needs_to_be_updated =
(replaces[Anum_pg_subscription_subtwophasestate - 1] &&
!opts.twophase);

PG_TRY();
{
if (two_phase_needs_to_be_updated || failover_needs_to_be_updated)
walrcv_alter_slot(wrconn, sub->slotname,
  failover_needs_to_be_updated ? &opts.failover : NULL,
  two_phase_needs_to_be_updated ? &opts.twophase : NULL);
}
PG_FINALLY();
{
walrcv_disconnect(wrconn);
}
PG_END_TRY();
}

3a.
The block comment "Consider which slot option must be altered..." says
WHEN those options need to be updated, but it doesn't say WHY. e.g.
why only update the 'two_phase' when it is being disabled but not when
it is being enabled? In other words, I think there needs to be more
background/reason details given in this comment.

~~~

3b.
Can't those 2 new variable assignments be done up-front and guard this
entire "if-block" instead of the current replaces[] guarding it? Then
the code is somewhat simplified.

SUGGESTION:
/*
 * <improved comment here to explain these variables>
 */
update_failover = replaces[Anum_pg_subscription_subfailover - 1];
update_two_phase = (replaces[Anum_pg_subscription_subtwophasestate -
1] && !opts.twophase);

/*
 * Try to acquire the connection necessary for altering slot.
 *
 * This has to be at the end because otherwise if there is an error while
 * doing the database operations we won't be able to rollback altered
 * slot.
 */
if (update_failover || update_two_phase)
{
  ...

  /* Load the library providing us libpq calls. */
  load_file("libpqwalreceiver", false);

  /* Try to connect to the publisher. */
  must_use_password = sub->passwordrequired && !sub->ownersuperuser;
  wrconn = walrcv_connect(sub->conninfo, true, true,
must_use_password, sub->name, &err);
  if (!wrconn)
    ereport(ERROR, ...);

  PG_TRY();
  {
    walrcv_alter_slot(wrconn, sub->slotname,
      update_failover ? &opts.failover : NULL,
      update_two_phase ? &opts.twophase : NULL);
  }
  PG_FINALLY();
  {
    walrcv_disconnect(wrconn);
  }
  PG_END_TRY();
}

======
.../libpqwalreceiver/libpqwalreceiver.c

4.
+ appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s ( ",
+ quote_identifier(slotname));
+
+ if (failover)
+ appendStringInfo(&cmd, "FAILOVER %s ",
+ (*failover) ? "true" : "false");
+
+ if (two_phase)
+ appendStringInfo(&cmd, "TWO_PHASE %s%s ",
+ (*two_phase) ? "true" : "false",
+ failover ? ", " : "");
+
+ appendStringInfoString(&cmd, ");");

4a.
IIUC the comma logic here was broken in v7 when you swapped the order.
Anyway, IMO it will be better NOT to try combining that comma logic
with the existing appendStringInfo. Doing it separately is both easier
and less error-prone.

Furthermore, the parentheses like "(*two_phase)" instead of just
"*two_phase" seemed a bit overkill.

SUGGESTION:
+ if (failover)
+ appendStringInfo(&cmd, "FAILOVER %s",
+ *failover ? "true" : "false");
+
+   if (failover && two_phase)
+       appendStringInfo(&cmd, ", ");
+
+ if (two_phase)
+ appendStringInfo(&cmd, "TWO_PHASE %s",
+ *two_phase ? "true" : "false");
+
+ appendStringInfoString(&cmd, " );");

~~

4b.
Like I said above, IMO the current separator logic in v7 is broken. So
it is a bit concerning the tests all passed anyway. How did that
happen? I think this indicates that there needs to be an additional
test scenario where both 'failover' and 'two_phase' get altered at the
same time so this code gets exercised properly.

======
src/test/subscription/t/099_twophase_added.pl

5.
+# Define pre-existing tables on both nodes

Why say they are "pre-existing"? They are not pre-existing because you
are creating them right here!

~~~

6.
+######
+# Check the case that prepared transactions exist on publisher node
+######

I think this needs a slightly more detailed comment.

SUGGESTION (this is just an example, but you can surely improve it)

# Check the case that prepared transactions exist on the publisher node.
#
# Since two_phase is "off", then normally this PREPARE will do nothing until
# the COMMIT PREPARED, but in this test, we toggle the two_phase to "on" again
# before the COMMIT PREPARED happens.

~~~

7.
Maybe this test case needs a few more one-line comments for each of
the sub-steps. e.g.:

# prepare a transaction to insert some rows to the table

# verify the prepared tx is not yet replicated to the subscriber
(because 'two_phase = off')

# toggle the two_phase to 'on' *before* the COMMIT PREPARED

# verify the inserted rows got replicated ok

~~~

8.
IIUC this test will behave the same even if you DON'T do the toggle
'two_phase = on'. So I wonder is there something more you can do to
test this scenario more convincingly?

======
Kind Regards,
Peter Smith
Fujitsu Australia



Hi, Here are some review comments for the patch v7-0003.

======
Commit Message

1.
The patch needs a commit message to describe the purpose and highlight
any limitations and other details.

======
doc/src/sgml/ref/alter_subscription.sgml

2.
+
+     <para>
+      The <literal>two_phase</literal> parameter can only be altered when the
+      subscription is disabled. When altering the parameter from
<literal>true</literal>
+      to <literal>false</literal>, the backend process checks prepared
+      transactions done by the logical replication worker and aborts them.
+     </para>

Here, the para is referring to "true" and "false" but earlier on this
page it talks about "twophase = off". IMO it is better to use a
consistent terminology like "on|off" everywhere instead of randomly
changing the way it is described each time.

======
src/backend/commands/subscriptioncmds.c

3. AlterSubscription

  if (IsSet(opts.specified_opts, SUBOPT_TWOPHASE_COMMIT))
  {
+ List *prepared_xacts = NIL;

This 'prepared_xacts' can be declared at a lower scrope because it is
only used if (!opts.twophase).

Furthermore, IIUC you don't need to assign NIL in the declaration
because there is no chance for it to be unassigned anyway.

~~~

4. AlterSubscription

+ /*
+ * The changed two_phase option (true->false) of the
+ * slot can't be rolled back.
+ */
  PreventInTransactionBlock(isTopLevel,
    "ALTER SUBSCRIPTION ... SET (two_phase = off)");

Here is another example of inconsistent mixing of the terminology
where the comment says "true"/"false" but the message says "off".
Let's keep everything consistent. (I prefer on|off).

~~~

5.
+ if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED &&
+ (prepared_xacts = GetGidListBySubid(subid)) != NIL)
+ {
+ ListCell *cell;
+
+ /* Abort all listed transactions */
+ foreach(cell, prepared_xacts)
+ FinishPreparedTransaction((char *) lfirst(cell),
+   false);
+
+ list_free(prepared_xacts);
+ }

5A.
IIRC there is a cleaner way to write this loop without needing
ListCell variable -- e.g. foreach_ptr() macro?

~

5B.
Shouldn't this be using list_free_deep() so the pstrdup gid gets freed too?

======
src/test/subscription/t/099_twophase_added.pl

6.
+######
+# Check the case that prepared transactions exist on subscriber node
+######
+

Give some more detailed comments here similar to the review comment of
patch v7-0002 for the other part of this TAP test.

~~~

7. TAP test - comments

Same as for my v7-0002 review comments, I think this test case also
needs a few more one-line comments to describe the sub-steps. e.g.:

# prepare a transaction to insert some rows to the table

# verify the prepared tx is replicated to the subscriber (because
'two_phase = on')

# toggle the two_phase to 'off' *before* the COMMIT PREPARED

# verify the prepared tx got aborted

# do the COMMIT PREPARED (note that now two_phase is 'off')

# verify the inserted rows got replicated ok

~~~

8. TAP test - subscription name

It's better to rename the SUBSCRIPTION in this TAP test so you can
avoid getting log warnings like:

psql:<stdin>:4: WARNING:  subscriptions created by regression test
cases should have names starting with "regress_"
psql:<stdin>:4: NOTICE:  created replication slot "sub" on publisher

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Hi, Here are some review comments for patch v7-0004

======
Commit message

1.
A detailed commit message is needed to describe the purpose and
details of this patch.

======
doc/src/sgml/ref/alter_subscription.sgml

2. CREATE SUBSCRIPTION

Shouldn't there be an entry for "force_alter" parameter in the CREATE
SUBSCRIPTION "parameters" section, instead of just vaguely mentioning
it in passing when describing the "two_phase" in ALTER SUBSCRIPTION?

~

3. ALTER SUBSCRIPTION - alterable parameters

And shouldn't this new option also be named in the ALTER SUBSCRIPTION
list: "The parameters that can be altered are..."

======
src/backend/commands/subscriptioncmds.c

4.
  XLogRecPtr lsn;
+ bool twophase_force;
 } SubOpts;

IMO this field ought to be called 'force_alter' to be the same as the
option name. Sure, now it is only relevant for 'two_phase', but that
might not always be the case in the future.

~~~

5. AlterSubscription

+ /*
+ * Abort prepared transactions if force option is also
+ * specified. Otherwise raise an ERROR.
+ */
+ if (!opts.twophase_force)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot alter %s when there are prepared transactions",
+ "two_phase = false")));
+

5a.
/if force option is also specified/only if the 'force_alter' option is true/

~

5b.
"two_phase = false" -- IMO that should say "two_phase = off"

~

5c.
IMO this ereport should include a errhint to tell the user they can
use 'force_alter = true' to avoid getting this error.

~~~

6.

+ /* force_alter cannot be used standalone */
+ if (IsSet(opts.specified_opts, SUBOPT_FORCE_ALTER) &&
+ !IsSet(opts.specified_opts, SUBOPT_TWOPHASE_COMMIT))
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("%s must be specified with %s",
+ "force_alter", "two_phase")));
+

IMO this rule is not necessary so the code should be removed. I think
using 'force_alter' standalone doesn't do anything at all (certainly,
it does no harm) so why add more complications (more rules, more code,
more tests) just for the sake of it?

======
src/test/subscription/t/099_twophase_added.pl

7.
+$node_subscriber->safe_psql('postgres',
+    "ALTER SUBSCRIPTION sub SET (two_phase = off, force_alter = on);");

"force" is a verb, so it is better to say 'force_alter = true' instead
of 'force_alter = on'.

~~~

8.
 $result = $node_subscriber->safe_psql('postgres',
     "SELECT count(*) FROM pg_prepared_xacts;");
 is($result, q(0), "prepared transaction done by worker is aborted");

+$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE;");
+

I felt the ENABLE statement should be above the SELECT statement so
that the code is more like it was before applying the patch.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

Thanks for reviewing! The patch will be posted in the upcoming post.

> ======
> src/backend/access/transam/twophase.c
> 
> 1. IsTwoPhaseTransactionGidForSubid
> 
> +/*
> + * IsTwoPhaseTransactionGidForSubid
> + * Check whether the given GID is formed by TwoPhaseTransactionGid.
> + */
> +static bool
> +IsTwoPhaseTransactionGidForSubid(Oid subid, char *gid)
> 
> I think the function comment should mention something about 'subid'.
> 
> SUGGESTION
> Check whether the given GID (as formed by TwoPhaseTransactionGid) is
> for the specified 'subid'.

Fixed.

> src/backend/commands/subscriptioncmds.c
> 
> 2. AlterSubscription
> 
> + if (!opts.twophase &&
> + form->subtwophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED
> &&
> + LookupGXactBySubid(subid))
> + /* Add error message */
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot disable two_phase when uncommitted prepared
> transactions present"),
> + errhint("Resolve these transactions and try again")));
> 
> The comment "/* Add error message */" seems unnecessary.

Yeah, this was an internal flag. Removed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

> ======
> Commit message
> 
> 1.
> IIUC there is quite a lot of subtlety and details about why the slot
> option needs to be changed only when altering "true" to "false", but
> not when altering "false" to "true".
> 
> It also should explain why PreventInTransactionBlock is only needed
> when altering two_phase "true" to "false".
> 
> Please include a commit message to describe all those tricky details.

Added.

> ======
> src/backend/commands/subscriptioncmds.c
> 
> 2. AlterSubscription
> 
> - PreventInTransactionBlock(isTopLevel, "ALTER SUBSCRIPTION ... SET
> (two_phase)");
> + if (!opts.twophase)
> + PreventInTransactionBlock(isTopLevel,
> +   "ALTER SUBSCRIPTION ... SET (two_phase = off)");
> 
> IMO this needs a comment to explain why PreventInTransactionBlock is
> only needed when changing the 'two_phase' option from on to off.

Added. Thoutht?

> 3. AlterSubscription
> 
> /*
> * Try to acquire the connection necessary for altering slot.
> *
> * This has to be at the end because otherwise if there is an error while
> * doing the database operations we won't be able to rollback altered
> * slot.
> */
> if (replaces[Anum_pg_subscription_subfailover - 1] ||
> replaces[Anum_pg_subscription_subtwophasestate - 1])
> {
> bool must_use_password;
> char    *err;
> WalReceiverConn *wrconn;
> bool failover_needs_to_be_updated;
> bool two_phase_needs_to_be_updated;
> 
> /* Load the library providing us libpq calls. */
> load_file("libpqwalreceiver", false);
> 
> /* Try to connect to the publisher. */
> must_use_password = sub->passwordrequired && !sub->ownersuperuser;
> wrconn = walrcv_connect(sub->conninfo, true, true, must_use_password,
> sub->name, &err);
> if (!wrconn)
> ereport(ERROR,
> (errcode(ERRCODE_CONNECTION_FAILURE),
> errmsg("could not connect to the publisher: %s", err)));
> 
> /*
> * Consider which slot option must be altered.
> *
> * We must alter the failover option whenever subfailover is updated.
> * Two_phase, however, is altered only when changing true to false.
> */
> failover_needs_to_be_updated =
> replaces[Anum_pg_subscription_subfailover - 1];
> two_phase_needs_to_be_updated =
> (replaces[Anum_pg_subscription_subtwophasestate - 1] &&
> !opts.twophase);
> 
> PG_TRY();
> {
> if (two_phase_needs_to_be_updated || failover_needs_to_be_updated)
> walrcv_alter_slot(wrconn, sub->slotname,
>   failover_needs_to_be_updated ? &opts.failover : NULL,
>   two_phase_needs_to_be_updated ? &opts.twophase : NULL);
> }
> PG_FINALLY();
> {
> walrcv_disconnect(wrconn);
> }
> PG_END_TRY();
> }
> 
> 3a.
> The block comment "Consider which slot option must be altered..." says
> WHEN those options need to be updated, but it doesn't say WHY. e.g.
> why only update the 'two_phase' when it is being disabled but not when
> it is being enabled? In other words, I think there needs to be more
> background/reason details given in this comment.
> 
> ~~~
> 
> 3b.
> Can't those 2 new variable assignments be done up-front and guard this
> entire "if-block" instead of the current replaces[] guarding it? Then
> the code is somewhat simplified.
> 
> SUGGESTION:
> /*
>  * <improved comment here to explain these variables>
>  */
> update_failover = replaces[Anum_pg_subscription_subfailover - 1];
> update_two_phase = (replaces[Anum_pg_subscription_subtwophasestate -
> 1] && !opts.twophase);
> 
> /*
>  * Try to acquire the connection necessary for altering slot.
>  *
>  * This has to be at the end because otherwise if there is an error while
>  * doing the database operations we won't be able to rollback altered
>  * slot.
>  */
> if (update_failover || update_two_phase)
> {
>   ...
> 
>   /* Load the library providing us libpq calls. */
>   load_file("libpqwalreceiver", false);
> 
>   /* Try to connect to the publisher. */
>   must_use_password = sub->passwordrequired && !sub->ownersuperuser;
>   wrconn = walrcv_connect(sub->conninfo, true, true,
> must_use_password, sub->name, &err);
>   if (!wrconn)
>     ereport(ERROR, ...);
> 
>   PG_TRY();
>   {
>     walrcv_alter_slot(wrconn, sub->slotname,
>       update_failover ? &opts.failover : NULL,
>       update_two_phase ? &opts.twophase : NULL);
>   }
>   PG_FINALLY();
>   {
>     walrcv_disconnect(wrconn);
>   }
>   PG_END_TRY();
> }

Two variables were added and comments were updated.

> .../libpqwalreceiver/libpqwalreceiver.c
> 
> 4.
> + appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s ( ",
> + quote_identifier(slotname));
> +
> + if (failover)
> + appendStringInfo(&cmd, "FAILOVER %s ",
> + (*failover) ? "true" : "false");
> +
> + if (two_phase)
> + appendStringInfo(&cmd, "TWO_PHASE %s%s ",
> + (*two_phase) ? "true" : "false",
> + failover ? ", " : "");
> +
> + appendStringInfoString(&cmd, ");");
> 
> 4a.
> IIUC the comma logic here was broken in v7 when you swapped the order.
> Anyway, IMO it will be better NOT to try combining that comma logic
> with the existing appendStringInfo. Doing it separately is both easier
> and less error-prone.
> 
> Furthermore, the parentheses like "(*two_phase)" instead of just
> "*two_phase" seemed a bit overkill.
> 
> SUGGESTION:
> + if (failover)
> + appendStringInfo(&cmd, "FAILOVER %s",
> + *failover ? "true" : "false");
> +
> +   if (failover && two_phase)
> +       appendStringInfo(&cmd, ", ");
> +
> + if (two_phase)
> + appendStringInfo(&cmd, "TWO_PHASE %s",
> + *two_phase ? "true" : "false");
> +
> + appendStringInfoString(&cmd, " );");

Fixed.

> 4b.
> Like I said above, IMO the current separator logic in v7 is broken. So
> it is a bit concerning the tests all passed anyway. How did that
> happen? I think this indicates that there needs to be an additional
> test scenario where both 'failover' and 'two_phase' get altered at the
> same time so this code gets exercised properly.

Right, it was added.

> ======
> src/test/subscription/t/099_twophase_added.pl
> 
> 5.
> +# Define pre-existing tables on both nodes
> 
> Why say they are "pre-existing"? They are not pre-existing because you
> are creating them right here!

Removed the word.

> 6.
> +######
> +# Check the case that prepared transactions exist on publisher node
> +######
> 
> I think this needs a slightly more detailed comment.
> 
> SUGGESTION (this is just an example, but you can surely improve it)
> 
> # Check the case that prepared transactions exist on the publisher node.
> #
> # Since two_phase is "off", then normally this PREPARE will do nothing until
> # the COMMIT PREPARED, but in this test, we toggle the two_phase to "on" again
> # before the COMMIT PREPARED happens.

Changed with adjustments.

> 7.
> Maybe this test case needs a few more one-line comments for each of
> the sub-steps. e.g.:
> 
> # prepare a transaction to insert some rows to the table
> 
> # verify the prepared tx is not yet replicated to the subscriber
> (because 'two_phase = off')
> 
> # toggle the two_phase to 'on' *before* the COMMIT PREPARED
> 
> # verify the inserted rows got replicated ok

Modified like yours, but changed based on the suggestion by Grammarly.

> 8.
> IIUC this test will behave the same even if you DON'T do the toggle
> 'two_phase = on'. So I wonder is there something more you can do to
> test this scenario more convincingly?

I found an indicator. When the apply starts, it outputs the current status of
two_phase option. I added wait_for_log() to ensure below appeared. Thought?

```
    ereport(DEBUG1,
            (errmsg_internal("logical replication apply worker for subscription \"%s\" two_phase is %s",
                             MySubscription->name,
                             MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_DISABLED ? "DISABLED" :
                             MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_PENDING ? "PENDING" :
                             MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED ? "ENABLED" :
                             "?")));
```
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

> Commit Message
> 
> 1.
> The patch needs a commit message to describe the purpose and highlight
> any limitations and other details.

Added.

> ======
> doc/src/sgml/ref/alter_subscription.sgml
> 
> 2.
> +
> +     <para>
> +      The <literal>two_phase</literal> parameter can only be altered when
> the
> +      subscription is disabled. When altering the parameter from
> <literal>true</literal>
> +      to <literal>false</literal>, the backend process checks prepared
> +      transactions done by the logical replication worker and aborts them.
> +     </para>
> 
> Here, the para is referring to "true" and "false" but earlier on this
> page it talks about "twophase = off". IMO it is better to use a
> consistent terminology like "on|off" everywhere instead of randomly
> changing the way it is described each time.

I checked contents and changed to "on|off".

> ======
> src/backend/commands/subscriptioncmds.c
> 
> 3. AlterSubscription
> 
>   if (IsSet(opts.specified_opts, SUBOPT_TWOPHASE_COMMIT))
>   {
> + List *prepared_xacts = NIL;
> 
> This 'prepared_xacts' can be declared at a lower scrope because it is
> only used if (!opts.twophase).
> 
> Furthermore, IIUC you don't need to assign NIL in the declaration
> because there is no chance for it to be unassigned anyway.

Made the namespace narrower and initialization was removed.

> ~~~
> 
> 4. AlterSubscription
> 
> + /*
> + * The changed two_phase option (true->false) of the
> + * slot can't be rolled back.
> + */
>   PreventInTransactionBlock(isTopLevel,
>     "ALTER SUBSCRIPTION ... SET (two_phase = off)");
> 
> Here is another example of inconsistent mixing of the terminology
> where the comment says "true"/"false" but the message says "off".
> Let's keep everything consistent. (I prefer on|off).

Modified.

> ~~~
> 
> 5.
> + if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED &&
> + (prepared_xacts = GetGidListBySubid(subid)) != NIL)
> + {
> + ListCell *cell;
> +
> + /* Abort all listed transactions */
> + foreach(cell, prepared_xacts)
> + FinishPreparedTransaction((char *) lfirst(cell),
> +   false);
> +
> + list_free(prepared_xacts);
> + }
> 
> 5A.
> IIRC there is a cleaner way to write this loop without needing
> ListCell variable -- e.g. foreach_ptr() macro?

Changed.

> 5B.
> Shouldn't this be using list_free_deep() so the pstrdup gid gets freed too?

Yeah, fixed.

> ======
> src/test/subscription/t/099_twophase_added.pl
> 
> 6.
> +######
> +# Check the case that prepared transactions exist on subscriber node
> +######
> +
> 
> Give some more detailed comments here similar to the review comment of
> patch v7-0002 for the other part of this TAP test.
> 
> ~~~
> 
> 7. TAP test - comments
> 
> Same as for my v7-0002 review comments, I think this test case also
> needs a few more one-line comments to describe the sub-steps. e.g.:
> 
> # prepare a transaction to insert some rows to the table
> 
> # verify the prepared tx is replicated to the subscriber (because
> 'two_phase = on')
> 
> # toggle the two_phase to 'off' *before* the COMMIT PREPARED
> 
> # verify the prepared tx got aborted
> 
> # do the COMMIT PREPARED (note that now two_phase is 'off')
> 
> # verify the inserted rows got replicated ok

They were fixed based on your previous comments.

> 
> 8. TAP test - subscription name
> 
> It's better to rename the SUBSCRIPTION in this TAP test so you can
> avoid getting log warnings like:
> 
> psql:<stdin>:4: WARNING:  subscriptions created by regression test
> cases should have names starting with "regress_"
> psql:<stdin>:4: NOTICE:  created replication slot "sub" on publisher

Modified, but it was included in 0001.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

> 
> ======
> Commit message
> 
> 1.
> A detailed commit message is needed to describe the purpose and
> details of this patch.

Added.

> ======
> doc/src/sgml/ref/alter_subscription.sgml
> 
> 2. CREATE SUBSCRIPTION
> 
> Shouldn't there be an entry for "force_alter" parameter in the CREATE
> SUBSCRIPTION "parameters" section, instead of just vaguely mentioning
> it in passing when describing the "two_phase" in ALTER SUBSCRIPTION?
>
> 3. ALTER SUBSCRIPTION - alterable parameters
> 
> And shouldn't this new option also be named in the ALTER SUBSCRIPTION
> list: "The parameters that can be altered are..."

Hmm, but the parameter cannot be used for CREATE SUBSCRIPTION. Should we
modify to accept and add the description in the doc? This was not accepted.

> ======
> src/backend/commands/subscriptioncmds.c
> 
> 4.
>   XLogRecPtr lsn;
> + bool twophase_force;
>  } SubOpts;
> 
> IMO this field ought to be called 'force_alter' to be the same as the
> option name. Sure, now it is only relevant for 'two_phase', but that
> might not always be the case in the future.

Modified.

> 5. AlterSubscription
> 
> + /*
> + * Abort prepared transactions if force option is also
> + * specified. Otherwise raise an ERROR.
> + */
> + if (!opts.twophase_force)
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot alter %s when there are prepared transactions",
> + "two_phase = false")));
> +
> 
> 5a.
> /if force option is also specified/only if the 'force_alter' option is true/

Modified.

> 
> 5b.
> "two_phase = false" -- IMO that should say "two_phase = off"

Modified.

> 5c.
> IMO this ereport should include a errhint to tell the user they can
> use 'force_alter = true' to avoid getting this error.

Hint was added.

> 6.
> 
> + /* force_alter cannot be used standalone */
> + if (IsSet(opts.specified_opts, SUBOPT_FORCE_ALTER) &&
> + !IsSet(opts.specified_opts, SUBOPT_TWOPHASE_COMMIT))
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("%s must be specified with %s",
> + "force_alter", "two_phase")));
> +
> 
> IMO this rule is not necessary so the code should be removed. I think
> using 'force_alter' standalone doesn't do anything at all (certainly,
> it does no harm) so why add more complications (more rules, more code,
> more tests) just for the sake of it?

Removed. So standalone 'force_alter' is now no-op.

> src/test/subscription/t/099_twophase_added.pl
> 
> 7.
> +$node_subscriber->safe_psql('postgres',
> +    "ALTER SUBSCRIPTION sub SET (two_phase = off, force_alter = on);");
> 
> "force" is a verb, so it is better to say 'force_alter = true' instead
> of 'force_alter = on'.

Fixed. Actually not sure it is better because I'm not a native.

> 8.
>  $result = $node_subscriber->safe_psql('postgres',
>      "SELECT count(*) FROM pg_prepared_xacts;");
>  is($result, q(0), "prepared transaction done by worker is aborted");
> 
> +$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub
> ENABLE;");
> +
> 
> I felt the ENABLE statement should be above the SELECT statement so
> that the code is more like it was before applying the patch.

Fixed.

Please see attached patch set.


Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Attachment
Hi, Here are some review comments for v8-0002.

======
Commit message

1.
Regarding the off->on case, the logical replication already has a mechanism for
it, so there is no need to do anything special for the on->off case; however,
we must connect to the publisher and expressly change the parameter. The
operation cannot be rolled back, and altering the parameter from "on" to "off"
within a transaction is prohibited.

~

I think the difference between "off"-to"on" and "on"-to"off" needs to
be explained in more detail. Specifically "already has a mechanism for
it" seems very vague.

======
src/backend/commands/subscriptioncmds.c

2.
  /*
- * The changed two_phase option of the slot can't be rolled
- * back.
+ * Since the altering two_phase option of subscriptions
+ * also leads to the change of slot option, this command
+ * cannot be rolled back. So prevent we are in the
+ * transaction block.
  */
- PreventInTransactionBlock(isTopLevel, "ALTER SUBSCRIPTION ... SET
(two_phase)");
+ if (!opts.twophase)
+ PreventInTransactionBlock(isTopLevel,
+

2a.
There is a typo: "So prevent we are".

SUGGESTION (minor adjustment and typo fix)
Since altering the two_phase option of subscriptions also leads to
changing the slot option, this command cannot be rolled back. So
prevent this if we are in a transaction block.

~

2b.
But, in my previous review [v7-0002#3] I asked if the comment could
explain why this check is only needed for two_phase "on"-to-"off" but
not for "off"-to-"on". That explanation/reason is still not yet given
in the latest comment.

~~~

3.
  /*
- * Try to acquire the connection necessary for altering slot.
+ * Check the need to alter the replication slot. Failover and two_phase
+ * options are controlled by both the publisher (as a slot option) and the
+ * subscriber (as a subscription option).
+ */
+ update_failover = replaces[Anum_pg_subscription_subfailover - 1];
+ update_two_phase = (replaces[Anum_pg_subscription_subtwophasestate - 1] &&
+ !opts.twophase);


(This is similar to the previous comment). In my previous review
[v7-0002#3a] I asked why update_two_phase is TRUE only if 'two-phase'
is being updated "on"-to-"off", but not when it is being updated
"off"-to-"on". That explanation/reason is still not yet given in the
latest comment.

======
src/backend/replication/libpqwalreceiver/libpqwalreceiver.c

4.
- appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s ( FAILOVER %s,
TWO_PHASE %s )",
- quote_identifier(slotname),
- failover ? "true" : "false",
- two_phase ? "true" : "false");
+ appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s ( ",
+ quote_identifier(slotname));
+
+ if (failover)
+ appendStringInfo(&cmd, "FAILOVER %s",
+ *failover ? "true" : "false");
+
+ if (failover && two_phase)
+ appendStringInfo(&cmd, ", ");
+
+ if (two_phase)
+ appendStringInfo(&cmd, "TWO_PHASE %s",
+ *two_phase ? "true" : "false");
+
+ appendStringInfoString(&cmd, ");");

It will be better if that last line includes the extra space like I
had suggested in [v7-0002#4a] so the result will have the same spacing
as in the original code. e.g.

+ appendStringInfoString(&cmd, " );");

======
src/test/subscription/t/099_twophase_added.pl

5.
+# Check the case that prepared transactions exist on the publisher node.
+#
+# Since the two_phase is "off", then normally, this PREPARE will do nothing
+# until the COMMIT PREPARED, but in this test, we toggle the two_phase to "on"
+# again before the COMMIT PREPARED happens.

This is a major test part so IMO this comment should have
##################### like it had before, to distinguish it from all
the sub-step comments.

======

My v7-0002 review -
https://www.postgresql.org/message-id/CAHut%2BPtu_w_UCGR-5DbenA%2By7wRiA8QPi_ZP%3DCCJ3SGdTn-%3D%3Dg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia.



Hi, Here are some review comments for v8-0003.

======
src/sgml/ref/alter_subscription.sgml

1.
+     <para>
+      The <literal>two_phase</literal> parameter can only be altered when the
+      subscription is disabled. When altering the parameter from
<literal>on</literal>
+      to <literal>off</literal>, the backend process checks prepared
+      transactions done by the logical replication worker and aborts them.
+     </para>

The text may be OK as-is, but I was wondering if it might be better to
give a more verbose explanation.

BEFORE
... the backend process checks prepared transactions done by the
logical replication worker and aborts them.

SUGGESTION
... the backend process checks for any incomplete prepared
transactions done by the logical replication worker (from when
two_phase parameter was still "on") and, if any are found, those are
aborted.

======
src/backend/commands/subscriptioncmds.c

2. AlterSubscription

- /*
- * Since the altering two_phase option of subscriptions
- * also leads to the change of slot option, this command
- * cannot be rolled back. So prevent we are in the
- * transaction block.
+ * If two_phase was enabled, there is a possibility the
+ * transactions has already been PREPARE'd. They must be
+ * checked and rolled back.
  */

BEFORE
... there is a possibility the transactions has already been PREPARE'd.

SUGGESTION
... there is a possibility that transactions have already been PREPARE'd.

~~~

3. AlterSubscription
+ /*
+ * Since the altering two_phase option of subscriptions
+ * (especially on->off case) also leads to the
+ * change of slot option, this command cannot be rolled
+ * back. So prevent we are in the transaction block.
+ */
  PreventInTransactionBlock(isTopLevel,
    "ALTER SUBSCRIPTION ... SET (two_phase = off)");


This comment is a bit vague and includes some typos, but IIUC these
problems will already be addressed by the 0002 patch changes.AFAIK
patch 0003 is only moving the 0002 comment.

======
src/test/subscription/t/099_twophase_added.pl

4.
+# Check the case that prepared transactions exist on the subscriber node
+#
+# If the two_phase is altering from "on" to "off" and there are prepared
+# transactions on the subscriber, they must be aborted. This test checks it.
+

Similar to the comment that I gave for v8-0002. I think there should
be #################### comment for the major test comment to
distinguish it from comments for the sub-steps.

~~~

5.
+# Verify the prepared transaction are aborted because two_phase is changed to
+# "off".
+$result = $node_subscriber->safe_psql('postgres',
+    "SELECT count(*) FROM pg_prepared_xacts;");
+is($result, q(0), "prepared transaction done by worker is aborted");
+

/the prepared transaction are aborted/any prepared transactions are aborted/

======
Kind Regards,
Peter Smith
Fujitsu Australia



Hi, Here are some comments for v8-0004

======
0.1 General - Patch name

/SUBSCIRPTION/SUBSCRIPTION/

======
0.2 General - Apply

FYI, there are whitespace warnings:

git apply ../patches_misc/v8-0004-Add-force_alter-option-for-ALTER-SUBSCIRPTION-.-S.patch
../patches_misc/v8-0004-Add-force_alter-option-for-ALTER-SUBSCIRPTION-.-S.patch:191:
trailing whitespace.
# command will abort the prepared transaction and succeed.
warning: 1 line adds whitespace errors.

======
0.3 General - Regression test fails

The subscription regression tests are not working.

ok 158       + publication                              1187 ms
not ok 159   + subscription                              123 ms

See review comments #4 and #5 below for the reason why.

======
src/sgml/ref/alter_subscription.sgml

1.
      <para>
       The <literal>two_phase</literal> parameter can only be altered when the
-      subscription is disabled. When altering the parameter from
<literal>on</literal>
-      to <literal>off</literal>, the backend process checks prepared
-      transactions done by the logical replication worker and aborts them.
+      subscription is disabled. Altering the parameter from
<literal>on</literal>
+      to <literal>off</literal> will be failed when there are prepared
+      transactions done by the logical replication worker. If you want to alter
+      the parameter forcibly in this case, <literal>force_alter</literal>
+      option must be set to <literal>on</literal> at the same time. If
+      specified, the backend process aborts prepared transactions.
      </para>
1a.
That "will be failed when..." seems strange. Maybe say "will give an
error when..."

~
1b.
Because "force" is a verb, I think true/false is more natural than
on/off for this new boolean option. e.g. it acts more like a "flag"
than a "mode". See all the other boolean options in CREATE
SUBSCRIPTION -- those are mostly all verbs too and are all true/false
AFAIK.

======

2. CREATE SUBSCRIPTION

For my previous review, two comments [v7-0004#2] and [v7-0004#3] were
not addressed. Kuroda-san wrote:
Hmm, but the parameter cannot be used for CREATE SUBSCRIPTION. Should
we modify to accept and add the description in the doc?

~

Yes, that is what I am suggesting. IMO it is odd for the user to be
able to ALTER a parameter that cannot be included in the CREATE
SUBSCRIPTION in the first place. AFAIK there are no other parameters
that behave that way.

======
src/backend/commands/subscriptioncmds.c

3. AlterSubscription

+ if (!opts.force_alter)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot alter %s when there are prepared transactions",
+ "two_phase = off"),
+ errhint("Resolve these transactions or set %s at the same time, and
then try again.",
+ "force_alter = true")));

I think saying "at the same time" in the hint is unnecessary. Surely
the user is allowed to set this parameter separately if they want to?

e.g.
ALTER SUBSCRIPTION sub SET (force_alter=true);
ALTER SUBSCRIPTION sub SET (two_phase=off);

======
src/test/regress/expected/subscription.out

4.
+-- fail - force_alter cannot be set alone
+ALTER SUBSCRIPTION regress_testsub SET (force_alter = true);
+ERROR:  force_alter must be specified with two_phase

This error cannot happen. You removed that error!

======
src/test/regress/sql/subscription.sql

5.
+-- fail - force_alter cannot be set alone
+ALTER SUBSCRIPTION regress_testsub SET (force_alter = true);

Why is this being tested? You removed that error condition.

======
src/test/subscription/t/099_twophase_added.pl

6.
+# Try altering the two_phase option to "off." The command will fail since there
+# is a prepared transaction and the force option is not specified.
+my $stdout;
+my $stderr;
+
+($result, $stdout, $stderr) = $node_subscriber->psql(
+ 'postgres', "ALTER SUBSCRIPTION regress_sub SET (two_phase = off);");
+ok($stderr =~ /cannot alter two_phase = off when there are prepared
transactions/,
+ 'ALTER SUBSCRIPTION failed');

/force option is not specified./'force_alter' option is not specified as true./

~~~

7.
+# Verify the prepared transaction still exists
+$result = $node_subscriber->safe_psql('postgres',
+    "SELECT count(*) FROM pg_prepared_xacts;");
+is($result, q(1), "prepared transaction still exits");
+

TYPO: /exits/exists/

~~~

8.
+# Alter the two_phase with the force_alter option. Apart from the above, the
+# command will abort the prepared transaction and succeed.
+$node_subscriber->safe_psql('postgres',
+    "ALTER SUBSCRIPTION regress_sub SET (two_phase = off, force_alter
= true);");
+$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION
regress_sub ENABLE;");
+

What does "Apart from the above" mean? Be more explicit.

~~~

9.
+# Verify the prepared transaction are aborted
 $result = $node_subscriber->safe_psql('postgres',
     "SELECT count(*) FROM pg_prepared_xacts;");
 is($result, q(0), "prepared transaction done by worker is aborted");

/transaction are aborted/transaction was aborted/

======
Response to my v7-0004 review --
https://www.postgresql.org/message-id/OSBPR01MB2552F738ACF1DA6838025C4FF5E62%40OSBPR01MB2552.jpnprd01.prod.outlook.com

Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

Thanks for giving comments! I attached updated version.

> 1.
> Regarding the off->on case, the logical replication already has a mechanism for
> it, so there is no need to do anything special for the on->off case; however,
> we must connect to the publisher and expressly change the parameter. The
> operation cannot be rolled back, and altering the parameter from "on" to "off"
> within a transaction is prohibited.
> 
> ~
> 
> I think the difference between "off"-to"on" and "on"-to"off" needs to
> be explained in more detail. Specifically "already has a mechanism for
> it" seems very vague.

New paragraph was added.

> 
> ======
> src/backend/commands/subscriptioncmds.c
> 
> 2.
>   /*
> - * The changed two_phase option of the slot can't be rolled
> - * back.
> + * Since the altering two_phase option of subscriptions
> + * also leads to the change of slot option, this command
> + * cannot be rolled back. So prevent we are in the
> + * transaction block.
>   */
> - PreventInTransactionBlock(isTopLevel, "ALTER SUBSCRIPTION ... SET
> (two_phase)");
> + if (!opts.twophase)
> + PreventInTransactionBlock(isTopLevel,
> +
> 
> 2a.
> There is a typo: "So prevent we are".
> 
> SUGGESTION (minor adjustment and typo fix)
> Since altering the two_phase option of subscriptions also leads to
> changing the slot option, this command cannot be rolled back. So
> prevent this if we are in a transaction block.

Fixed.

> 2b.
> But, in my previous review [v7-0002#3] I asked if the comment could
> explain why this check is only needed for two_phase "on"-to-"off" but
> not for "off"-to-"on". That explanation/reason is still not yet given
> in the latest comment.

Added.

> 3.
>   /*
> - * Try to acquire the connection necessary for altering slot.
> + * Check the need to alter the replication slot. Failover and two_phase
> + * options are controlled by both the publisher (as a slot option) and the
> + * subscriber (as a subscription option).
> + */
> + update_failover = replaces[Anum_pg_subscription_subfailover - 1];
> + update_two_phase = (replaces[Anum_pg_subscription_subtwophasestate - 1]
> &&
> + !opts.twophase);
> 
> 
> (This is similar to the previous comment). In my previous review
> [v7-0002#3a] I asked why update_two_phase is TRUE only if 'two-phase'
> is being updated "on"-to-"off", but not when it is being updated
> "off"-to-"on". That explanation/reason is still not yet given in the
> latest comment.

Added.

> 
> ======
> src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
> 
> 4.
> - appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s ( FAILOVER %s,
> TWO_PHASE %s )",
> - quote_identifier(slotname),
> - failover ? "true" : "false",
> - two_phase ? "true" : "false");
> + appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s ( ",
> + quote_identifier(slotname));
> +
> + if (failover)
> + appendStringInfo(&cmd, "FAILOVER %s",
> + *failover ? "true" : "false");
> +
> + if (failover && two_phase)
> + appendStringInfo(&cmd, ", ");
> +
> + if (two_phase)
> + appendStringInfo(&cmd, "TWO_PHASE %s",
> + *two_phase ? "true" : "false");
> +
> + appendStringInfoString(&cmd, ");");
> 
> It will be better if that last line includes the extra space like I
> had suggested in [v7-0002#4a] so the result will have the same spacing
> as in the original code. e.g.
> 
> + appendStringInfoString(&cmd, " );");

I missed the blank, added.

> 
> ======
> src/test/subscription/t/099_twophase_added.pl
> 
> 5.
> +# Check the case that prepared transactions exist on the publisher node.
> +#
> +# Since the two_phase is "off", then normally, this PREPARE will do nothing
> +# until the COMMIT PREPARED, but in this test, we toggle the two_phase to "on"
> +# again before the COMMIT PREPARED happens.
> 
> This is a major test part so IMO this comment should have
> ##################### like it had before, to distinguish it from all
> the sub-step comments.

Added.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Attachment

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

Thanks for reviewing! Patch can be available in [1].

> ======
> src/sgml/ref/alter_subscription.sgml
> 
> 1.
> +     <para>
> +      The <literal>two_phase</literal> parameter can only be altered when
> the
> +      subscription is disabled. When altering the parameter from
> <literal>on</literal>
> +      to <literal>off</literal>, the backend process checks prepared
> +      transactions done by the logical replication worker and aborts them.
> +     </para>
> 
> The text may be OK as-is, but I was wondering if it might be better to
> give a more verbose explanation.
> 
> BEFORE
> ... the backend process checks prepared transactions done by the
> logical replication worker and aborts them.
> 
> SUGGESTION
> ... the backend process checks for any incomplete prepared
> transactions done by the logical replication worker (from when
> two_phase parameter was still "on") and, if any are found, those are
> aborted.
>

Fixed.

> ======
> src/backend/commands/subscriptioncmds.c
> 
> 2. AlterSubscription
> 
> - /*
> - * Since the altering two_phase option of subscriptions
> - * also leads to the change of slot option, this command
> - * cannot be rolled back. So prevent we are in the
> - * transaction block.
> + * If two_phase was enabled, there is a possibility the
> + * transactions has already been PREPARE'd. They must be
> + * checked and rolled back.
>   */
> 
> BEFORE
> ... there is a possibility the transactions has already been PREPARE'd.
> 
> SUGGESTION
> ... there is a possibility that transactions have already been PREPARE'd.

Fixed.

> 3. AlterSubscription
> + /*
> + * Since the altering two_phase option of subscriptions
> + * (especially on->off case) also leads to the
> + * change of slot option, this command cannot be rolled
> + * back. So prevent we are in the transaction block.
> + */
>   PreventInTransactionBlock(isTopLevel,
>     "ALTER SUBSCRIPTION ... SET (two_phase = off)");
> 
> 
> This comment is a bit vague and includes some typos, but IIUC these
> problems will already be addressed by the 0002 patch changes.AFAIK
> patch 0003 is only moving the 0002 comment.

Yeah, the comment was updated accordingly.

> 
> ======
> src/test/subscription/t/099_twophase_added.pl
> 
> 4.
> +# Check the case that prepared transactions exist on the subscriber node
> +#
> +# If the two_phase is altering from "on" to "off" and there are prepared
> +# transactions on the subscriber, they must be aborted. This test checks it.
> +
> 
> Similar to the comment that I gave for v8-0002. I think there should
> be #################### comment for the major test comment to
> distinguish it from comments for the sub-steps.

Added.

> 5.
> +# Verify the prepared transaction are aborted because two_phase is changed to
> +# "off".
> +$result = $node_subscriber->safe_psql('postgres',
> +    "SELECT count(*) FROM pg_prepared_xacts;");
> +is($result, q(0), "prepared transaction done by worker is aborted");
> +
> 
> /the prepared transaction are aborted/any prepared transactions are aborted/

Fixed.

[1]:
https://www.postgresql.org/message-id/OSBPR01MB2552FEA48D265EA278AA9F7AF5E22%40OSBPR01MB2552.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

Thanks for giving comments! New patch was posted in [1].

> 0.1 General - Patch name
> 
> /SUBSCIRPTION/SUBSCRIPTION/

Fixed.

> ======
> 0.2 General - Apply
> 
> FYI, there are whitespace warnings:
> 
> git
> apply ../patches_misc/v8-0004-Add-force_alter-option-for-ALTER-SUBSCIRPTI
> ON-.-S.patch
> ../patches_misc/v8-0004-Add-force_alter-option-for-ALTER-SUBSCIRPTION-.-
> S.patch:191:
> trailing whitespace.
> # command will abort the prepared transaction and succeed.
> warning: 1 line adds whitespace errors.

I didn't recognize, fixed.

> ======
> 0.3 General - Regression test fails
> 
> The subscription regression tests are not working.
> 
> ok 158       + publication                              1187 ms
> not ok 159   + subscription                              123 ms
> 
> See review comments #4 and #5 below for the reason why.

Yeah, I missed to update the expected result. Fixed.

> ======
> src/sgml/ref/alter_subscription.sgml
> 
> 1.
>       <para>
>        The <literal>two_phase</literal> parameter can only be altered when
> the
> -      subscription is disabled. When altering the parameter from
> <literal>on</literal>
> -      to <literal>off</literal>, the backend process checks prepared
> -      transactions done by the logical replication worker and aborts them.
> +      subscription is disabled. Altering the parameter from
> <literal>on</literal>
> +      to <literal>off</literal> will be failed when there are prepared
> +      transactions done by the logical replication worker. If you want to alter
> +      the parameter forcibly in this case, <literal>force_alter</literal>
> +      option must be set to <literal>on</literal> at the same time. If
> +      specified, the backend process aborts prepared transactions.
>       </para>
> 1a.
> That "will be failed when..." seems strange. Maybe say "will give an
> error when..."
> 
> ~
> 1b.
> Because "force" is a verb, I think true/false is more natural than
> on/off for this new boolean option. e.g. it acts more like a "flag"
> than a "mode". See all the other boolean options in CREATE
> SUBSCRIPTION -- those are mostly all verbs too and are all true/false
> AFAIK.

Fixed, but note that the part was moved.

> 
> ======
> 
> 2. CREATE SUBSCRIPTION
> 
> For my previous review, two comments [v7-0004#2] and [v7-0004#3] were
> not addressed. Kuroda-san wrote:
> Hmm, but the parameter cannot be used for CREATE SUBSCRIPTION. Should
> we modify to accept and add the description in the doc?
> 
> ~
> 
> Yes, that is what I am suggesting. IMO it is odd for the user to be
> able to ALTER a parameter that cannot be included in the CREATE
> SUBSCRIPTION in the first place. AFAIK there are no other parameters
> that behave that way.

Hmm. I felt that this change required the new attribute in pg_subscription system
catalog. Previously I did not like because it contains huge change, but...I tried to do.
New attribute 'subforcealter', and some parts were updated accordingly.

> src/backend/commands/subscriptioncmds.c
> 
> 3. AlterSubscription
> 
> + if (!opts.force_alter)
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot alter %s when there are prepared transactions",
> + "two_phase = off"),
> + errhint("Resolve these transactions or set %s at the same time, and
> then try again.",
> + "force_alter = true")));
> 
> I think saying "at the same time" in the hint is unnecessary. Surely
> the user is allowed to set this parameter separately if they want to?
> 
> e.g.
> ALTER SUBSCRIPTION sub SET (force_alter=true);
> ALTER SUBSCRIPTION sub SET (two_phase=off);

Actually, it was correct. Since force_alter was not recorded in the system catalog, it must
be specified at the same time.
Now, we allow to be separated, so removed.

> ======
> src/test/regress/expected/subscription.out
> 
> 4.
> +-- fail - force_alter cannot be set alone
> +ALTER SUBSCRIPTION regress_testsub SET (force_alter = true);
> +ERROR:  force_alter must be specified with two_phase
> 
> This error cannot happen. You removed that error!

Fixed.

> ======
> src/test/subscription/t/099_twophase_added.pl
> 
> 6.
> +# Try altering the two_phase option to "off." The command will fail since there
> +# is a prepared transaction and the force option is not specified.
> +my $stdout;
> +my $stderr;
> +
> +($result, $stdout, $stderr) = $node_subscriber->psql(
> + 'postgres', "ALTER SUBSCRIPTION regress_sub SET (two_phase = off);");
> +ok($stderr =~ /cannot alter two_phase = off when there are prepared
> transactions/,
> + 'ALTER SUBSCRIPTION failed');
> 
> /force option is not specified./'force_alter' option is not specified as true./

Fixed.

> 
> 7.
> +# Verify the prepared transaction still exists
> +$result = $node_subscriber->safe_psql('postgres',
> +    "SELECT count(*) FROM pg_prepared_xacts;");
> +is($result, q(1), "prepared transaction still exits");
> +
> 
> TYPO: /exits/exists/

Fixed.

> 
> ~~~
> 
> 8.
> +# Alter the two_phase with the force_alter option. Apart from the above, the
> +# command will abort the prepared transaction and succeed.
> +$node_subscriber->safe_psql('postgres',
> +    "ALTER SUBSCRIPTION regress_sub SET (two_phase = off, force_alter
> = true);");
> +$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION
> regress_sub ENABLE;");
> +
> 
> What does "Apart from the above" mean? Be more explicit.

Clarified like "Apart from the last ALTER SUBSCRIPTION command...".

> 9.
> +# Verify the prepared transaction are aborted
>  $result = $node_subscriber->safe_psql('postgres',
>      "SELECT count(*) FROM pg_prepared_xacts;");
>  is($result, q(0), "prepared transaction done by worker is aborted");
> 
> /transaction are aborted/transaction was aborted/

Fixed.

[1]:
https://www.postgresql.org/message-id/OSBPR01MB2552FEA48D265EA278AA9F7AF5E22%40OSBPR01MB2552.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Hi Kuroda-san,

I'm having second thoughts about how these patches mention the option
values "on|off". These are used in the ALTER SUBSCRIPTION document
page for 'two_phase' and 'failover' parameters, and then those
"on|off" get propagated to the code comments, error messages, and
tests...

Now I see that on the CREATE SUBSCRIPTION page [1], every boolean
parameter (even including 'two_phase' and 'failover') is described in
terms of "true|false" (not "on|off").

In hindsight, it is probably better to refer only to true|false
everywhere for these boolean parameters, instead of sometimes using
different values like on|off.

What do you think?

======
[1] https://www.postgresql.org/docs/devel/sql-createsubscription.html

Kind Regards,
Peter Smith.
Fujitsu Australia



Hi Kuroda-san, Here are some review comments for all patches v9*

//////////
Patch v9-0001
//////////

There were no changes since v8-0001, so no comments.

//////////
Patch v9-0002
//////////

======
Commit Message

2.1.
Regarding the off->on case, the logical replication already has a
mechanism for it, so there is no need to do anything special for the
on->off case; however, we must connect to the publisher and expressly
change the parameter. The operation cannot be rolled back, and
altering the parameter from "on" to "off" within a transaction is
prohibited.

In the opposite case, there is no need to prevent this because the
logical replication worker already had the mechanism to alter the slot
option at a convenient time.

~

This explanation seems to be going around in circles, without giving
any new information:

AFAICT, "Regarding the off->on case, the logical replication already
has a mechanism for it, so there is no need to do anything special for
the on->off case;"

is saying pretty much the same as:

"In the opposite case, there is no need to prevent this because the
logical replication worker already had the mechanism to alter the slot
option at a convenient time."

But, what I hoped for in previous review comments was an explanation
somewhat less vague than "already has a mechanism" or "already had the
mechanism". Can't this have just 1 or 2 lines to say WHAT is that
existing mechanism for the "off" to "on" case, and WHY that means
there is nothing special to do in that scenario?

======
src/backend/commands/subscriptioncmds.c

2.2. AlterSubscription

  /*
- * The changed two_phase option of the slot can't be rolled
- * back.
+ * Since altering the two_phase option of subscriptions
+ * also leads to changing the slot option, this command
+ * cannot be rolled back. So prevent this if we are in a
+ * transaction block. In the opposite case, there is no
+ * need to prevent this because the logical replication
+ * worker already had the mechanism to alter the slot
+ * option at a convenient time.
  */

(Same previous review comments, and same as my review comment for the
commit message above).

I don't think "already had the mechanism" is enough explanation.

Also, the 2nd sentence doesn't make sense here because the comment
only said "altering the slot option" -- it didn't say it was altering
it to "on" or altering it to "off", so "the opposite case" has no
meaning.

~~~

2.3. AlterSubscription

  /*
- * Try to acquire the connection necessary for altering slot.
+ * Check the need to alter the replication slot. Failover and two_phase
+ * options are controlled by both the publisher (as a slot option) and the
+ * subscriber (as a subscription option). The slot option must be altered
+ * only when changing "on" to "off". Because in opposite case, the logical
+ * replication worker already has the mechanism to do so at a convenient
+ * time.
+ */
+ update_failover = replaces[Anum_pg_subscription_subfailover - 1];
+ update_two_phase = (replaces[Anum_pg_subscription_subtwophasestate - 1] &&
+ !opts.twophase);

This is again the same as other review comments above. Probably, when
some better explanation can be found for "already has the mechanism to
do so at a convenient time." then all of these places can be changed
using similar text.

//////////
Patch v9-0003
//////////

There are some imperfect code comments but AFAIK they are the same
ones from patch 0002. I think patch 0003 is just moving those comments
to different places, so probably they would already be addressed by
patch 0002.

//////////
Patch v9-0004
//////////

======
doc/src/sgml/catalogs.sgml

4.1.
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>subforcealter</structfield> <type>bool</type>
+      </para>
+      <para>
+       If true, the subscription can be altered <literal>two_phase</literal>
+       option, even if there are prepared transactions
+      </para></entry>
+     </row>
+

BEFORE
If true, the subscription can be altered <literal>two_phase</literal>
option, even if there are prepared transactions

SUGGESTION
If true, then the ALTER SUBSCRIPTION command can disable
<literal>two_phase</literal> option, even if there are uncommitted
prepared transactions from when <literal>two_phase</literal> was
enabled

======
doc/src/sgml/ref/alter_subscription.sgml

4.2.
-
-     <para>
-      The <literal>two_phase</literal> parameter can only be altered when the
-      subscription is disabled. When altering the parameter from
<literal>on</literal>
-      to <literal>off</literal>, the backend process checks for any incomplete
-      prepared transactions done by the logical replication worker (from when
-      <literal>two_phase</literal> parameter was still <literal>on</literal>)
-      and, if any are found, those are aborted.
-     </para>

Well, I still think there ought to be some mention of the relationship
between 'force_alter' and 'two_phase' given on this ALTER SUBSCRIPTION
page. Then the user can cross-reference to read what the 'force_alter'
actually does.

======
doc/src/sgml/ref/create_subscription.sgml

4.3.
+
+       <varlistentry id="sql-createsubscription-params-with-force-alter">
+        <term><literal>force_alter</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          Specifies whether the subscription can be altered
+          <literal>two_phase</literal> option, even if there are prepared
+          transactions. If specified, the backend process checks for any
+          incomplete prepared transactions done by the logical replication
+          worker (from when <literal>two_phase</literal> parameter was still
+          <literal>on</literal>), if any are found, those are aborted.
+          Otherwise, Altering the parameter from <literal>on</literal> to
+          <literal>off</literal> will give an error when there are prepared
+          transactions done by the logical replication worker.
+          The default is <literal>false</literal>.
+         </para>
+        </listitem>
+       </varlistentry>

This explanation seems a bit repetitive. I think it can be improved as follows:

SUGGESTION
Specifies if the ALTER SUBSCRIPTION can be forced to proceed instead
of giving an error.

There is currently only one scenario where this parameter has any
effect: When altering two_phase option from true to false it is
possible for there to be incomplete prepared transactions done by the
logical replication worker (from when two_phase parameter was still
true). If force_alter is false, then this will give an error; if
force_alter is true, then the incomplete prepared transactions are
aborted and the alter will proceed.

The default is false.

======
src/backend/commands/subscriptioncmds.c

4.4. CreateSubscription

  values[Anum_pg_subscription_subfailover - 1] = BoolGetDatum(opts.failover);
+ values[Anum_pg_subscription_subforcealter] = BoolGetDatum(opts.force_alter);
  values[Anum_pg_subscription_subconninfo - 1] =

Hmm, looks like a bug. Shouldn't that index say -1?

~~~
4.5. AlterSubscription

+ /*
+ * Abort prepared transactions only if
+ * 'force_alter' option is true. Otherwise raise
+ * an ERROR.
+ */
+ if (IsSet(opts.specified_opts, SUBOPT_FORCE_ALTER))
+ {
+ if (!opts.force_alter)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot alter %s when there are prepared transactions",
+ "two_phase = off"),
+ errhint("Resolve these transactions or set %s, and then try again.",
+ "force_alter = true")));
+ }
+ else
+ {
+ if (!sub->forcealter)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot alter %s when there are prepared transactions",
+ "two_phase = off"),
+ errhint("Resolve these transactions or set %s, and then try again.",
+ "force_alter = true")));
+ }
+

IIUC this code can be simplified to remove the error duplication.
Something like below:

SUGGESTION

bool raise_error = IsSet(opts.specified_opts, SUBOPT_FORCE_ALTER) ?
!opts.force_alter : !sub->forcealter;

if (raise_error)
  ereport(ERROR,
    (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
    errmsg("cannot alter %s when there are prepared transactions",
      "two_phase = off"),
    errhint("Resolve these transactions or set %s, and then try again.",
      "force_alter = true")));

======
src/bin/pg_dump/pg_dump.c

4.6. getSubscriptions

+ if (fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query,
+ " s.subforcealter\n");
+ else
+ appendPQExpBuffer(query,
+   " false AS subforcealter\n");
+
+

4.6a.
Should this just be combined with the existing "if
(fout->remoteVersion >= 170000)" for failover?

~

4.6b.
Double blank lines.

======
src/bin/psql/describe.c

4.7.
+ if (pset.sversion >= 170000)
+ appendPQExpBuffer(&buf,
+   ", subforcealter AS \"%s\"\n",
+   gettext_noop("Force_alter"));

IMO the column title should be "Force alter" (i.e. without the underscore)

======
src/include/catalog/pg_subscription.h

4.8. CATALOG

+ bool subforcealter; /* True if we allow to drop prepared transactions
+ * when altering two_phase "on"->"off". */

I think this is not actually the description of 'force_alter'. What
you wrote just happens to be the only case that this option currently
works for. Maybe a more correct description is something like below.

SUGGESTION
True allows the ALTER SUBSCRIPTION command to proceed under conditions
that would otherwise result in an error. Currently, 'force_alter' only
has an effect when altering the two_phase option from "true" to
"false".

~~~

4.9. struct Subscription

+ bool forcealter; /* True if we allow to drop prepared
+ * transactions when altering two_phase
+ * "on"->"off". */

Ditto the previous review comment.

======
src/test/regress/expected/subscription.out

4.10.
-
                                           List of subscriptions
-       Name       |           Owner           | Enabled | Publication
| Binary | Streaming | Two-phase commit | Disable on error | Origin |
Password required | Run as owner? | Failover | Synchronous commit |
      Conninfo           | Skip LSN

-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f       | {testpub}
| f      | off       | d                | f                | none   |
t                 | f             | f        | off                |
dbname=regress_doesnotexist | 0/0
+
                                                  List of
subscriptions
+       Name       |           Owner           | Enabled | Publication
| Binary | Streaming | Two-phase commit | Disable on error | Origin |
Password required | Run as owner? | Failover | Force_alter |
Synchronous commit |          Conninfo           | Skip LSN

+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+-------------+--------------------+-----------------------------+----------

The column heading should be "Force alter", as already mentioned by an
earlier review comment (#4.7)

======
src/test/subscription/t/099_twophase_added.pl

4.11.

+# Alter the two_phase with the force_alter option. Apart from the the last
+# ALTER SUBSCRIPTION command, the command will abort the prepared transaction
+# and succeed.

There is typo "the the" and the wording is a bit strange. Why not just say:

SUGGESTION
Alter the two_phase true to false with the force_alter option enabled.
This command will succeed after aborting the prepared transaction.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

Thanks for reviewing! Here is new version patch.

> //////////
> Patch v9-0002
> //////////
> 
> ======
> Commit Message
> 
> 2.1.
> Regarding the off->on case, the logical replication already has a
> mechanism for it, so there is no need to do anything special for the
> on->off case; however, we must connect to the publisher and expressly
> change the parameter. The operation cannot be rolled back, and
> altering the parameter from "on" to "off" within a transaction is
> prohibited.
> 
> In the opposite case, there is no need to prevent this because the
> logical replication worker already had the mechanism to alter the slot
> option at a convenient time.
> 
> ~
> 
> This explanation seems to be going around in circles, without giving
> any new information:
> 
> AFAICT, "Regarding the off->on case, the logical replication already
> has a mechanism for it, so there is no need to do anything special for
> the on->off case;"
> 
> is saying pretty much the same as:
> 
> "In the opposite case, there is no need to prevent this because the
> logical replication worker already had the mechanism to alter the slot
> option at a convenient time."
> 
> But, what I hoped for in previous review comments was an explanation
> somewhat less vague than "already has a mechanism" or "already had the
> mechanism". Can't this have just 1 or 2 lines to say WHAT is that
> existing mechanism for the "off" to "on" case, and WHY that means
> there is nothing special to do in that scenario?
>

Reworded. Thought?

> 2.2. AlterSubscription
> 
>   /*
> - * The changed two_phase option of the slot can't be rolled
> - * back.
> + * Since altering the two_phase option of subscriptions
> + * also leads to changing the slot option, this command
> + * cannot be rolled back. So prevent this if we are in a
> + * transaction block. In the opposite case, there is no
> + * need to prevent this because the logical replication
> + * worker already had the mechanism to alter the slot
> + * option at a convenient time.
>   */
> 
> (Same previous review comments, and same as my review comment for the
> commit message above).
> 
> I don't think "already had the mechanism" is enough explanation.
> 
> Also, the 2nd sentence doesn't make sense here because the comment
> only said "altering the slot option" -- it didn't say it was altering
> it to "on" or altering it to "off", so "the opposite case" has no
> meaning.

Fixed.

> 2.3. AlterSubscription
> 
>   /*
> - * Try to acquire the connection necessary for altering slot.
> + * Check the need to alter the replication slot. Failover and two_phase
> + * options are controlled by both the publisher (as a slot option) and the
> + * subscriber (as a subscription option). The slot option must be altered
> + * only when changing "on" to "off". Because in opposite case, the logical
> + * replication worker already has the mechanism to do so at a convenient
> + * time.
> + */
> + update_failover = replaces[Anum_pg_subscription_subfailover - 1];
> + update_two_phase = (replaces[Anum_pg_subscription_subtwophasestate - 1]
> &&
> + !opts.twophase);
> 
> This is again the same as other review comments above. Probably, when
> some better explanation can be found for "already has the mechanism to
> do so at a convenient time." then all of these places can be changed
> using similar text.

Added a reference.

> 
> //////////
> Patch v9-0003
> //////////
> 
> There are some imperfect code comments but AFAIK they are the same
> ones from patch 0002. I think patch 0003 is just moving those comments
> to different places, so probably they would already be addressed by
> patch 0002.
>

The comment was moved, so no need to modify here.

> ======
> doc/src/sgml/catalogs.sgml
> 
> 4.1.
> +     <row>
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +       <structfield>subforcealter</structfield> <type>bool</type>
> +      </para>
> +      <para>
> +       If true, the subscription can be altered <literal>two_phase</literal>
> +       option, even if there are prepared transactions
> +      </para></entry>
> +     </row>
> +
> 
> BEFORE
> If true, the subscription can be altered <literal>two_phase</literal>
> option, even if there are prepared transactions
> 
> SUGGESTION
> If true, then the ALTER SUBSCRIPTION command can disable
> <literal>two_phase</literal> option, even if there are uncommitted
> prepared transactions from when <literal>two_phase</literal> was
> enabled

Fixed, added a link for ALTER SUBSCRIPTION.

> 
> ======
> doc/src/sgml/ref/alter_subscription.sgml
> 
> 4.2.
> -
> -     <para>
> -      The <literal>two_phase</literal> parameter can only be altered when
> the
> -      subscription is disabled. When altering the parameter from
> <literal>on</literal>
> -      to <literal>off</literal>, the backend process checks for any incomplete
> -      prepared transactions done by the logical replication worker (from when
> -      <literal>two_phase</literal> parameter was still <literal>on</literal>)
> -      and, if any are found, those are aborted.
> -     </para>
> 
> Well, I still think there ought to be some mention of the relationship
> between 'force_alter' and 'two_phase' given on this ALTER SUBSCRIPTION
> page. Then the user can cross-reference to read what the 'force_alter'
> actually does.
>

Revived the content, and added an link. Thought?

> ======
> doc/src/sgml/ref/create_subscription.sgml
> 
> 4.3.
> +
> +       <varlistentry id="sql-createsubscription-params-with-force-alter">
> +        <term><literal>force_alter</literal>
> (<type>boolean</type>)</term>
> +        <listitem>
> +         <para>
> +          Specifies whether the subscription can be altered
> +          <literal>two_phase</literal> option, even if there are prepared
> +          transactions. If specified, the backend process checks for any
> +          incomplete prepared transactions done by the logical replication
> +          worker (from when <literal>two_phase</literal> parameter was
> still
> +          <literal>on</literal>), if any are found, those are aborted.
> +          Otherwise, Altering the parameter from <literal>on</literal> to
> +          <literal>off</literal> will give an error when there are prepared
> +          transactions done by the logical replication worker.
> +          The default is <literal>false</literal>.
> +         </para>
> +        </listitem>
> +       </varlistentry>
> 
> This explanation seems a bit repetitive. I think it can be improved as follows:
> 
> SUGGESTION
> Specifies if the ALTER SUBSCRIPTION can be forced to proceed instead
> of giving an error.
> 
> There is currently only one scenario where this parameter has any
> effect: When altering two_phase option from true to false it is
> possible for there to be incomplete prepared transactions done by the
> logical replication worker (from when two_phase parameter was still
> true). If force_alter is false, then this will give an error; if
> force_alter is true, then the incomplete prepared transactions are
> aborted and the alter will proceed.
> 
> The default is false.

Fixed, but added attributes.

> 
> ======
> src/backend/commands/subscriptioncmds.c
> 
> 4.4. CreateSubscription
> 
>   values[Anum_pg_subscription_subfailover - 1] = BoolGetDatum(opts.failover);
> + values[Anum_pg_subscription_subforcealter] =
> BoolGetDatum(opts.force_alter);
>   values[Anum_pg_subscription_subconninfo - 1] =
> 
> Hmm, looks like a bug. Shouldn't that index say -1?
>

Right, fixed.

> ~~~
> 4.5. AlterSubscription
> 
> + /*
> + * Abort prepared transactions only if
> + * 'force_alter' option is true. Otherwise raise
> + * an ERROR.
> + */
> + if (IsSet(opts.specified_opts, SUBOPT_FORCE_ALTER))
> + {
> + if (!opts.force_alter)
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot alter %s when there are prepared transactions",
> + "two_phase = off"),
> + errhint("Resolve these transactions or set %s, and then try again.",
> + "force_alter = true")));
> + }
> + else
> + {
> + if (!sub->forcealter)
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot alter %s when there are prepared transactions",
> + "two_phase = off"),
> + errhint("Resolve these transactions or set %s, and then try again.",
> + "force_alter = true")));
> + }
> +
> 
> IIUC this code can be simplified to remove the error duplication.
> Something like below:
> 
> SUGGESTION
> 
> bool raise_error = IsSet(opts.specified_opts, SUBOPT_FORCE_ALTER) ?
> !opts.force_alter : !sub->forcealter;
> 
> if (raise_error)
>   ereport(ERROR,
>     (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>     errmsg("cannot alter %s when there are prepared transactions",
>       "two_phase = off"),
>     errhint("Resolve these transactions or set %s, and then try again.",
>       "force_alter = true")));
>

Modified.

> ======
> src/bin/pg_dump/pg_dump.c
> 
> 4.6. getSubscriptions
> 
> + if (fout->remoteVersion >= 170000)
> + appendPQExpBufferStr(query,
> + " s.subforcealter\n");
> + else
> + appendPQExpBuffer(query,
> +   " false AS subforcealter\n");
> +
> +
> 
> 4.6a.
> Should this just be combined with the existing "if
> (fout->remoteVersion >= 170000)" for failover?

This was intentional. Features for PG17 have already been frozen, so
the patch will be pushed for PG18. After removeVersion is bumped, 
I want to replace to "(fout->remoteVersion >= 180000)"

> 
> ~
> 
> 4.6b.
> Double blank lines.

Fixed.

> src/bin/psql/describe.c
> 
> 4.7.
> + if (pset.sversion >= 170000)
> + appendPQExpBuffer(&buf,
> +   ", subforcealter AS \"%s\"\n",
> +   gettext_noop("Force_alter"));
> 
> IMO the column title should be "Force alter" (i.e. without the underscore)
>

Fixed.

> ======
> src/include/catalog/pg_subscription.h
> 
> 4.8. CATALOG
> 
> + bool subforcealter; /* True if we allow to drop prepared transactions
> + * when altering two_phase "on"->"off". */
> 
> I think this is not actually the description of 'force_alter'. What
> you wrote just happens to be the only case that this option currently
> works for. Maybe a more correct description is something like below.
> 
> SUGGESTION
> True allows the ALTER SUBSCRIPTION command to proceed under conditions
> that would otherwise result in an error. Currently, 'force_alter' only
> has an effect when altering the two_phase option from "true" to
> "false".
>

Hmm. Seems bit long, but used yours.

> ~~~
> 
> 4.9. struct Subscription
> 
> + bool forcealter; /* True if we allow to drop prepared
> + * transactions when altering two_phase
> + * "on"->"off". */
> 
> Ditto the previous review comment.
>

Ditto.

> ======
> src/test/regress/expected/subscription.out
> 
> 4.10.
> -
>                                            List of subscriptions
> -       Name       |           Owner           | Enabled | Publication
> | Binary | Streaming | Two-phase commit | Disable on error | Origin |
> Password required | Run as owner? | Failover | Synchronous commit |
>       Conninfo           | Skip LSN
> -------------------+---------------------------+---------+-------------+--------
> +-----------+------------------+------------------+--------+-------------------
> +---------------+----------+--------------------+-----------------------------+
> ----------
> - regress_testsub4 | regress_subscription_user | f       | {testpub}
> | f      | off       | d                | f                | none   |
> t                 | f             | f        | off                |
> dbname=regress_doesnotexist | 0/0
> +
>                                                   List of
> subscriptions
> +       Name       |           Owner           | Enabled | Publication
> | Binary | Streaming | Two-phase commit | Disable on error | Origin |
> Password required | Run as owner? | Failover | Force_alter |
> Synchronous commit |          Conninfo           | Skip LSN
> +------------------+---------------------------+---------+-------------+-------
> -+-----------+------------------+------------------+--------+------------------
> -+---------------+----------+-------------+--------------------+---------------
> --------------+----------
> 
> The column heading should be "Force alter", as already mentioned by an
> earlier review comment (#4.7)

Yeah, fixed.

> src/test/subscription/t/099_twophase_added.pl
> 
> 4.11.
> 
> +# Alter the two_phase with the force_alter option. Apart from the the last
> +# ALTER SUBSCRIPTION command, the command will abort the prepared
> transaction
> +# and succeed.
> 
> There is typo "the the" and the wording is a bit strange. Why not just say:
> 
> SUGGESTION
> Alter the two_phase true to false with the force_alter option enabled.
> This command will succeed after aborting the prepared transaction.

Fixed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Attachment

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

Thanks for reviewing! New patch is available in [1].

> I'm having second thoughts about how these patches mention the option
> values "on|off". These are used in the ALTER SUBSCRIPTION document
> page for 'two_phase' and 'failover' parameters, and then those
> "on|off" get propagated to the code comments, error messages, and
> tests...
> 
> Now I see that on the CREATE SUBSCRIPTION page [1], every boolean
> parameter (even including 'two_phase' and 'failover') is described in
> terms of "true|false" (not "on|off").

Hmm. But I could sentences like "The default value is off,...". Also, in alter_subscription.sgml,
"on|off" notation has already been used. Not sure, but I felt there are no rules around here.

> In hindsight, it is probably better to refer only to true|false
> everywhere for these boolean parameters, instead of sometimes using
> different values like on|off.
> 
> What do you think?

It's OK for me to make message/code comments consistent. Not sure the documentation,
but followed only my part.

[1]:
https://www.postgresql.org/message-id/OSBPR01MB2552F66463EFCFD654E87C09F5E32%40OSBPR01MB2552.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Hi Kuroda-san. Here are my review comments for latest v10* patches.

//////////
patch v10-0001
//////////

No changes. No comments.

//////////
patch v10-0002
//////////

======
Commit message

2.1.
Regarding the false->true case, the backend process alters the subtwophase to
LOGICALREP_TWOPHASE_STATE_PENDING once. After the subscription is enabled, a new
logical replication worker requests to change the two_phase option of its slot
from pending to true after the initial data synchronization is done. The code
path is the same as the case in which two_phase is initially set to true, so
there is no need to do something remarkable. However, for the true->false case,
the backend must connect to the publisher and expressly change the parameter
because the apply worker does not alter the option to false. The
operation cannot
be rolled back, and altering the parameter from "true" to "false" within a
transaction is prohibited.

~

BEFORE
The operation cannot be rolled back, and altering the parameter from
"true" to "false" within a transaction is prohibited.

SUGGESTION
Because this operation cannot be rolled back, altering the two_phase
parameter from "true" to "false" within a transaction is prohibited.

======
doc/src/sgml/ref/alter_subscription.sgml

2.2.
    <command>ALTER SUBSCRIPTION ... SET (failover = on|off)</command> and
-   <command>ALTER SUBSCRIPTION ... SET (two_phase = on|off)</command>
+   <command>ALTER SUBSCRIPTION ... SET (two_phase = off)</command>

I wasn't sure why you chose to keep on|off here instead of true|false,
since in subsequence patch 0003 you changed it true/false everywhere
as discussed in previous reviews.

OTOH if you only did this to be consistent with the "failover=on|off"
then that is OK; but in that case I might raise a separate hackers
thread to propose those should also be changed to true|false for
consistency with the parameer listed on the CREATE SUBSCRIPTION page.
What do you think?

======
src/backend/commands/subscriptioncmds.c

2.3.
  /*
- * The changed two_phase option of the slot can't be rolled
- * back.
+ * Altering the parameter from "true" to "false" within a
+ * transaction is prohibited. Since the apply worker does
+ * not alter the slot option to false, the backend must
+ * connect to the publisher and expressly change the
+ * parameter.
+ *
+ * There is no need to do something remarkable regarding
+ * the "false" to "true" case; the backend process alters
+ * subtwophase to LOGICALREP_TWOPHASE_STATE_PENDING once.
+ * After the subscription is enabled, a new logical
+ * replication worker requests to change the two_phase
+ * option of its slot when the initial data synchronization
+ * is done. The code path is the same as the case in which
+ * two_phase is initially set to true.
  */

BEFORE
...worker requests to change the two_phase option of its slot when...

SUGGESTION
...worker requests to change the two_phase option of its slot from
pending to true when...

======
src/test/subscription/t/099_twophase_added.pl

2.4.
+#####################
+# Check the case that prepared transactions exist on the publisher node.
+#
+# Since the two_phase is "off", then normally, this PREPARE will do nothing
+# until the COMMIT PREPARED, but in this test, we toggle the two_phase to
+# "true" again before the COMMIT PREPARED happens.

/Since the two_phase is "off"/Since the two_phase is "false"/

//////////
patch v10-0003
//////////

======
src/backend/commands/subscriptioncmds.c

3.1. AlterSubscription

+ * If two_phase was enabled, there is a possibility that
+ * transactions have already been PREPARE'd. They must be
+ * checked and rolled back.
  */
  if (!opts.twophase)

I think it will less ambiguous if you modify this to say "If two_phase
was previously enabled"

~~~

3.2.
if (!opts.twophase)
{
List *prepared_xacts;

/*
* Altering the parameter from "true" to "false" within
* a transaction is prohibited. Since the apply worker
* does not alter the slot option to false, the backend
* must connect to the publisher and expressly change
* the parameter.
*
* There is no need to do something remarkable
* regarding the "false" to "true" case; the backend
* process alters subtwophase to
* LOGICALREP_TWOPHASE_STATE_PENDING once. After the
* subscription is enabled, a new logical replication
* worker requests to change the two_phase option of
* its slot when the initial data synchronization is
* done. The code path is the same as the case in which
* two_phase is initially set to true.
*/
if (!opts.twophase)
PreventInTransactionBlock(isTopLevel,
"ALTER SUBSCRIPTION ... SET (two_phase = false)");

/*
* To prevent prepared transactions from being
* isolated, they must manually be aborted.
*/
if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED &&
(prepared_xacts = GetGidListBySubid(subid)) != NIL)
{
/* Abort all listed transactions */
foreach_ptr(char, gid, prepared_xacts)
FinishPreparedTransaction(gid, false);

list_free_deep(prepared_xacts);
}
}

/* Change system catalog acoordingly */
values[Anum_pg_subscription_subtwophasestate - 1] =
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
replaces[Anum_pg_subscription_subtwophasestate - 1] = true;
}

~

Why is "if (!opts.twophase)" being checked at the top and then
immediately being checed again here:
+ if (!opts.twophase)
+ PreventInTransactionBlock(isTopLevel,
+ "ALTER SUBSCRIPTION ... SET (two_phase = false)");

And then again here:
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);

There is no need to re-check a flag that was already checked, so
clearly some of this logic/code is either wrong or redundant.

======
src/test/subscription/t/099_twophase_added.pl

(Let's change these on|off to true|false to match what you did already
in patch 0002).

3.3.
+#####################
+# Check the case that prepared transactions exist on the subscriber node
+#
+# If the two_phase is altering from "on" to "off" and there are prepared
+# transactions on the subscriber, they must be aborted. This test checks it.


/off/false/

/on/true/

~~~

3.4.
+# Verify the prepared transaction has been replicated to the subscriber because
+# two_phase is set to "on".

/on/true/

~~~

3.5.
+# Toggle the two_phase to "off" before the COMMIT PREPARED
+$node_subscriber->safe_psql(
+    'postgres', "
+    ALTER SUBSCRIPTION regress_sub DISABLE;
+    ALTER SUBSCRIPTION regress_sub SET (two_phase = off);
+    ALTER SUBSCRIPTION regress_sub ENABLE;");

/off/false/

/two_phase = off/two_phase = false/

~~~

3.6.
+# Verify any prepared transactions are aborted because two_phase is changed to
+# "off".

/off/false/

//////////
patch v10-0004
//////////

======
4.g1. GENERAL - document rendering fails

FYI - The document failed to build after I apply patch 0003. Did you try it?

In my environment it reported some unbalanced tags:

ref/create_subscription.sgml:448: parser error : Opening and ending
tag mismatch: link line 436 and para
         </para>
                ^
ref/create_subscription.sgml:449: parser error : Opening and ending
tag mismatch: para line 435 and listitem
        </listitem>

etc.

======
doc/src/sgml/ref/alter_subscription.sgml

4.1.
      <para>
       The <literal>two_phase</literal> parameter can only be altered when the
-      subscription is disabled. When altering the parameter from
<literal>true</literal>
-      to <literal>false</literal>, the backend process checks for any
incomplete
-      prepared transactions done by the logical replication worker (from when
-      <literal>two_phase</literal> parameter was still <literal>true</literal>)
-      and, if any are found, those are aborted.
+      subscription is disabled. Altering the parameter from
<literal>true</literal>
+      to <literal>false</literal> will give an error when when there are
+      prepared transactions done by the logical replication worker. If you want
+      to alter the parameter forcibly in this case,
+      <link linkend="sql-createsubscription-params-with-force-alter"><literal>force_alter</literal></link>
+      option must be set to <literal>true</literal> at the same time.
      </para>

TYPO: "when when"

Why is necessary to say "at the same time"?

======
doc/src/sgml/ref/create_subscription.sgml

4.2.
+       <varlistentry id="sql-createsubscription-params-with-force-alter">
+        <term><literal>force_alter</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          Specifies if the <link
linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION</command>
+          can be forced to proceed instead of giving an error. There is
+          currently only one scenario where this parameter has any effect: When
+          altering <literal>two_phase</literal> option from
<literal>true</literal>
+          to <literal>false</literal> it is possible for there to be incomplete
+          prepared transactions done by the logical replication worker (from
+          when <literal>two_phase</literal> parameter was still
<literal>true</literal>).
+          If <literal>force_alter</literal> is <literal>false</literal>, then
+          this will give an error; if <literal>force_alter</literal> is
+          <literal>true</literal>, then the incomplete prepared transactions
+          are aborted and the alter will proceed.
+          The default is <literal>false</literal>.
+         </para>
+        </listitem>
+       </varlistentry>

IMO this will be better broken into multiple paragraphs.

1. Specifies...
2. There is...
3. The default is...

======
src/test/subscription/t/099_twophase_added.pl

(Let's change all the on|off to true|false like you already did in patch 0002.

4.3.
+# Try altering the two_phase option to "off." The command will fail since there
+# is a prepared transaction and the 'force_alter' option is not specified as
+# true.
+my $stdout;
+my $stderr;

/off./false/

======
Kind Regards,
Peter Smith.
Fujitsu Australia



On Tue, May 14, 2024 at 10:03 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Peter,
>
...
> > 4.11.
> >
> > +# Alter the two_phase with the force_alter option. Apart from the the last
> > +# ALTER SUBSCRIPTION command, the command will abort the prepared
> > transaction
> > +# and succeed.
> >
> > There is typo "the the" and the wording is a bit strange. Why not just say:
> >
> > SUGGESTION
> > Alter the two_phase true to false with the force_alter option enabled.
> > This command will succeed after aborting the prepared transaction.
>
> Fixed.
>

You wrote "Fixed" for that patch v9-0004 suggestion but I don't think
anything was changed at all. Accidentally missed?

======
Kind Regards,
Peter Smith.
Futjisu Australia



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

Thanks for reviewing! Here are new patches.

 > 
> //////////
> patch v10-0002
> //////////
> 
> ======
> Commit message
> 
> 2.1.
> Regarding the false->true case, the backend process alters the subtwophase to
> LOGICALREP_TWOPHASE_STATE_PENDING once. After the subscription is
> enabled, a new
> logical replication worker requests to change the two_phase option of its slot
> from pending to true after the initial data synchronization is done. The code
> path is the same as the case in which two_phase is initially set to true, so
> there is no need to do something remarkable. However, for the true->false case,
> the backend must connect to the publisher and expressly change the parameter
> because the apply worker does not alter the option to false. The
> operation cannot
> be rolled back, and altering the parameter from "true" to "false" within a
> transaction is prohibited.
> 
> ~
> 
> BEFORE
> The operation cannot be rolled back, and altering the parameter from
> "true" to "false" within a transaction is prohibited.
> 
> SUGGESTION
> Because this operation cannot be rolled back, altering the two_phase
> parameter from "true" to "false" within a transaction is prohibited.

Fixed.

> 
> ======
> doc/src/sgml/ref/alter_subscription.sgml
> 
> 2.2.
>     <command>ALTER SUBSCRIPTION ... SET (failover = on|off)</command>
> and
> -   <command>ALTER SUBSCRIPTION ... SET (two_phase =
> on|off)</command>
> +   <command>ALTER SUBSCRIPTION ... SET (two_phase = off)</command>
> 
> I wasn't sure why you chose to keep on|off here instead of true|false,
> since in subsequence patch 0003 you changed it true/false everywhere
> as discussed in previous reviews.
> 
> OTOH if you only did this to be consistent with the "failover=on|off"
> then that is OK; but in that case I might raise a separate hackers
> thread to propose those should also be changed to true|false for
> consistency with the parameer listed on the CREATE SUBSCRIPTION page.
> What do you think?

Yeah, I did not change here, because other parameters were notated as
on/off. I found you started the forked thread [1] so I will revise the patch
after it was accepted.

> 
> ======
> src/backend/commands/subscriptioncmds.c
> 
> 2.3.
>   /*
> - * The changed two_phase option of the slot can't be rolled
> - * back.
> + * Altering the parameter from "true" to "false" within a
> + * transaction is prohibited. Since the apply worker does
> + * not alter the slot option to false, the backend must
> + * connect to the publisher and expressly change the
> + * parameter.
> + *
> + * There is no need to do something remarkable regarding
> + * the "false" to "true" case; the backend process alters
> + * subtwophase to LOGICALREP_TWOPHASE_STATE_PENDING once.
> + * After the subscription is enabled, a new logical
> + * replication worker requests to change the two_phase
> + * option of its slot when the initial data synchronization
> + * is done. The code path is the same as the case in which
> + * two_phase is initially set to true.
>   */
> 
> BEFORE
> ...worker requests to change the two_phase option of its slot when...
> 
> SUGGESTION
> ...worker requests to change the two_phase option of its slot from
> pending to true when...

Fixed.

> 
> ======
> src/test/subscription/t/099_twophase_added.pl
> 
> 2.4.
> +#####################
> +# Check the case that prepared transactions exist on the publisher node.
> +#
> +# Since the two_phase is "off", then normally, this PREPARE will do nothing
> +# until the COMMIT PREPARED, but in this test, we toggle the two_phase to
> +# "true" again before the COMMIT PREPARED happens.
> 
> /Since the two_phase is "off"/Since the two_phase is "false"/

Fixed.

> 
> //////////
> patch v10-0003
> //////////
> 
> ======
> src/backend/commands/subscriptioncmds.c
> 
> 3.1. AlterSubscription
> 
> + * If two_phase was enabled, there is a possibility that
> + * transactions have already been PREPARE'd. They must be
> + * checked and rolled back.
>   */
>   if (!opts.twophase)
> 
> I think it will less ambiguous if you modify this to say "If two_phase
> was previously enabled"

Fixed.

> 
> ~~~
> 
> 3.2.
> if (!opts.twophase)
> {
> List *prepared_xacts;
> 
> /*
> * Altering the parameter from "true" to "false" within
> * a transaction is prohibited. Since the apply worker
> * does not alter the slot option to false, the backend
> * must connect to the publisher and expressly change
> * the parameter.
> *
> * There is no need to do something remarkable
> * regarding the "false" to "true" case; the backend
> * process alters subtwophase to
> * LOGICALREP_TWOPHASE_STATE_PENDING once. After the
> * subscription is enabled, a new logical replication
> * worker requests to change the two_phase option of
> * its slot when the initial data synchronization is
> * done. The code path is the same as the case in which
> * two_phase is initially set to true.
> */
> if (!opts.twophase)
> PreventInTransactionBlock(isTopLevel,
> "ALTER SUBSCRIPTION ... SET (two_phase = false)");
> 
> /*
> * To prevent prepared transactions from being
> * isolated, they must manually be aborted.
> */
> if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED &&
> (prepared_xacts = GetGidListBySubid(subid)) != NIL)
> {
> /* Abort all listed transactions */
> foreach_ptr(char, gid, prepared_xacts)
> FinishPreparedTransaction(gid, false);
> 
> list_free_deep(prepared_xacts);
> }
> }
> 
> /* Change system catalog acoordingly */
> values[Anum_pg_subscription_subtwophasestate - 1] =
> CharGetDatum(opts.twophase ?
> LOGICALREP_TWOPHASE_STATE_PENDING :
> LOGICALREP_TWOPHASE_STATE_DISABLED);
> replaces[Anum_pg_subscription_subtwophasestate - 1] = true;
> }
> 
> ~
> 
> Why is "if (!opts.twophase)" being checked at the top and then
> immediately being checed again here:
> + if (!opts.twophase)
> + PreventInTransactionBlock(isTopLevel,
> + "ALTER SUBSCRIPTION ... SET (two_phase = false)");

Oh, this was caused by wrong git operations.

> And then again here:
> CharGetDatum(opts.twophase ?
> LOGICALREP_TWOPHASE_STATE_PENDING :
> LOGICALREP_TWOPHASE_STATE_DISABLED);
> 
> There is no need to re-check a flag that was already checked, so
> clearly some of this logic/code is either wrong or redundant.

Right. I added a new variable to store the value to be changed. Thouth?

 > 
> ======
> src/test/subscription/t/099_twophase_added.pl
> 
> (Let's change these on|off to true|false to match what you did already
> in patch 0002).
> 
> 3.3.
> +#####################
> +# Check the case that prepared transactions exist on the subscriber node
> +#
> +# If the two_phase is altering from "on" to "off" and there are prepared
> +# transactions on the subscriber, they must be aborted. This test checks it.
> 
> 
> /off/false/
> 
> /on/true/

Fixed.

> 
> ~~~
> 
> 3.4.
> +# Verify the prepared transaction has been replicated to the subscriber because
> +# two_phase is set to "on".
> 
> /on/true/

Fixed.

> 
> ~~~
> 
> 3.5.
> +# Toggle the two_phase to "off" before the COMMIT PREPARED
> +$node_subscriber->safe_psql(
> +    'postgres', "
> +    ALTER SUBSCRIPTION regress_sub DISABLE;
> +    ALTER SUBSCRIPTION regress_sub SET (two_phase = off);
> +    ALTER SUBSCRIPTION regress_sub ENABLE;");
> 
> /off/false/
> 
> /two_phase = off/two_phase = false/

Fixed.

> 
> ~~~
> 
> 3.6.
> +# Verify any prepared transactions are aborted because two_phase is changed
> to
> +# "off".
> 
> /off/false/

Fixed.

> 
> //////////
> patch v10-0004
> //////////
> 
> ======
> 4.g1. GENERAL - document rendering fails
> 
> FYI - The document failed to build after I apply patch 0003. Did you try it?
> 
> In my environment it reported some unbalanced tags:
> 
> ref/create_subscription.sgml:448: parser error : Opening and ending
> tag mismatch: link line 436 and para
>          </para>
>                 ^
> ref/create_subscription.sgml:449: parser error : Opening and ending
> tag mismatch: para line 435 and listitem
>         </listitem>
> 
> etc.

Oh, I forgot to run `make check`. Sorry. It seemed that I missed to close <link> tag.

> 
> ======
> doc/src/sgml/ref/alter_subscription.sgml
> 
> 4.1.
>       <para>
>        The <literal>two_phase</literal> parameter can only be altered when
> the
> -      subscription is disabled. When altering the parameter from
> <literal>true</literal>
> -      to <literal>false</literal>, the backend process checks for any
> incomplete
> -      prepared transactions done by the logical replication worker (from when
> -      <literal>two_phase</literal> parameter was still
> <literal>true</literal>)
> -      and, if any are found, those are aborted.
> +      subscription is disabled. Altering the parameter from
> <literal>true</literal>
> +      to <literal>false</literal> will give an error when when there are
> +      prepared transactions done by the logical replication worker. If you want
> +      to alter the parameter forcibly in this case,
> +      <link
> linkend="sql-createsubscription-params-with-force-alter"><literal>force_alter
> </literal></link>
> +      option must be set to <literal>true</literal> at the same time.
>       </para>
> 
> TYPO: "when when"

Removed.

> Why is necessary to say "at the same time"?

Not needed. Fixed.

> 
> ======
> doc/src/sgml/ref/create_subscription.sgml
> 
> 4.2.
> +       <varlistentry id="sql-createsubscription-params-with-force-alter">
> +        <term><literal>force_alter</literal>
> (<type>boolean</type>)</term>
> +        <listitem>
> +         <para>
> +          Specifies if the <link
> linkend="sql-altersubscription"><command>ALTER
> SUBSCRIPTION</command>
> +          can be forced to proceed instead of giving an error. There is
> +          currently only one scenario where this parameter has any effect:
> When
> +          altering <literal>two_phase</literal> option from
> <literal>true</literal>
> +          to <literal>false</literal> it is possible for there to be incomplete
> +          prepared transactions done by the logical replication worker (from
> +          when <literal>two_phase</literal> parameter was still
> <literal>true</literal>).
> +          If <literal>force_alter</literal> is <literal>false</literal>, then
> +          this will give an error; if <literal>force_alter</literal> is
> +          <literal>true</literal>, then the incomplete prepared transactions
> +          are aborted and the alter will proceed.
> +          The default is <literal>false</literal>.
> +         </para>
> +        </listitem>
> +       </varlistentry>
> 
> IMO this will be better broken into multiple paragraphs.
> 
> 1. Specifies...
> 2. There is...
> 3. The default is...

Separated.

> 
> ======
> src/test/subscription/t/099_twophase_added.pl
> 
> (Let's change all the on|off to true|false like you already did in patch 0002.
> 
> 4.3.
> +# Try altering the two_phase option to "off." The command will fail since there
> +# is a prepared transaction and the 'force_alter' option is not specified as
> +# true.
> +my $stdout;
> +my $stderr;
> 
> /off./false/

Fixed.

[1]: https://www.postgresql.org/message-id/CAHut%2BPs-RqrggaJU5w85BbeQzw9CLmmLgADVJoJ%3Dxx_4D5CWvw%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 

Attachment

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

> You wrote "Fixed" for that patch v9-0004 suggestion but I don't think
> anything was changed at all. Accidentally missed?

Sorry, I missed to do `git add` after the revision.
The change was also included in new patch [1].

[1]:
https://www.postgresql.org/message-id/OSBPR01MB25522052F9F3E3AAD3BA2A8CF5ED2%40OSBPR01MB2552.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Hi, Here are my remaining review comments for the latest v11* patches.

//////////
v11-0001
//////////

No changes. No comments.

//////////
v11-0002
//////////

======
doc/src/sgml/ref/alter_subscription.sgml

2.1.
    <command>ALTER SUBSCRIPTION ... SET (failover = on|off)</command> and
-   <command>ALTER SUBSCRIPTION ... SET (two_phase = on|off)</command>
+   <command>ALTER SUBSCRIPTION ... SET (two_phase = off)</command>

My other thread patch has already been pushed [1], so now you can
modify this to say "true|false" as previously suggested.


//////////
v11-0003
//////////

======
src/backend/commands/subscriptioncmds.c

3.1. AlterSubscription

+ subtwophase = LOGICALREP_TWOPHASE_STATE_DISABLED;
+ }
+ else
+ subtwophase = LOGICALREP_TWOPHASE_STATE_PENDING;
+
+
  /* Change system catalog acoordingly */
  values[Anum_pg_subscription_subtwophasestate - 1] =
- CharGetDatum(opts.twophase ?
- LOGICALREP_TWOPHASE_STATE_PENDING :
- LOGICALREP_TWOPHASE_STATE_DISABLED);
+ CharGetDatum(subtwophase);
  replaces[Anum_pg_subscription_subtwophasestate - 1] = true;

Sorry, I don't think that 'subtwophase' change is an improvement --
IMO the existing ternary code was fine as-is.

I only reported the excessive flag checking in the previous v10-0003
review because of some extra "if (!opts.twophase)" code but that was
caused by what you called "wrong git operations." and is already fixed
now.

//////////
v11-0004
//////////

======
src/sgml/catalogs.sgml

4.1.
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>subforcealter</structfield> <type>bool</type>
+      </para>
+      <para>
+       If true, then the <link
linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION</command></link>
+       can disable <literal>two_phase</literal> option, even if there are
+       uncommitted prepared transactions from when <literal>two_phase</literal>
+       was enabled
+      </para></entry>
+     </row>
+

I think this description should be changed to say what it *really*
does. IMO, the stuff about 'two_phase' is more like a side-effect.

SUGGESTION (this is just pseudo-code. You can add the CREATE
SUBSCRIPTION 'force_alter' link appropriately)

If true, then the <command>ALTER SUBSCRIPTION</command> command can
sometimes be forced to proceed instead of giving an error. See
<link>force_alter</link> parameter for details about when this might
be useful.

======
[1] https://github.com/postgres/postgres/commit/fa65a022db26bc446fb67ce1d7ac543fa4bb72e4

Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

Thanks for reviewing! Here are new patches.

> ======
> doc/src/sgml/ref/alter_subscription.sgml
> 
> 2.1.
>     <command>ALTER SUBSCRIPTION ... SET (failover = on|off)</command>
> and
> -   <command>ALTER SUBSCRIPTION ... SET (two_phase =
> on|off)</command>
> +   <command>ALTER SUBSCRIPTION ... SET (two_phase = off)</command>
> 
> My other thread patch has already been pushed [1], so now you can
> modify this to say "true|false" as previously suggested.

Fixed accordingly.

> //////////
> v11-0003
> //////////
> 
> ======
> src/backend/commands/subscriptioncmds.c
> 
> 3.1. AlterSubscription
> 
> + subtwophase = LOGICALREP_TWOPHASE_STATE_DISABLED;
> + }
> + else
> + subtwophase = LOGICALREP_TWOPHASE_STATE_PENDING;
> +
> +
>   /* Change system catalog acoordingly */
>   values[Anum_pg_subscription_subtwophasestate - 1] =
> - CharGetDatum(opts.twophase ?
> - LOGICALREP_TWOPHASE_STATE_PENDING :
> - LOGICALREP_TWOPHASE_STATE_DISABLED);
> + CharGetDatum(subtwophase);
>   replaces[Anum_pg_subscription_subtwophasestate - 1] = true;
> 
> Sorry, I don't think that 'subtwophase' change is an improvement --
> IMO the existing ternary code was fine as-is.
> 
> I only reported the excessive flag checking in the previous v10-0003
> review because of some extra "if (!opts.twophase)" code but that was
> caused by what you called "wrong git operations." and is already fixed
> now.

Ok, the part was reverted.

> //////////
> v11-0004
> //////////
> 
> ======
> src/sgml/catalogs.sgml
> 
> 4.1.
> +     <row>
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +       <structfield>subforcealter</structfield> <type>bool</type>
> +      </para>
> +      <para>
> +       If true, then the <link
> linkend="sql-altersubscription"><command>ALTER
> SUBSCRIPTION</command></link>
> +       can disable <literal>two_phase</literal> option, even if there are
> +       uncommitted prepared transactions from when
> <literal>two_phase</literal>
> +       was enabled
> +      </para></entry>
> +     </row>
> +
> 
> I think this description should be changed to say what it *really*
> does. IMO, the stuff about 'two_phase' is more like a side-effect.
> 
> SUGGESTION (this is just pseudo-code. You can add the CREATE
> SUBSCRIPTION 'force_alter' link appropriately)
> 
> If true, then the <command>ALTER SUBSCRIPTION</command> command can
> sometimes be forced to proceed instead of giving an error. See
> <link>force_alter</link> parameter for details about when this might
> be useful.
>

Fixed. But One point, the word "command" was removed. I checked other parts and
it seemed not to be needed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Attachment
Hi Kuroda-san,

I did not apply these v12* patches, but I have diff'ed all of them
with the previous v11* patches and confirmed that all of my previous
review comments now seem to be addressed.

I don't have any more comments to make at this time. Thanks!

======
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
> Dear hackers,
> 
> I found that v12 patch set could not be accepted by the cfbot. PSA new version.

To make others more trackable, I shared changes just in case. All failures were occurred
on the pg_dump code. I added an attribute in pg_subscription and modified pg_dump code,
but it was wrong. A constructed SQL became incomplete. I.e., in [1]: 

```
pg_dump: error: query failed: ERROR:  syntax error at or near "."
LINE 15:  s.subforcealter
           ^
pg_dump: detail: Query was: SELECT s.tableoid, s.oid, s.subname,
 s.subowner,
 s.subconninfo, s.subslotname, s.subsynccommit,
 s.subpublications,
 s.subbinary,
 s.substream,
 s.subtwophasestate,
 s.subdisableonerr,
 s.subpasswordrequired,
 s.subrunasowner,
 s.suborigin,
 NULL AS suboriginremotelsn,
 false AS subenabled,
 s.subfailover
 s.subforcealter
FROM pg_subscription s
WHERE s.subdbid = (SELECT oid FROM pg_database
                   WHERE datname = current_database())
```

Based on that I just added a comma in 0004 patch.

[1]: https://cirrus-ci.com/task/6710166165389312

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Mon, Apr 22, 2024 at 2:26 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
 ```
>
> It succeeds if force_alter is also expressly set. Prepared transactions will be
> aborted at that time.
>
> ```
> subscriber=# ALTER SUBSCRIPTION sub SET (two_phase = off, force_alter = on);
> ALTER SUBSCRIPTION
>

Isn't it better to give a Notice when force_alter option leads to the
rollback of already prepared transactions?

I have another question on the latest 0001 patch:
+ /*
+ * Stop all the subscription workers, just in case.
+ * Workers may still survive even if the subscription is
+ * disabled.
+ */
+ logicalrep_workers_stop(subid);

In which case the workers will survive when the subscription is disabled?

--
With Regards,
Amit Kapila.



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit,

> >
> > It succeeds if force_alter is also expressly set. Prepared transactions will be
> > aborted at that time.
> >
> > ```
> > subscriber=# ALTER SUBSCRIPTION sub SET (two_phase = off, force_alter =
> on);
> > ALTER SUBSCRIPTION
> >
> 
> Isn't it better to give a Notice when force_alter option leads to the
> rollback of already prepared transactions?

Indeed. I think this can be added for 0003. For now, it says like:

```
postgres=# ALTER SUBSCRIPTION sub SET (TWO_PHASE = off, FORCE_ALTER = on);
WARNING:  requested altering to two_phase = false but there are prepared transactions done by the subscription
DETAIL:  Such transactions are being rollbacked.
ALTER SUBSCRIPTION
```

> I have another question on the latest 0001 patch:
> + /*
> + * Stop all the subscription workers, just in case.
> + * Workers may still survive even if the subscription is
> + * disabled.
> + */
> + logicalrep_workers_stop(subid);
> 
> In which case the workers will survive when the subscription is disabled?

I think both normal and tablesync worker can survive, because ALTER SUBSCRIPTION
DISABLE command does not send signal to workers. It just change the system catalog.
logicalrep_workers_stop() is added to ensure all workers are stopped.

Actually, earlier version (-v3) did not have a mechanism but they sometimes got
assertion failures in maybe_reread_subscription(). This was because the survived
workers read pg_subscription catalog and failed below assertion:

```
    /* two-phase cannot be altered while the worker exists */
    Assert(newsub->twophasestate == MySubscription->twophasestate);
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Thu, Jul 4, 2024 at 1:34 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> > >
> > > It succeeds if force_alter is also expressly set. Prepared transactions will be
> > > aborted at that time.
> > >
> > > ```
> > > subscriber=# ALTER SUBSCRIPTION sub SET (two_phase = off, force_alter =
> > on);
> > > ALTER SUBSCRIPTION
> > >
> >
> > Isn't it better to give a Notice when force_alter option leads to the
> > rollback of already prepared transactions?
>
> Indeed. I think this can be added for 0003. For now, it says like:
>
> ```
> postgres=# ALTER SUBSCRIPTION sub SET (TWO_PHASE = off, FORCE_ALTER = on);
> WARNING:  requested altering to two_phase = false but there are prepared transactions done by the subscription
> DETAIL:  Such transactions are being rollbacked.
> ALTER SUBSCRIPTION
>

Is it possible to get a NOTICE instead of a WARNING?

>
> > I have another question on the latest 0001 patch:
> > + /*
> > + * Stop all the subscription workers, just in case.
> > + * Workers may still survive even if the subscription is
> > + * disabled.
> > + */
> > + logicalrep_workers_stop(subid);
> >
> > In which case the workers will survive when the subscription is disabled?
>
> I think both normal and tablesync worker can survive, because ALTER SUBSCRIPTION
> DISABLE command does not send signal to workers. It just change the system catalog.
> logicalrep_workers_stop() is added to ensure all workers are stopped.
>
> Actually, earlier version (-v3) did not have a mechanism but they sometimes got
> assertion failures in maybe_reread_subscription(). This was because the survived
> workers read pg_subscription catalog and failed below assertion:
>
> ```
>         /* two-phase cannot be altered while the worker exists */
>         Assert(newsub->twophasestate == MySubscription->twophasestate);
> ```
>

But that is not a good reason for this operation to stop workers
first. Instead, we should prohibit this operation if any worker is
present. The reason is that there is always a chance that if any
worker is alive, it can prepare a new transaction after we have
checked for the presence of any prepared transactions.

Comments:
=========
1.
There is no need to do something remarkable regarding
+ * the "false" to "true" case; the backend process alters
+ * subtwophase <funny_char> to LOGICALREP_TWOPHASE_STATE_PENDING once.
+ * After the subscription is enabled, a new logical
+ * replication worker requests to change the two_phase
+ * option of its slot from pending to true when the
+ * initial data synchronization is done. The code path is
+ * the same as the case in which two_phase <funny_char> is initially
+ * set <funny_char> to true.

The patch has some funny characters in the above comment at the places
highlighted by me. It seems you have copied from some editor that has
inserted such characters.

2.
/*
* Do not allow toggling of two_phase option. Doing so could cause
* missing of transactions and lead to an inconsistent replica.
* See comments atop worker.c
*
* Note: Unsupported twophase indicates that this call originated
* from AlterSubscription.
*/
if (!IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("unrecognized subscription parameter: \"%s\"", defel->defname)));

This part of the code must either be removed or converted to an assert.

3. The tests added in 099_twophase_added.pl should be part of 021_twophase.pl

--
With Regards,
Amit Kapila.



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit,

Thanks for giving comments. I hope all comments have been addressed.
PSA new version.

> > Actually, earlier version (-v3) did not have a mechanism but they sometimes got
> > assertion failures in maybe_reread_subscription(). This was because the
> survived
> > workers read pg_subscription catalog and failed below assertion:
> >
> > ```
> >         /* two-phase cannot be altered while the worker exists */
> >         Assert(newsub->twophasestate ==
> MySubscription->twophasestate);
> > ```
> >
> 
> But that is not a good reason for this operation to stop workers
> first. Instead, we should prohibit this operation if any worker is
> present. The reason is that there is always a chance that if any
> worker is alive, it can prepare a new transaction after we have
> checked for the presence of any prepared transactions.

I used the function because it internally waits until all workers are exited.
But OK, I modified like you suggested (logicalrep_workers_find() is used).

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 

Attachment

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit,

Sorry, I forgot to say one content.

> > But that is not a good reason for this operation to stop workers
> > first. Instead, we should prohibit this operation if any worker is
> > present. The reason is that there is always a chance that if any
> > worker is alive, it can prepare a new transaction after we have
> > checked for the presence of any prepared transactions.
> 
> I used the function because it internally waits until all workers are exited.
> But OK, I modified like you suggested (logicalrep_workers_find() is used).

Based on the reason, after the above modification, test codes prior to v14
sometimes failed because backend could execute ALTER SUBSCRIPTION ... SET (two_phase).
So I added lines in test codes to poll until workers are exited, e.g.,

```
+# Alter subscription two_phase to false
+$node_subscriber->safe_psql('postgres',
+    "ALTER SUBSCRIPTION tap_sub_copy DISABLE;");
+$node_subscriber->poll_query_until('postgres',
+    "SELECT count(*) = 0 FROM pg_stat_activity WHERE backend_type = 'logical replication worker'"
+);
+$node_subscriber->safe_psql(
+       'postgres', "
+    ALTER SUBSCRIPTION tap_sub_copy SET (two_phase = false);
+    ALTER SUBSCRIPTION tap_sub_copy ENABLE;");
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Vitaly Davydov"
Date:
Hi Kuroda-san,

Thank you very much for the patch. In general, it seem to work well for me, but there seems to be a memory access problem in libpqrcv_alter_slot -> quote_identifier in case of NULL slot_name. It happens, if the two_phase option is altered on a subscription without slot. I think, a simple check for NULL may fix the problem. I guess, the same problem may be for failover option.

Another possible problem is related to my use case. I haven't reproduced this case, just some thoughts. I guess, when two_phase is ON, the PREPARE statement may be truncated from the WAL at checkpoint, but COMMIT PREPARED is still kept in the WAL. On catchup, I would ask the master to send transactions from some restart LSN. I would like to get all such transactions competely, with theirs bodies, not only COMMIT PREPARED messages. One of the solutions is to have an option for the slot to keep the WAL like with two_phase = OFF independently on its two_phase option. It is just an idea.

With best regards,
Vitaly

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Vitaly,

Thanks for giving comments! PSA new version patch.

> Thank you very much for the patch. In general, it seem to work well for me, but
> there seems to be a memory access problem in libpqrcv_alter_slot ->
> quote_identifier in case of NULL slot_name. It happens, if the two_phase option
> is altered on a subscription without slot. I think, a simple check for NULL may
> fix the problem. I guess, the same problem may be for failover option.

You are right. Regarding the failover option, it requires that slot_name is valid.
In case of two_phase, we must connect to the publisher only when altering "true"
to "false", slot_name must be there only at that time. Updated.

> Another possible problem is related to my use case. I haven't reproduced this
> case, just some thoughts. I guess, when two_phase is ON, the PREPARE statement
> may be truncated from the WAL at checkpoint, but COMMIT PREPARED is still kept
> in the WAL. On catchup, I would ask the master to send transactions from some
> restart LSN. I would like to get all such transactions competely, with theirs
> bodies, not only COMMIT PREPARED messages.

I don't think it is a real issue. WALs for prepared transactions will retain
until they are committed/aborted.
When the two_phase is on and transactions are PREPAREd, they will not be
cleaned up from the memory (See ReorderBufferProcessTXN()).  Then, RUNNING_XACT
record leads to update the restart_lsn of the slot but it cannot be move forward
because ReorderBufferGetOldestTXN() returns the prepared transaction (See
SnapBuildProcessRunningXacts()). restart_decoding_lsn of each transaction, which
is a candidate of restart_lsn of the slot. is always behind the startpoint of
its txn.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/ 


Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Mon, Jul 8, 2024 at 12:34 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> > Another possible problem is related to my use case. I haven't reproduced this
> > case, just some thoughts. I guess, when two_phase is ON, the PREPARE statement
> > may be truncated from the WAL at checkpoint, but COMMIT PREPARED is still kept
> > in the WAL. On catchup, I would ask the master to send transactions from some
> > restart LSN. I would like to get all such transactions competely, with theirs
> > bodies, not only COMMIT PREPARED messages.
>
> I don't think it is a real issue. WALs for prepared transactions will retain
> until they are committed/aborted.
> When the two_phase is on and transactions are PREPAREd, they will not be
> cleaned up from the memory (See ReorderBufferProcessTXN()).  Then, RUNNING_XACT
> record leads to update the restart_lsn of the slot but it cannot be move forward
> because ReorderBufferGetOldestTXN() returns the prepared transaction (See
> SnapBuildProcessRunningXacts()). restart_decoding_lsn of each transaction, which
> is a candidate of restart_lsn of the slot. is always behind the startpoint of
> its txn.
>

I see that in 0003/0004, the patch first aborts pending prepared
transactions, update's catalog, and then change slot's property via
walrcv_alter_slot. What if there is any ERROR (say the remote node is
not reachable or there is an error while updating the catalog) after
we abort the pending prepared transaction? Won't we end up with lost
prepared transactions in such a case?

Few other comments:
=================
The code to handle SUBOPT_TWOPHASE_COMMIT should be after failover
option handling for the sake of code symmetry. Also, the checks should
be in same order like first for slot_name, then enabled, then for
PreventInTransactionBlock(), after those, we can have other checks for
two_phase. If possible, we can move common checks in both failover and
two_phase options into a common function.

What should be the behavior if one tries to set slot_name to NONE and
also tries to toggle two_pahse option? I feel both options together
don't makes sense because there is no use in changing two_phase for
some slot which we are disassociating the subscription from. The same
could be said for the failover option as well, so if we agree with
some different behavior here, we can follow the same for failover
option as well.

--
With Regards,
Amit Kapila.



Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Mon, Jul 8, 2024 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>
> I see that in 0003/0004, the patch first aborts pending prepared
> transactions, update's catalog, and then change slot's property via
> walrcv_alter_slot. What if there is any ERROR (say the remote node is
> not reachable or there is an error while updating the catalog) after
> we abort the pending prepared transaction? Won't we end up with lost
> prepared transactions in such a case?
>

Considering the above is a problem the other possibility I thought of
is to change the order like abort prepared xacts after slot update.
That is also dangerous because any failure while aborting could make a
slot change permanent whereas the subscription option will still be
old value. Now, because the slot's two_phase property is off, at
commit, it can resend the entire transaction which can create a
problem because the corresponding prepared transaction will already be
present.

One more thing to think about in this regard is what if we fail after
aborting a few prepared transactions and not all?

At this stage, I am not able to think of a good solution for these
problems. So, if we don't get a solution for these, we can document
that users can first manually abort prepared transactions and then
switch off the two_phase option using Alter Subscription command.

--
With Regards,
Amit Kapila.



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit,

Thanks for giving comments! Here I wanted to reply one of comments.

> What should be the behavior if one tries to set slot_name to NONE and
> also tries to toggle two_pahse option?

You mentioned like below case, right?

```
ALTER SUBSCRIPTION sub SET (two_phase = false, slot_name = NONE);
```

For now, we accept such a command. The replication slot which previously specified
is altered. As you know, this behavior is same as failover's one.

> I feel both options together
> don't makes sense because there is no use in changing two_phase for
> some slot which we are disassociating the subscription from. The same
> could be said for the failover option as well, so if we agree with
> some different behavior here, we can follow the same for failover
> option as well.

While considering more, I started to think the combination of slot_name and
two_phase should not be allowed. Even if both of them are altered at the same time,
the *old* slot will be modified by the backend process. I feel this inconsistency
should not be happened. In next patch, this check will be added. I also think
failover option should be also fixed, but not touched here. Let's make the scope
narrower.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit,

> > I see that in 0003/0004, the patch first aborts pending prepared
> > transactions, update's catalog, and then change slot's property via
> > walrcv_alter_slot. What if there is any ERROR (say the remote node is
> > not reachable or there is an error while updating the catalog) after
> > we abort the pending prepared transaction? Won't we end up with lost
> > prepared transactions in such a case?

Yes, v16 could happen the case, becasue FinishPreparedTransaction() itself is not
the transactional operation. In below example, the subscription was altered after
stopping the publisher. You could see that prepared transaction were rollbacked.

```
subscriber=# SELECT gid FROM pg_prepared_xacts ;
       gid        
------------------
 pg_gid_16390_741
 pg_gid_16390_742
(2 rows)
subscriber=# ALTER SUBSCRIPTION sub SET (TWO_PHASE = off, FORCE_ALTER = on);
NOTICE:  requested altering to two_phase = false but there are prepared transactions done by the subscription
DETAIL:  Such transactions are being rollbacked.
ERROR:  could not connect to the publisher: connection to server on socket "/tmp/.s.PGSQL.5431" failed: No such file or
directory
        Is the server running locally and accepting connections on that socket?
subscriber=# SELECT gid FROM pg_prepared_xacts ;
 gid 
-----
(0 rows)
```

> Considering the above is a problem the other possibility I thought of
> is to change the order like abort prepared xacts after slot update.
> That is also dangerous because any failure while aborting could make a
> slot change permanent whereas the subscription option will still be
> old value. Now, because the slot's two_phase property is off, at
> commit, it can resend the entire transaction which can create a
> problem because the corresponding prepared transaction will already be
> present.

I feel it is rare case but still possible. E.g., race condition by TwoPhaseStateLock
locking, oom, disk failures and so on.
And since prepared transactions hold locks, duplicated arrival of transactions
may cause table-lock failures. 

> One more thing to think about in this regard is what if we fail after
> aborting a few prepared transactions and not all?

It's bit hard to emulate, but I imagine part of prepared transactions remains.

> At this stage, I am not able to think of a good solution for these
> problems. So, if we don't get a solution for these, we can document
> that users can first manually abort prepared transactions and then
> switch off the two_phase option using Alter Subscription command.

I'm also not sure what should we do. Ideally, it may be happy to make
FinishPreparedTransaction() transactional, but not sure it is realistic. So
changes for aborting prepared txns are removed, documentation patch was added
instead.

Here is a summary of updates for patches. Dropping-prepared-transaction patch
was removed for now.

0001 - Codes for SUBOPT_TWOPHASE_COMMIT are moved per requirement [1].
       Also, checks for failover and two_phase are unified into one function.
0002 - updated accordingly. An argument for the check function is added.
0003 - this contains documentation changes required in [2].

[1]: https://www.postgresql.org/message-id/CAA4eK1%2BFRrL_fLWLsWQGHZRESg39ixzDX_S9hU8D7aFtU%2Ba8uQ%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAA4eK1Khy_YWFoQ1HOF_tGtiixD8YoTg86coX1-ckxt8vK3U%3DQ%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Attachment

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
> 0001 - Codes for SUBOPT_TWOPHASE_COMMIT are moved per requirement [1].
>        Also, checks for failover and two_phase are unified into one function.
> 0002 - updated accordingly. An argument for the check function is added.
> 0003 - this contains documentation changes required in [2].

Previous patch set could not be accepted due to the initialization miss.
PSA new version. 

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Attachment

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Zhijie Hou (Fujitsu)"
Date:
On Tuesday, July 9, 2024 8:53 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote:
> 
> > 0001 - Codes for SUBOPT_TWOPHASE_COMMIT are moved per requirement
> [1].
> >        Also, checks for failover and two_phase are unified into one function.
> > 0002 - updated accordingly. An argument for the check function is added.
> > 0003 - this contains documentation changes required in [2].
> 
> Previous patch set could not be accepted due to the initialization miss.
> PSA new version.

Thanks for the patches ! I initially reviewed the 0001 and found that
the implementation of ALTER_REPLICATION_SLOT has a issue, e.g.
it doesn't handle the case when there is only one specified option
in the replication command:

ALTER_REPLICATION_SLOT slot (two_phase)

In this case, it always overwrites the un-specified option(failover) to false even
when the failover was set to true. I tried to make a small fix which is on
top of 0001 (please see the attachment).

I also added the doc of the new two_phase option of the replication command
and a missing period of errhint in the topup patch.

Best Regards,
Hou zj

Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Tue, Jul 9, 2024 at 6:23 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Previous patch set could not be accepted due to the initialization miss.
> PSA new version.
>

Few minor comments:
=================
0001-patch
1.
.git/rebase-apply/patch:253: space before tab in indent.

errmsg("slot_name and two_phase cannot be altered at the same
time")));
warning: 1 line adds whitespace errors.

White space issue as shown by git am command.

2.
+/*
+ * Common checks for altering failover and two_phase option
+ */
+static void
+CommonChecksForFailoverAndTwophase(Subscription *sub, const char *option,
+    bool isTopLevel)

The function name looks odd to me. How about something along the lines
of CheckAlterSubOption()?

3.
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot disable two_phase when uncommitted prepared
transactions present"),

We can slightly change the above error message to: "cannot disable
two_phase when prepared transactions are present".

0003-patch
Alter the altering from
+      <literal>true</literal> to <literal>false</literal>, the publisher will
+      replicate transactions again when they are committed.

The beginning of the sentence sounds awkward.


--
With Regards,
Amit Kapila.



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit, Hou,

Thanks for giving comments! PSA new versions.
What's new:

0001: included Hou's patch [1] not to overwrite slot options.
      Some other comments were also addressed.
0002: not so changed, just rebased.
0003: Typo was fixed, s/Alter/After/.

[1]:
https://www.postgresql.org/message-id/OS3PR01MB57184E0995521300AC06CB4B94A72%40OS3PR01MB5718.jpnprd01.prod.outlook.com
 

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Peter Smith
Date:
Hi, here are some review comments for patch v18-0001.

======
doc/src/sgml/protocol.sgml

nitpick - Although it is no fault of your patch, IMO it would be nicer for the TWO_PHASE description (of CREATE REPLICATION SLOT) to also be in the same consistent order as what you have (e.g. below FAILOVER). So I moved it.

======
src/backend/access/transam/twophase.c    

LookupGXactBySubid:
nitpick - add a blank line before return

======
src/backend/commands/subscriptioncmds.c  

CommonChecksForFailoverAndTwophase:
nitpick - added Assert for the generic-looking "option" parameter name
nitpick - modified comment about transaction block

~~~

1. AlterSubscription
+ * Workers may still survive even if the subscription has
+ * been disabled. They may read the pg_subscription
+ * catalog and detect that the twophase parameter is
+ * updated, which causes the assertion failure. Ensure
+ * workers have already been exited to avoid it.

"which causes the assertion failure" -- what assertion failure is that? The comment is not very clear.

~

nitpick - in comment /twophase/two_phase/
nitpick - typo /acoordingly/accordingly/

======
src/backend/replication/logical/launcher.c

logicalrep_workers_find:
nitpick - /require_lock/acquire_lock/
nitpick - take the Assert out of the else.

======
src/backend/replication/slot.c            

nitpick - refactor the code to check (failover) only one time. See the nitpicks attachment.

~

2. ParseAlterReplSlotOptions

nitpick -- IMO the ParseAlterReplSlotOptions(). function does more harm than good here by adding the unnecessary complexity of messing around with multiple parameters that are passed-by-reference. All this would be simpler if it was just coded inline in the AlterReplicationSlot() function, which is the only caller. I've refactored all this to demonstrate (see nitpicks attachment)

======
src/include/replication/worker_internal.h

nitpick - /require_lock/acquire_lock/

======
src/test/regress/sql/subscription.sql    

nitpick - tweak comments

======
src/test/subscription/t/021_twophase.pl  

nitpick - change comment style to indicate each test part better.

======
99.
Please also see the attached diffs patch which implements any nitpicks mentioned above.

======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachment

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Zhijie Hou (Fujitsu)"
Date:
On Tuesday, July 16, 2024 1:17 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote
> 
> Dear Amit, Hou,
> 
> Thanks for giving comments! PSA new versions.
> What's new:
> 
> 0001: included Hou's patch [1] not to overwrite slot options.
>       Some other comments were also addressed.

Thanks for the patch!

One more issue I found is that:

+IsTwoPhaseTransactionGidForSubid(Oid subid, char *gid)
+{
+    int            ret;
+    Oid            subid_written;
+    TransactionId xid;
+
+    ret = sscanf(gid, "pg_gid_%u_%u", &subid_written, &xid);
+
+    return (ret == 2 && subid == subid_written);

I think it's not correct to use sscanf here, because it will return the same value
even if the gid is "pg_gid_123_123_123_123..." which isn't a
gid created by the apply worker. I think we should use TwoPhaseTransactionGid
to build the gid string and compare it with each existing gid(strcmp).

Best Regards,
Hou zj





Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Peter Smith
Date:
Here are some review comments for patch v18-0002.

======
src/backend/commands/subscriptioncmds.c

1. CheckAlterSubOption

1a.
It's not obvious why we are only checking the 'slot name' when needs_update==true, but OTOH is always checking the  'enabled' state.

~

1b.
Param 'needs_update' is a vague name. It needs more explanatory comments or a better name. e.g. First impression was "Why are we calling 'Alter' function if needs_update is false?". I know it encapsulates some common code, but if special cases cause the logic to be more confusing then that cost may outweigh the benefit of this function.

~

1c.
If the error checks can be moved to be done up-front, then all the 'needs_update' can be combined. Avoiding multiple checks to 'needs_update' will make this function simpler.
 
~~~

AlterSubscription:
nitpick - typo /needs/need/
nitpick - typo /wo_phase/two_phase/
nitpick - The comment wording "the later part...", was confusing. I've reworded the whole comment. But this belongs in patch 0001.

======
src/test/subscription/t/021_twophase.pl

nitpick - Use the same "###############################" comment style as in patch 0001 to indicate each main TEST scenario.

======
99.
Please refer to the diffs attachment patch, which implements any nitpicks mentioned above.

======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Peter Smith
Date:
Hi, here is my review of the v18-0003 patch.

======
sgml/ref/alter_subscription.sgml

nitpick - some minor tweaks to the documentation text. I also added a link back to the two_phase parameter. Please see the attached diffs file.

======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachment

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Hou, Peter,

Thanks for giving comments! PSA new version.
Almost comments were addressed.
What's new:
0001 - IsTwoPhaseTransactionGidForSubid() was updated per comment from Hou-san [1].
          Some nitpicks were accepted.
0002 - An argument in CheckAlterSubOption() was renamed to " slot_needs_update "
          Some nitpicks were accepted.
0003 - Some nitpicks were accepted.

Below part contains the reason why I rejected some comments.

> CommonChecksForFailoverAndTwophase:
> nitpick - added Assert for the generic-looking "option" parameter name

The style looks strange for me, using multiple strcmp() is more straightforward.
Added like that.

> 1c.
> If the error checks can be moved to be done up-front, then all the 'needs_update'
> can be combined. Avoiding multiple checks to 'needs_update' will make this function simpler.

This style was needed to preserve error condition for failover option. Not changed.

[1]:
https://www.postgresql.org/message-id/OS3PR01MB571834FBD3E6D3804484038F94A32%40OS3PR01MB5718.jpnprd01.prod.outlook.com

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Peter Smith
Date:
Hi, here are my review comments for v19-0001.

======
doc/src/sgml/protocol.sgml

nitpick - Now there is >1 option. /The following option is supported:/The following options are supported:/

======
src/backend/access/transam/twophase.c

TwoPhaseTransactionGid:
nitpick - renamed parameter /gid/gid_res/ to emphasize that this is returned by reference

~~~

1.
IsTwoPhaseTransactionGidForSubid
+ /* Construct the format GID based on the got xid */
+ TwoPhaseTransactionGid(subid, xid, gid_generated, sizeof(gid));

I think the wrong size is being passed here. It should be the buffer size -- e.g. sizeof(gid_generated).

~

nitpick - renamed a couple of vars for readability
nitpick - expanded some comments.

======
src/backend/commands/subscriptioncmds.c

2. AlterSubscription
+ /*
+ * slot_name and two_phase cannot be altered
+ * simultaneously. The latter part refers to the pre-set
+ * slot name and tries to modify the slot option, so
+ * changing both does not make sense.
+ */

I had given a v18-0002 nitpick suggestion to re-word all of this comment. But, as I wrote before [1], that fix belongs here in patch 0001 where the comment was first added.

======
[1] https://www.postgresql.org/message-id/CAHut%2BPsqMRS3dcijo5jsTSbgV1-9So-YBC7PH7xg0%2BZ8hA7fDQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia
Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Peter Smith
Date:
Hi, here are my review comments for patch v19-0002.

======
src/backend/commands/subscriptioncmds.c

CheckAlterSubOption:
nitpick - tweak some comment wording

~

On Wed, Jul 17, 2024 at 3:13 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> > 1c.
> > If the error checks can be moved to be done up-front, then all the 'needs_update'
> > can be combined. Avoiding multiple checks to 'needs_update' will make this function simpler.
>
> This style was needed to preserve error condition for failover option. Not changed.
>

nitpick - Hmm. I think you might be trying to preserve the ordering of
errors when that order is of no consequence. AFAICT which error comes
first here is neither documented nor regression tested. e.g.
reordering them gives no problem for testing, but OTOH reordering them
does simplify the code. Anyway, I have modified the code in my
attached nitpicks diffs to demonstrate this suggestion in case you
change your mind.

~~~

AlterSubscription:
nitpick - let's keep all the variables called 'update_xxx' together.
nitpick - comment typo /needs/need/
nitpick - tweak some comment wording

======
src/test/subscription/t/021_twophase.pl

nitpick - didn't quite understand the "Since we are..." comment. I
reworded it according to what I thought was the intention.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Peter Smith
Date:
Hi Kuroda-San, here are some review comment for patch v19-00001

======
doc/src/sgml/ref/alter_subscription.sgml

The previous patches have common failover/two_phase code checking for
"Do not allow changing the option if the subscription is enabled", but
it seems the docs were mentioning that only for "two_phase" and not
for "failover".

I'm not 100% sure if mentioning about disabled was necessary, but
certainly it should be all-or-nothing, not just saying it for one of
the parameters. Anyway, I chose to add the missing info. Please see
the attached nitpicks diff.

======
Kind Regards,
Peter Smith.
Fujitsu Australia.

Attachment

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Peter,

Thanks for giving comments! PSA new version.
I think most of comments were addressed, and I ran pgindent/pgperltidy again.

Regarding the CheckAlterSubOption(), the ordering is still preserved
because I preferred to keep some specs. But I can agree that yours
make codes simpler.

BTW, I started to think patches can be merged in future versions because
they must be included at once and codes are not so long. Thought?

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Attachment

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Zhijie Hou (Fujitsu)"
Date:
On Thursday, July 18, 2024 10:11 AM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
>
> Dear Peter,
>
> Thanks for giving comments! PSA new version.

I did a few more tests and analysis and didn't find issues. Just share the
cases I tested:

1. After manually rolling back xacts for two_pc and switch two_pc option from
   true to false, does the prepared transaction again get replicated again when
   COMMIT PREPARED happens.

It work as expected in this case. E.g. the transaction will be sent to
subscriber after disabling two_pc.

And I think there wouldn't be race conditions between the ALTER command
and apply worker because user needs to disable the subscription(the apply
worker will stop) before altering the two_phase the option.

And the WALs for the prepared transaction is retained until the COMMIT
PREPARED, because we don't advance the slot's restart_lsn over the ongoing
transactions(e.g. the prepared transaction in this case):

SnapBuildProcessRunningXacts
...
    txn = ReorderBufferGetOldestTXN(builder->reorder);
    ...
    /*
     * oldest ongoing txn might have started when we didn't yet serialize
     * anything because we hadn't reached a consistent state yet.
     */
    if (txn != NULL && txn->restart_decoding_lsn != InvalidXLogRecPtr)
        LogicalIncreaseRestartDecodingForSlot(lsn, txn->restart_decoding_lsn);

So, the data of the prepared transaction is safe.

2. Test when prepare is already processed but we alter the option false to
   true.

This case works as expected as well e.g. the whole transaction will be sent to the
subscriber on COMMIT PREPARE using two_pc flow:

"begin prepare" -> "txn data" -> "prepare" -> "commit prepare"

Due to the same reason in case 1, there is no concurrency issue and the
data of the transaction will be retained until COMMIT PREPARED.

Best Regards,
Hou zj




Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Thu, Jul 18, 2024 at 7:40 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Regarding the CheckAlterSubOption(), the ordering is still preserved
> because I preferred to keep some specs. But I can agree that yours
> make codes simpler.
>

It is better to simplify the code in this case. I have taken care of
this in the attached.

> BTW, I started to think patches can be merged in future versions because
> they must be included at once and codes are not so long. Thought?
>

I agree and have done that in the attached. I have made some
additional changes: (a) removed the unrelated change of two_phase in
protocol.sgml, (b) tried to make the two_phase change before failover
option wherever it makes sense to keep the code consistent, (c)
changed/added comments and doc changes at various places.

I'll continue my review and testing of the patch but I thought of
sharing what I have done till now.

--
With Regards,
Amit Kapila.

Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Peter Smith
Date:
On Thu, Jul 18, 2024 at 9:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
...
> I agree and have done that in the attached. I have made some
> additional changes: (a) removed the unrelated change of two_phase in
> protocol.sgml, (b) tried to make the two_phase change before failover
> option wherever it makes sense to keep the code consistent, (c)
> changed/added comments and doc changes at various places.
>
> I'll continue my review and testing of the patch but I thought of
> sharing what I have done till now.
>

Here some minor comments for patch v21

======
You wrote "tried to make the two_phase change before failover option
wherever it makes sense to keep the code consistent". But, still
failover is coded first in lots of places:
- libpqrcv_alter_slot
- ReplicationSlotAlter
- AlterReplicationSlot
etc.

Q. Why not change those ones?

======
src/backend/access/transam/twophase.c

IsTwoPhaseTransactionGidForSubid:
nitpick - nicer to rename the temporary gid variable: /gid_generated/gid_tmp/

======
src/backend/commands/subscriptioncmds.c

CheckAlterSubOption:
nitpick = function comment period/plural.
nitpick - typo /Samilar/Similar/

======
src/include/replication/slot.h

1.
-extern void ReplicationSlotAlter(const char *name, bool failover);
+extern void ReplicationSlotAlter(const char *name, bool *failover,
+ bool *two_phase);

Use const?

======
99.
Please see attached diffs implementing the nitpicks mentioned above

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Fri, Jul 19, 2024 at 8:06 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> ======
> You wrote "tried to make the two_phase change before failover option
> wherever it makes sense to keep the code consistent". But, still
> failover is coded first in lots of places:
> - libpqrcv_alter_slot
> - ReplicationSlotAlter
> - AlterReplicationSlot
> etc.
>

In ReplicationSlotAlter(), there are error conditions related to
standby and failover slots which are better checked before setting
two_phase property. The main reason for keeping two_phase before the
failover option in subscriptioncmds.c is that SUBOPT_TWOPHASE_COMMIT
was introduced before the equivalent failover option. We can do at
other places as you pointed but I didn't see any compelling reason to
not do what we normally do which is to add the new options at the end.

> ======
> src/include/replication/slot.h
>
> 1.
> -extern void ReplicationSlotAlter(const char *name, bool failover);
> +extern void ReplicationSlotAlter(const char *name, bool *failover,
> + bool *two_phase);
>
> Use const?
>

If so, we need to use const both for failover and two_phase but not
sure if that is required here. We can evaluate that separately if
required by comparing it with similar instances.

> ======
> 99.
> Please see attached diffs implementing the nitpicks mentioned above
>

These look good to me, so will incorporate them in the next patch.

--
With Regards,
Amit Kapila.



Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Fri, Jul 19, 2024 at 10:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > ======
> > src/include/replication/slot.h
> >
> > 1.
> > -extern void ReplicationSlotAlter(const char *name, bool failover);
> > +extern void ReplicationSlotAlter(const char *name, bool *failover,
> > + bool *two_phase);
> >
> > Use const?
> >
>
> If so, we need to use const both for failover and two_phase but not
> sure if that is required here. We can evaluate that separately if
> required by comparing it with similar instances.
>

I checked and found that the patch uses const in walrcv_alter_slot_fn,
so agree that we can change to const here as well.

--
With Regards,
Amit Kapila.



Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Thu, Jul 18, 2024 at 5:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I'll continue my review and testing of the patch but I thought of
> sharing what I have done till now.
>

+ /*
+ * Do not allow changing the option if the subscription is enabled. This
+ * is because both failover and two_phase options of the slot on the
+ * publisher cannot be modified if the slot is currently acquired by the
+ * existing walsender.
+ */
+ if (sub->enabled)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for enabled subscription",
+ option)));

As per my understanding, the above comment is not true when we are
changing 'two_phase' option from 'false' to 'true' because in that
case, the existing walsender will only change it. So, ideally, we can
allow toggling two_phase from 'false' to 'true' without the above
restriction.

If this is correct then we don't even need to error for the case
"cannot alter two_phase when logical replication worker is still
running" when 'two_phase' option is changed from 'false' to 'true'.

Now, assuming the above observations are correct, we may still want to
have the same behavior when toggling two_phase option but we can at
least note down that in the comments so that if required the same can
be changed when toggling 'two_phase' option from 'false' to 'true' in
future.

Thoughts?

--
With Regards,
Amit Kapila.



On Thu, 18 Jul 2024 at 07:41, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Peter,
>
> Thanks for giving comments! PSA new version.

Couple of suggestions:
1) How will user know which all transactions should be rolled back
since the prepared transaction name will be different in subscriber
like pg_gid_16398_750, can we mention some info on how user can
identify these prepared transactions that should be rolled back in the
subscriber or if this information is already available can we point it
from here:
+      When altering <link
linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link>
+      from <literal>true</literal> to <literal>false</literal>, the backend
+      process reports and an error if any prepared transactions done by the
+      logical replication worker (from when <literal>two_phase</literal>
+      parameter was still <literal>true</literal>) are found. You can resolve
+      prepared transactions on the publisher node, or manually roll back them
+      on the subscriber, and then try again.

2)  I'm not sure if InvalidRepOriginId is correct here,  how about
using OidIsValid in the below:
+void
+TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int szgid)
+{
+       Assert(subid != InvalidRepOriginId);

Regards,
Vignesh



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit,

> + /*
> + * Do not allow changing the option if the subscription is enabled. This
> + * is because both failover and two_phase options of the slot on the
> + * publisher cannot be modified if the slot is currently acquired by the
> + * existing walsender.
> + */
> + if (sub->enabled)
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot set %s for enabled subscription",
> + option)));
> 
> As per my understanding, the above comment is not true when we are
> changing 'two_phase' option from 'false' to 'true' because in that
> case, the existing walsender will only change it. So, ideally, we can
> allow toggling two_phase from 'false' to 'true' without the above
> restriction.

Hmm, yes. In "false" -> "true" case, the parameter of the slot is not changed by
the backend process. In this case, the subtwophasestate is changed to PENDING
once, then the walsender will change to ENABLED based on the worker requests.

> If this is correct then we don't even need to error for the case
> "cannot alter two_phase when logical replication worker is still
> running" when 'two_phase' option is changed from 'false' to 'true'.

Basically right, one note is that there is an Assert in maybe_reread_subscription(),
it should be also modified.

> Now, assuming the above observations are correct, we may still want to
> have the same behavior when toggling two_phase option but we can at
> least note down that in the comments so that if required the same can
> be changed when toggling 'two_phase' option from 'false' to 'true' in
> future.
> 
> Thoughts?

+1 to add comments in CheckAlterSubOption(). How about the below draft?

```
@@ -1089,6 +1089,12 @@ CheckAlterSubOption(Subscription *sub, const char *option,
      * is because both failover and two_phase options of the slot on the
      * publisher cannot be modified if the slot is currently acquired by the
      * existing walsender.
+     *
+     * XXX: when toggling two_phase from "false" to "true", the slot parameter
+     * is not modified by the backend process so that the lock conflict won't
+     * occur. The restarted walsender will do the alternation. Therefore, we
+     * can allow to switch without the restriction. This can be changed in
+     * the future based on the requirement.
```


Best regards,
Hayato Kuroda
FUJITSU LIMITED


Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Sat, Jul 20, 2024 at 9:31 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 18 Jul 2024 at 07:41, Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Dear Peter,
> >
> > Thanks for giving comments! PSA new version.
>
> Couple of suggestions:
> 1) How will user know which all transactions should be rolled back
> since the prepared transaction name will be different in subscriber
> like pg_gid_16398_750, can we mention some info on how user can
> identify these prepared transactions that should be rolled back in the
> subscriber or if this information is already available can we point it
> from here:
> +      When altering <link
> linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link>
> +      from <literal>true</literal> to <literal>false</literal>, the backend
> +      process reports and an error if any prepared transactions done by the
> +      logical replication worker (from when <literal>two_phase</literal>
> +      parameter was still <literal>true</literal>) are found. You can resolve
> +      prepared transactions on the publisher node, or manually roll back them
> +      on the subscriber, and then try again.
>

I agree it is better to add information about this.

> 2)  I'm not sure if InvalidRepOriginId is correct here,  how about
> using OidIsValid in the below:
> +void
> +TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int szgid)
> +{
> +       Assert(subid != InvalidRepOriginId);
>

I agree with this point but please note that this patch moves this
function so that it can be used from other places. Also, I think it is
wrong to use InvalidRepOriginId as we are passing here
subscription_oid, so, ideally, we should use InvalidOid but I would
rather prefer OidIsValid() as you suggested.


--
With Regards,
Amit Kapila.



Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Mon, Jul 22, 2024 at 8:26 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> ```
> @@ -1089,6 +1089,12 @@ CheckAlterSubOption(Subscription *sub, const char *option,
>       * is because both failover and two_phase options of the slot on the
>       * publisher cannot be modified if the slot is currently acquired by the
>       * existing walsender.
> +     *
> +     * XXX: when toggling two_phase from "false" to "true", the slot parameter
> +     * is not modified by the backend process so that the lock conflict won't
> +     * occur. The restarted walsender will do the alternation. Therefore, we
> +     * can allow to switch without the restriction. This can be changed in
> +     * the future based on the requirement.
> ```
>
>

I used a slightly different comment in the attached. Apart from this,
I also addressed comments by Vignesh and Peter. Let me know if I
missed anything.

--
With Regards,
Amit Kapila.

Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Peter Smith
Date:
Hi, Patch v22-0001 LGTM apart from the following nitpicks

======
src/sgml/ref/alter_subscription.sgml

nitpick - /one needs to/you need to/

======
src/backend/commands/subscriptioncmds.c

CheckAlterSubOption:
nitpick = "ideally we could have..." doesn't make sense because the
code uses a more consistent/simpler way. So other option was not ideal
after all.

AlterSubscription
nitpick - typo /syncronization/synchronization/
nipick - plural fix

======
Kind Regards,
Peter Smith.
Fujitsu Australia.

Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Mon, Jul 22, 2024 at 2:48 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi, Patch v22-0001 LGTM apart from the following nitpicks
>

I have included these in the attached. The patch looks good to me. I
am planning to push this tomorrow unless there are more comments.

--
With Regards,
Amit Kapila.

Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Tue, Jul 23, 2024 at 4:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jul 22, 2024 at 2:48 PM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Hi, Patch v22-0001 LGTM apart from the following nitpicks
> >
>
> I have included these in the attached. The patch looks good to me. I
> am planning to push this tomorrow unless there are more comments.
>

I merged these changes, made a few other cosmetic changes, and pushed the patch.

--
With Regards,
Amit Kapila.



Amit Kapila <amit.kapila16@gmail.com> writes:
> I merged these changes, made a few other cosmetic changes, and pushed the patch.

There is a CF entry pointing at this thread [1].  Should it be closed?

            regards, tom lane

[1] https://commitfest.postgresql.org/48/4867/



Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Wed, Jul 24, 2024 at 9:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > I merged these changes, made a few other cosmetic changes, and pushed the patch.
>
> There is a CF entry pointing at this thread [1].  Should it be closed?
>

Yes, closed now. Thanks for the reminder.

--
With Regards,
Amit Kapila.



On Thu, 25 Jul 2024 at 08:39, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jul 24, 2024 at 9:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > I merged these changes, made a few other cosmetic changes, and pushed the patch.
> >
> > There is a CF entry pointing at this thread [1].  Should it be closed?
> >
>
> Yes, closed now. Thanks for the reminder.

I noticed one random test failure in my environment with 021_twophase test.
[10:37:01.131](0.053s) ok 24 - should be no prepared transactions on subscriber
error running SQL: 'psql:<stdin>:2: ERROR:  cannot alter two_phase
when logical replication worker is still running
HINT:  Try again after some time.'

We can reproduce the issue by adding a delay at apply_worker_exit like
in the attached Reproduce_random_021_twophase_test_failure.patch
patch.

This is happening because the check here is wrong:
+$node_subscriber->poll_query_until('postgres',
+   "SELECT count(*) = 0 FROM pg_stat_activity WHERE backend_type =
'logical replication worker'"

Here "logical replication worker" should be "logical replication apply worker".

Attached patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From
Amit Kapila
Date:
On Tue, Jul 30, 2024 at 4:02 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 25 Jul 2024 at 08:39, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jul 24, 2024 at 9:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > > I merged these changes, made a few other cosmetic changes, and pushed the patch.
> > >
> > > There is a CF entry pointing at this thread [1].  Should it be closed?
> > >
> >
> > Yes, closed now. Thanks for the reminder.
>
> I noticed one random test failure in my environment with 021_twophase test.
> [10:37:01.131](0.053s) ok 24 - should be no prepared transactions on subscriber
> error running SQL: 'psql:<stdin>:2: ERROR:  cannot alter two_phase
> when logical replication worker is still running
> HINT:  Try again after some time.'
>
> We can reproduce the issue by adding a delay at apply_worker_exit like
> in the attached Reproduce_random_021_twophase_test_failure.patch
> patch.
>
> This is happening because the check here is wrong:
> +$node_subscriber->poll_query_until('postgres',
> +   "SELECT count(*) = 0 FROM pg_stat_activity WHERE backend_type =
> 'logical replication worker'"
>
> Here "logical replication worker" should be "logical replication apply worker".
>
> Attached patch has the changes for the same.
>

LGTM.

--
With Regards,
Amit Kapila.