Re: Transactions involving multiple postgres foreign servers, take 2 - Mailing list pgsql-hackers

From Masahiro Ikeda
Subject Re: Transactions involving multiple postgres foreign servers, take 2
Date
Msg-id 75210c4351f6c9e4693e1d9ce78749fe@oss.nttdata.com
Whole thread Raw
In response to Re: Transactions involving multiple postgres foreign servers, take 2  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Responses Re: Transactions involving multiple postgres foreign servers, take 2
List pgsql-hackers
> On 2020-07-17 15:55, Masahiko Sawada wrote:
>> On Fri, 17 Jul 2020 at 11:06, Masahiro Ikeda 
>> <ikedamsh(at)oss(dot)nttdata(dot)com>
>> wrote:
>>> 
>>> On 2020-07-16 13:16, Masahiko Sawada wrote:
>>>> On Tue, 14 Jul 2020 at 17:24, Masahiro Ikeda 
>>>> <ikedamsh(at)oss(dot)nttdata(dot)com>
>>>> wrote:
>>>>> 
>>>>>> I've attached the latest version patches. I've incorporated the 
>>>>>> review
>>>>>> comments I got so far and improved locking strategy.
>>>>> 
>>>>> I want to ask a question about streaming replication with 2PC.
>>>>> Are you going to support 2PC with streaming replication?
>>>>> 
>>>>> I tried streaming replication using v23 patches.
>>>>> I confirm that 2PC works with streaming replication,
>>>>> which there are primary/standby coordinator.
>>>>> 
>>>>> But, in my understanding, the WAL of "PREPARE" and
>>>>> "COMMIT/ABORT PREPARED" can't be replicated to the standby server 
>>>>> in
>>>>> sync.
>>>>> 
>>>>> If this is right, the unresolved transaction can be occurred.
>>>>> 
>>>>> For example,
>>>>> 
>>>>> 1. PREPARE is done
>>>>> 2. crash primary before the WAL related to PREPARE is
>>>>>     replicated to the standby server
>>>>> 3. promote standby server // but can't execute "ABORT PREPARED"
>>>>> 
>>>>> In above case, the remote server has the unresolved transaction.
>>>>> Can we solve this problem to support in-sync replication?
>>>>> 
>>>>> But, I think some users use async replication for performance.
>>>>> Do we need to document the limitation or make another solution?
>>>>> 
>>>> 
>>>> IIUC with synchronous replication, we can guarantee that WAL records
>>>> are written on both primary and replicas when the client got an
>>>> acknowledgment of commit. We don't replicate each WAL records
>>>> generated during transaction one by one in sync. In the case you
>>>> described, the client will get an error due to the server crash.
>>>> Therefore I think the user cannot expect WAL records generated so 
>>>> far
>>>> has been replicated. The same issue could happen also when the user
>>>> executes PREPARE TRANSACTION and the server crashes.
>>> 
>>> Thanks! I didn't noticed the behavior when a user executes PREPARE
>>> TRANSACTION is same.
>>> 
>>> IIUC with 2PC, there is a different point between (1)PREPARE
>>> TRANSACTION
>>> and (2)2PC.
>>> The point is that whether the client can know when the server crashed
>>> and it's global tx id.
>>> 
>>> If (1)PREPARE TRANSACTION is failed, it's ok the client execute same
>>> command
>>> because if the remote server is already prepared the command will be
>>> ignored.
>>> 
>>> But, if (2)2PC is failed with coordinator crash, the client can't 
>>> know
>>> what operations should be done.
>>> 
>>> If the old coordinator already executed PREPARED, there are some
>>> transaction which should be ABORT PREPARED.
>>> But if the PREPARED WAL is not sent to the standby, the new
>>> coordinator
>>> can't execute ABORT PREPARED.
>>> And the client can't know which remote servers have PREPARED
>>> transactions which should be ABORTED either.
>>> 
>>> Even if the client can know that, only the old coordinator knows its
>>> global transaction id.
>>> Only the database administrator can analyze the old coordinator's log
>>> and then execute the appropriate commands manually, right?
>> 
>> I think that's right. In the case of the coordinator crash, the user
>> can look orphaned foreign prepared transactions by checking the
>> 'identifier' column of pg_foreign_xacts on the new standby server and
>> the prepared transactions on the remote servers.
>> 
> I think there is a case we can't check orphaned foreign
> prepared transaction in pg_foreign_xacts view on the new standby 
> server.
> It confuses users and database administrators.
> 
> If the primary coordinator crashes after preparing foreign transaction,
> but before sending XLOG_FDWXACT_INSERT records to the standby server,
> the standby server can't restore their transaction status and
> pg_foreign_xacts view doesn't show the prepared foreign transactions.
> 
> To send XLOG_FDWXACT_INSERT records asynchronously leads this problem.

If the primary replicates XLOG_FDWXACT_INSERT to the standby 
asynchronously,
some prepared transaction may be unsolved forever.

Since I think to solve this inconsistency manually is hard operation,
we need to support synchronous XLOG_FDWXACT_INSERT replication.

I understood that there are a lot of impact to the performance,
but users can control the consistency/durability vs performance
with synchronous_commit parameter.

What do you think?


> Thank you for letting me know. I've attached the latest version patch 
> set.

Thanks for updating.
But, the latest patches failed to be applied to the master branch.

Regards,
-- 
Masahiro Ikeda
NTT DATA CORPORATION



pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Next
From: "osumi.takamichi@fujitsu.com"
Date:
Subject: RE: extension patch of CREATE OR REPLACE TRIGGER