Re: Global snapshots - Mailing list pgsql-hackers
From | Alexey Kondratov |
---|---|
Subject | Re: Global snapshots |
Date | |
Msg-id | 3cdde64facc19a2c1baad1a3993a3b96@postgrespro.ru |
In response to | Re: Global snapshots (Bruce Momjian <bruce@momjian.us>) |
List | pgsql-hackers |
On 2020-09-18 00:54, Bruce Momjian wrote:
> On Tue, Sep 8, 2020 at 01:36:16PM +0300, Alexey Kondratov wrote:
>> Thank you for the link!
>>
>> After a quick look on the Sawada-san's patch set I think that there are two
>> major differences:
>>
>> 1. There is a built-in foreign xacts resolver in the [1], which should be
>> much more convenient from the end-user perspective. It involves huge in-core
>> changes and additional complexity that is of course worth of.
>>
>> However, it's still not clear for me that it is possible to resolve all
>> foreign prepared xacts on the Postgres' own side with a 100% guarantee.
>> Imagine a situation when the coordinator node is actually a HA cluster group
>> (primary + sync + async replica) and it failed just after PREPARE stage of
>> after local COMMIT. In that case all foreign xacts will be left in the
>> prepared state. After failover process complete synchronous replica will
>> become a new primary. Would it have all required info to properly resolve
>> orphan prepared xacts?
>>
>> Probably, this situation is handled properly in the [1], but I've not yet
>> finished a thorough reading of the patch set, though it has a great doc!
>>
>> On the other hand, previous 0003 and my proposed patch rely on either manual
>> resolution of hung prepared xacts or usage of external monitor/resolver.
>> This approach is much simpler from the in-core perspective, but doesn't look
>> as complete as [1] though.
>
> Have we considered how someone would clean up foreign transactions if the
> coordinating server dies? Could it be done manually? Would an external
> resolver, rather than an internal one, make this easier?

Both Sawada-san's patch [1] and the patches in this thread (e.g. mine [2]) use 2PC with a special gid format that includes an xid plus server identification info.
Thus, one can select from pg_prepared_xacts, get the xid and coordinator info, then use txid_status() on the coordinator (or ex-coordinator) to get the transaction status, and finally either commit or abort these stale prepared xacts. Of course, this could be wrapped into some user-level support routines, as is done in [1].

As for the benefits of using an external resolver, I think there are some from the whole-system perspective:

1) If one follows the logic above, the resolver can be stateless: it takes all the required info from the Postgres nodes themselves.

2) You can then easily put it into a container, which makes it easier to deploy to all this 'cloud' stuff like Kubernetes.

3) You can also scale resolvers independently of the Postgres nodes.

I do not think that any of these points is a game changer, but we use a very simple external resolver together with [2] in our sharding prototype, and it works just fine so far.

[1] https://www.postgresql.org/message-id/CA%2Bfd4k4HOVqqC5QR4H984qvD0Ca9g%3D1oLYdrJT_18zP9t%2BUsJg%40mail.gmail.com

[2] https://www.postgresql.org/message-id/3ef7877bfed0582019eab3d462a43275%40postgrespro.ru

--
Alexey Kondratov
Postgres Professional https://www.postgrespro.com
Russian Postgres Company
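The manual resolution steps described in this message (read the gid from pg_prepared_xacts, ask the coordinator for the transaction status, then commit or abort) could be sketched roughly as below. This is only an illustration under stated assumptions: the gid layout `fx_<xid>_<coordinator>` and the helper names `parse_gid`, `status_query`, and `resolution_sql` are hypothetical and are not the format or routines used by either patch set. It also assumes the xid embedded in the gid is a value that txid_status() accepts. The sketch only builds SQL strings; running them against the participant and coordinator nodes is left to whatever client the resolver uses.

```python
from typing import NamedTuple, Optional

# Hypothetical gid layout, for illustration only: "fx_<xid>_<coordinator>".
GID_PREFIX = "fx"


class ForeignXact(NamedTuple):
    xid: int          # transaction id on the coordinator (assumed usable with txid_status)
    coordinator: str  # coordinator identification info taken from the gid


def parse_gid(gid: str) -> Optional[ForeignXact]:
    """Parse a gid read from pg_prepared_xacts; return None if it is not ours."""
    parts = gid.split("_", 2)
    if len(parts) != 3 or parts[0] != GID_PREFIX:
        return None
    if not (parts[1].isascii() and parts[1].isdigit()):
        return None
    return ForeignXact(int(parts[1]), parts[2])


def status_query(xid: int) -> str:
    """SQL to run on the coordinator (or ex-coordinator)."""
    return f"SELECT txid_status({xid})"


def resolution_sql(gid: str, status: Optional[str]) -> Optional[str]:
    """Map the coordinator's txid_status() result to an action on the participant."""
    quoted = gid.replace("'", "''")
    if status == "committed":
        return f"COMMIT PREPARED '{quoted}'"
    if status == "aborted":
        return f"ROLLBACK PREPARED '{quoted}'"
    # 'in progress' or NULL (status unknown): leave the prepared xact alone.
    return None
```

Because every input comes from the Postgres nodes themselves, a resolver built this way keeps no state of its own, which is the property point 1) relies on.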