Re: repeated decoding of prepared transactions - Mailing list pgsql-hackers
From | Markus Wanner |
---|---|
Subject | Re: repeated decoding of prepared transactions |
Date | |
Msg-id | 415799ff-89bb-a78e-2f79-7f29834d0460@enterprisedb.com |
In response to | Re: repeated decoding of prepared transactions (Ajin Cherian <itsajin@gmail.com>) |
List | pgsql-hackers |
Ajin, Amit,

thank you both a lot for thinking this through and even providing a patch.

The changes in expectation for twophase.out match exactly what I prepared. And the switch with pg_logical_slot_get_changes is indeed something I had not yet considered, either.

On 19.02.21 03:50, Ajin Cherian wrote:
> For this, I am planning to change the semantics such that
> two-phase-commit can only be specified while creating the slot using
> pg_create_logical_replication_slot()
> and not in pg_logical_slot_get_changes, thus preventing
> two-phase-commit flag from being toggled between restarts of the
> decoder. Let me know if anybody objects to this
> change, else I will update that in the next patch.

This sounds like a good plan to me, yes.

However, more generally speaking, I suspect you are overthinking this. All of the complexity arises because of the assumption that an output plugin receiving and confirming a PREPARE may not be able to persist that first phase of transaction application. As a consequence, you are trying to somehow resurrect the transactional changes and the prepare at COMMIT PREPARED time and decode them in a deferred way.

Instead, I'm arguing that a PREPARE is an atomic operation just like a transaction's COMMIT. The decoder should always feed these in the order of appearance in the WAL. For example, if you have PREPARE A, COMMIT B, COMMIT PREPARED A in the WAL, the decoder should always output these events in exactly that order, and never as COMMIT B, PREPARE A, COMMIT PREPARED A. (This is currently violated in the expectation for twophase_snapshot, because the COMMIT for `s1insert` there appears after the PREPARE of `s2p` in the WAL, but gets decoded before it.)

The patch I'm attaching corrects this expectation in twophase_snapshot, adds an explanatory diagram, and eliminates any danger of sending PREPAREs at COMMIT PREPARED time, thereby preserving the ordering of PREPAREs vs COMMITs.
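To make the ordering argument concrete, here is a minimal Python sketch of the two behaviours. It is only a toy model, not PostgreSQL code: the `WalRecord` type, both function names, and the LSN values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    lsn: int    # position in the (toy) WAL; values are invented
    kind: str   # "PREPARE", "COMMIT" or "COMMIT PREPARED"
    xact: str   # transaction label, e.g. "A"


def decode_in_wal_order(wal):
    """Emit every record in the order it appears in the WAL."""
    return [f"{r.kind} {r.xact}" for r in sorted(wal, key=lambda r: r.lsn)]


def decode_deferring_prepare(wal):
    """Hold each PREPARE back and emit it just before its COMMIT PREPARED."""
    out = []
    pending = set()
    for r in sorted(wal, key=lambda r: r.lsn):
        if r.kind == "PREPARE":
            pending.add(r.xact)          # remembered, but not emitted yet
        elif r.kind == "COMMIT PREPARED" and r.xact in pending:
            pending.discard(r.xact)
            out.append(f"PREPARE {r.xact}")
            out.append(f"COMMIT PREPARED {r.xact}")
        else:
            out.append(f"{r.kind} {r.xact}")
    return out


# The example from the text: PREPARE A, COMMIT B, COMMIT PREPARED A.
wal = [
    WalRecord(100, "PREPARE", "A"),
    WalRecord(200, "COMMIT", "B"),
    WalRecord(300, "COMMIT PREPARED", "A"),
]

print(decode_in_wal_order(wal))
print(decode_deferring_prepare(wal))
```

The first function keeps PREPARE A ahead of COMMIT B, as in the WAL; the second reorders them, which is the behaviour argued against above.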
Given the output plugin supports two-phase commit, I argue there must be a good reason for it setting the start_decoding_at LSN to a point in time after a PREPARE. To me that means the output plugin (or its downstream replica) has processed the PREPARE (and the downstream replica did whatever it needed to do on its side in order to make the transaction ready to be committed in a second phase).

(In the weird case of an output plugin that wants to enable two-phase commit but does not really support it downstream, it's still possible for it to hold back LSN confirmations for prepared-but-still-in-flight transactions. However, I'm having a hard time justifying this use case.)

With that line of thinking, the point in time (or in WAL) of the COMMIT PREPARED does not matter at all to reason about the decoding of the PREPARE operation. Instead, there are only exactly two cases to consider:

a) the PREPARE happened before the start_decoding_at LSN and must not be decoded. (But the effects of the PREPARE must then be included in the initial synchronization. If that's not supported, the output plugin should not enable two-phase commit.)

b) the PREPARE happens after the start_decoding_at LSN and must be decoded. (It obviously is not included in the initial synchronization or decoded by a previous instance of the decoder process.)

The case where the PREPARE lies before SNAPBUILD_CONSISTENT must always be case a) where we must not repeat the PREPARE, anyway. And in case b) where we need a consistent snapshot to decode the PREPARE, existing provisions already guarantee that to be possible (or how would this be different from a regular single-phase commit?).

Please let me know what you think and whether this approach is feasible for you as well.

Regards

Markus
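The two cases a) and b) from the message reduce to a single comparison against the start_decoding_at LSN. A rough Python sketch, with a hypothetical function name and invented LSN values; treating a PREPARE located exactly at start_decoding_at as case b) is an assumption, since the message only speaks of "before" and "after":

```python
def classify_prepare(prepare_lsn: int, start_decoding_at: int) -> str:
    """Decide what the decoder does with a PREPARE, ignoring where the
    matching COMMIT PREPARED lies in the WAL.

    Assumption: a PREPARE exactly at start_decoding_at counts as case b).
    """
    if prepare_lsn < start_decoding_at:
        # Case a): already handled downstream or covered by the initial
        # synchronization, so it must not be decoded again.
        return "skip"
    # Case b): not yet seen downstream, so it must be decoded.
    return "decode"


start_decoding_at = 500  # invented value
for prepare_lsn in (400, 600):
    print(prepare_lsn, classify_prepare(prepare_lsn, start_decoding_at))
```

Note that the LSN of the COMMIT PREPARED is not an input at all, which is the point of the argument.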
Attachment