Re: repeated decoding of prepared transactions - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: repeated decoding of prepared transactions |
Date | |
Msg-id | CAA4eK1JLVfB9hiczRyTt6qLmw90qR3-9ZzeZnHi-52nEw5_SYg@mail.gmail.com |
In response to | Re: repeated decoding of prepared transactions (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: repeated decoding of prepared transactions
Re: repeated decoding of prepared transactions |
List | pgsql-hackers |
On Thu, Feb 11, 2021 at 4:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Feb 8, 2021 at 2:01 PM Markus Wanner
> <markus.wanner@enterprisedb.com> wrote:
> >
>
> Now, coming back to the restart case where the prepared transaction
> can be sent again by the publisher. I understand yours and others
> point that we should not send prepared transaction if there is a
> restart between prepare and commit but there are reasons why we have
> done that way and I am open to your suggestions. I'll once again try
> to explain the exact case to you which is not very apparent. The basic
> idea is that we ship/replay all transactions where commit happens
> after the snapshot has a consistent state (SNAPBUILD_CONSISTENT), see
> atop snapbuild.c for details. Now, for transactions where prepare is
> before snapshot state SNAPBUILD_CONSISTENT and commit prepared is
> after SNAPBUILD_CONSISTENT, we need to send the entire transaction
> including prepare at the commit time. One might think it is quite easy
> to detect that, basically if we skip prepare when the snapshot state
> was not SNAPBUILD_CONSISTENT, then mark a flag in ReorderBufferTxn and
> use the same to detect during commit and accordingly take the decision
> to send prepare but unfortunately it is not that easy. There is always
> a chance that on restart we reuse the snapshot serialized by some
> other Walsender at a location prior to Prepare and if that happens
> then this time the prepare won't be skipped due to snapshot state
> (SNAPBUILD_CONSISTENT) but due to start_decoding_at point (considering
> we have already shipped some of the later commits but not prepare).
> Now, this will actually become the same situation where the restart
> has happened after we have sent the prepare but not commit. This is
> the reason we have to resend the prepare when the subscriber restarts
> between prepare and commit.
>

After further thinking on this problem and some off-list discussions
with Ajin, there appears to be another way to solve the above problem
by which we can avoid resending the prepare after restart if it has
already been processed by the subscriber.

The main reason why we were not able to distinguish between the two
cases ((a) prepare happened before SNAPBUILD_CONSISTENT state but
commit prepared happened after we reach SNAPBUILD_CONSISTENT state and
(b) prepare is already decoded, successfully processed by the
subscriber and we have restarted the decoding) is that, after the
restart, we can re-use a snapshot serialized by some concurrent
WALSender at an LSN location prior to the Prepare.

Now, if we ensure that we don't use serialized snapshots for decoding
via slots where the two_phase decoding option is enabled then we won't
have that problem. The drawback is that in some cases it can take a bit
more time for initial snapshot building but maybe that is better than
the current solution.

Any suggestions?

--
With Regards,
Amit Kapila.
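
For readers following the thread, the reasoning above can be sketched as a small toy model in C. This is not PostgreSQL code: every name below (`Lsn`, `PrepareFate`, `consistent_point`, `classify_prepare`, `send_prepare_at_commit`) is hypothetical, and the model assumes that a snapshot rebuilt from scratch after a restart becomes consistent at the same LSN as it did before the restart. It only illustrates why reusing a serialized snapshot makes cases (a) and (b) look the same, and how skipping that reuse for two_phase slots would keep them apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef uint64_t Lsn;

typedef enum {
    PREPARE_DECODED,              /* decoded and sent at PREPARE time */
    PREPARE_SKIPPED_NO_SNAPSHOT,  /* snapshot not yet consistent: case (a) */
    PREPARE_SKIPPED_ALREADY_SENT  /* before start_decoding_at: case (b) */
} PrepareFate;

/*
 * LSN at which the snapshot counts as consistent after a restart.
 * serialized_at: LSN of a snapshot serialized by another walsender (0 if none).
 * rebuilt_at:    LSN at which a from-scratch build becomes consistent.
 */
Lsn consistent_point(bool two_phase, Lsn serialized_at, Lsn rebuilt_at)
{
    assert(rebuilt_at > 0);
    /* Proposed rule: a two_phase slot never reuses a serialized snapshot. */
    if (!two_phase && serialized_at != 0 && serialized_at < rebuilt_at)
        return serialized_at;
    return rebuilt_at;
}

/* Decide why (or whether) a PREPARE record is skipped while decoding. */
PrepareFate classify_prepare(Lsn prepare_lsn, Lsn consistent_at,
                             Lsn start_decoding_at)
{
    if (prepare_lsn < consistent_at)
        return PREPARE_SKIPPED_NO_SNAPSHOT;
    if (prepare_lsn < start_decoding_at)
        return PREPARE_SKIPPED_ALREADY_SENT;
    return PREPARE_DECODED;
}

/* Only a prepare skipped for lack of a snapshot is sent at COMMIT PREPARED. */
bool send_prepare_at_commit(PrepareFate fate)
{
    return fate == PREPARE_SKIPPED_NO_SNAPSHOT;
}
```

In this model, take a prepare at LSN 100, a serialized snapshot at LSN 50, a from-scratch consistent point at LSN 150 and start_decoding_at at LSN 200. If the serialized snapshot is reused, the prepare is classified as already sent, which is indistinguishable from case (b). If it is not reused, the prepare is classified as skipped for lack of a consistent snapshot, so it is sent at commit time, while a prepare at LSN 250 that was really sent before the restart is not resent.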