Re: [BUG] recovery of prepared transactions during promotion can fail - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: [BUG] recovery of prepared transactions during promotion can fail
Date
Msg-id 20230619.144154.1628674635368115138.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: [BUG] recovery of prepared transactions during promotion can fail  (Michael Paquier <michael@paquier.xyz>)
Responses Re: [BUG] recovery of prepared transactions during promotion can fail
List pgsql-hackers
At Mon, 19 Jun 2023 14:24:44 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Fri, Jun 16, 2023 at 04:27:40PM +0200, Julian Markwort wrote:
> > Note that it is important that the PREPARE entry is in the WAL file
> > that PostgreSQL is writing to prior to the inital crash.
> > This has happened repeatedly in production already with a customer
> > that uses prepared transactions quite frequently.  I assume that
> > this has happened for others too, but the circumstances of the crash
> > and the cause are very dubious, and troubleshooting it is pretty
> > difficult.
> 
> I guess that this is a possibility yes.  I have not heard directly
> about such a report, but perhaps that's just because few people use
> 2PC.

+1

> > This behaviour has - apparently unintentionally - been fixed in PG
> > 15 and upwards (see commit 811051c ), as part of a general
> > restructure and reorganization of this portion of xlog.c (see commit
> > 6df1543 ).
> > 
> > Furthermore, it seems this behaviour does not appear in PG 12 and
> > older, due to another possible bug: In PG 13 and newer, the
> > XLogReaderState is reset in XLogBeginRead() before reading WAL in
> > XlogReadTwoPhaseData() in twophase.c .
> > In the older releases (PG <= 12), this reset is not done, so the
> > requested LSN containing the prepared transaction can (by happy
> > coincidence) be read from in-memory buffers, and PostgreSQL
> > consequently manages to come up just fine (as the WAL has already
> > been read into buffers prior to the .partial rename).  If the older
> > releases also where to properly reset the XLogReaderState, they
> > would also fail to find the LSN on disk, and hence PostgreSQL would
> > crash again.
> 
> That's debatable, but I think that I would let v12 and v11 be as they
> are.  v11 is going to be end-of-life soon and we did not have any
> complains on this matter as far as I know, so there is a risk of
> breaking something upon its last release.  (Got some, Err..
> experiences with that in the past).  On REL_11_STABLE, note for
> example the slight difference with the handling of
> recovery_end_command, where we rely on InRecovery rather than
> ArchiveRecoveryRequested.  REL_12_STABLE is in a more consistent shape
> than v11 regarding that.

Agree about 11, it's no use patching. About 12, I slightly prefer
applying this but I'm fine without it since no actual problem are
seen.


> > I've attached patches for PG 14 and PG 13 that mimic the change in
> > PG15 (commit 811051c ) and reorder the crucial events, placing the
> > recovery of prepared transactions *before* renaming the file. 
> 
> Yes, I think that's OK.  I would like to add two things to your
> proposal for all the existing branches.
> - Addition of a comment where RecoverPreparedTransactions() is called
> at the end of recovery to tell that we'd better do that before working
> on the last partial segment of the old timeline.
> - Enforce the use of archiving in 009_twophase.pl.

Both look good to me.

> > My humble opinion is that this fix should be backpatched to PG 14
> > and PG 13.  It's debatable whether the fix needs to be brought back
> > to 12 and older also, as those do not exhibit this issue, but the 
> > order of renaming is still wrong.
> 
> Yeah, I'd rather wait for somebody to complain about that.  And v11 is
> not worth taking risks with at this time of the year, IMHO.

I don't have a complaint as the whole.

> With your fix included, the patch for REL_14_STABLE would be like the
> attached.  Is that OK for you?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Allow pg_archivecleanup to remove backup history files
Next
From: "Drouvot, Bertrand"
Date:
Subject: Re: Synchronizing slots from primary to standby