[PATCH] Fix PITR pause bypass when initial XLOG_RUNNING_XACTS has subxid overflow - Mailing list pgsql-hackers

From Matt Blewitt
Subject [PATCH] Fix PITR pause bypass when initial XLOG_RUNNING_XACTS has subxid overflow
Msg-id CACy-Nv24ZORVN9_S_yHF5Nsip45HKCBtKVNC3XdKgz+1wvGvEQ@mail.gmail.com
List pgsql-hackers
Hi folks,

We observed our backup tooling failing periodically for one specific
workload: nested subtransactions overflowing the per-backend subxid
cache. We don't have visibility into the customer's workload (it could be
either SAVEPOINT or EXCEPTION handling), but the TAP test in the patch
reproduces the problem.

The problem detail and proposed fix are described below. Happy to discuss further.

Problem: When the first XLOG_RUNNING_XACTS record seen during recovery has
subxid_overflow=true, the standby enters STANDBY_SNAPSHOT_PENDING and
hot standby never activates (LocalHotStandbyActive stays false).

This causes recovery_target_action = 'pause' to be silently bypassed:
recoveryPausesHere() returns immediately when hot standby is not yet
active, so the pause is skipped and the server promotes instead.

Fix: in PerformWalRecovery(), when the recovery target is reached and
the snapshot is still PENDING, force a transition to STANDBY_SNAPSHOT_READY
and call CheckRecoveryConsistency() to activate hot standby before the
target action switch is evaluated.
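
To make the control flow concrete, here is a small self-contained toy model
of the behaviour described above. It is not PostgreSQL source and not the
patch itself: the struct, the model_* function names, and the with_fix flag
are hypothetical stand-ins that only mirror the identifiers mentioned in
this mail, under the assumption that the pause is skipped purely because
hot standby is inactive.

```c
/*
 * Toy model only. The names mirror identifiers from this mail, but every
 * function body is a simplified, hypothetical stand-in.
 */
#include <stdbool.h>

typedef enum
{
    STANDBY_SNAPSHOT_PENDING,   /* first RUNNING_XACTS had subxid_overflow */
    STANDBY_SNAPSHOT_READY
} HotStandbyState;

typedef enum
{
    OUTCOME_PAUSED,
    OUTCOME_PROMOTED
} TargetOutcome;

typedef struct
{
    HotStandbyState standbyState;
    bool            reachedConsistency;
    bool            LocalHotStandbyActive;
} RecoveryModel;

/* Stand-in for CheckRecoveryConsistency(): activate only when READY. */
void
model_check_recovery_consistency(RecoveryModel *r)
{
    if (r->reachedConsistency &&
        r->standbyState == STANDBY_SNAPSHOT_READY)
        r->LocalHotStandbyActive = true;
}

/* Stand-in for recoveryPausesHere(): returns true only if it paused. */
bool
model_recovery_pauses_here(const RecoveryModel *r)
{
    if (!r->LocalHotStandbyActive)
        return false;           /* hot standby inactive: pause is skipped */
    return true;
}

/* Models reaching the target with recovery_target_action = 'pause'. */
TargetOutcome
model_target_reached(RecoveryModel *r, bool with_fix)
{
    /* Proposed fix: force PENDING to READY before evaluating the action. */
    if (with_fix && r->standbyState == STANDBY_SNAPSHOT_PENDING)
    {
        r->standbyState = STANDBY_SNAPSHOT_READY;
        model_check_recovery_consistency(r);
    }

    if (model_recovery_pauses_here(r))
        return OUTCOME_PAUSED;
    return OUTCOME_PROMOTED;    /* skipped pause falls through to promote */
}
```

Starting from a PENDING snapshot with consistency reached, the model
promotes when with_fix is false and pauses when it is true, which matches
the before/after behaviour the TAP test checks for.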

As I understand it, this is safe because subtransaction
commits write to CLOG but produce no WAL entry, so standbys
always see overflowed subxids as INPROGRESS rather than SUB_COMMITTED.

INPROGRESS subxids are invisible without any SubTrans
lookup, so the missing SubTrans entries that STANDBY_SNAPSHOT_PENDING
guards against cannot cause incorrect visibility results.

Add a TAP test (052_pitr_subxid_overflow.pl) that exercises the scenario:
the overflow transaction is kept open during the base backup's forced
checkpoint so that the very first XLOG_RUNNING_XACTS the standby replays
has subxid_overflow=true.  A named restore point is then created while
the overflow transaction is still open.  Without the fix the standby
promotes silently at the target; with the fix it pauses and accepts
hot-standby queries.

Note: subtransaction XIDs are only assigned when the subtransaction writes,
so gen_subxids() must perform an INSERT at each recursion level to force
the PGPROC subxid cache to overflow.
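
As a rough illustration of why the INSERT matters, the sketch below models
the per-backend subxid cache. It is a simplification, not PostgreSQL code:
SubxidCacheModel and the model_* functions are hypothetical, and the only
value taken from PostgreSQL is the cache size, PGPROC_MAX_CACHED_SUBXIDS,
which is 64.

```c
/* Toy model of the PGPROC subxid cache; all names are hypothetical. */
#include <stdbool.h>

#define PGPROC_MAX_CACHED_SUBXIDS 64    /* cache size used by PostgreSQL */

typedef struct
{
    int     nxids;          /* subxids currently cached */
    bool    overflowed;     /* set once the cache cannot hold another */
} SubxidCacheModel;

/* One subtransaction; 'writes' says whether it performs an INSERT. */
void
model_subxact(SubxidCacheModel *c, bool writes)
{
    if (!writes)
        return;             /* no write: no XID assigned, cache untouched */

    if (c->nxids < PGPROC_MAX_CACHED_SUBXIDS)
        c->nxids++;
    else
        c->overflowed = true;
}

/* Does nesting 'depth' subtransactions overflow the cache? */
bool
model_nested_overflows(int depth, bool writes)
{
    SubxidCacheModel c = {0, false};

    for (int i = 0; i < depth; i++)
        model_subxact(&c, writes);
    return c.overflowed;
}
```

In this model 64 writing levels still fit in the cache, the 65th sets the
overflow flag, and any number of non-writing levels never overflows, which
is why gen_subxids() has to write at every level.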

I think this is a candidate for backpatching to all supported releases.