Re: Refactor recovery conflict signaling a little - Mailing list pgsql-hackers

From Alexander Lakhin
Subject Re: Refactor recovery conflict signaling a little
Date
Msg-id 3e07149d-060b-48a0-8f94-3d5e4946ae45@gmail.com
In response to Re: Refactor recovery conflict signaling a little  (Xuneng Zhou <xunengzhou@gmail.com>)
Responses Re: Refactor recovery conflict signaling a little
Re: Refactor recovery conflict signaling a little
List pgsql-hackers
Hello Xuneng and Heikki,

04.03.2026 07:33, Xuneng Zhou wrote:
> 03.03.2026 17:39, Heikki Linnakangas wrote:
>> On 24/02/2026 10:00, Alexander Lakhin wrote:
>>> The "terminating process ..." message doesn't appear when the test passes
>>> successfully.
>>
>> Hmm, right, looks like something wrong in signaling the recovery conflict. I can't tell if the signal is being sent,
>> or it's not processed correctly. Looking at the code, I don't see anything wrong.
>
> I was unable to reproduce the issue on an x86_64 Linux machine using
> the provided script. All test runs completed successfully without any
> failures.

I've added debug logging (see attached) and saw the following in a failed run:
!!!SignalRecoveryConflict[282363]
!!!ProcArrayEndTransaction| pendingRecoveryConflicts = 0
!!!ProcessInterrupts[283863]| MyProc->pendingRecoveryConflicts: 0
!!!ProcessInterrupts[283863]| MyProc->pendingRecoveryConflicts: 0
2026-03-07 12:21:24.544 EET walreceiver[282421] FATAL:  could not receive data from WAL stream: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
2026-03-07 12:21:24.645 EET postmaster[282355] LOG:  received immediate shutdown request
2026-03-07 12:21:24.647 EET postmaster[282355] LOG:  database system is shut down

For a successful run, by contrast, I see:
2026-03-07 12:18:17.075 EET startup[285260] DETAIL:  The slot conflicted with xid horizon 677.
2026-03-07 12:18:17.075 EET startup[285260] CONTEXT:  WAL redo at 0/04022130 for Heap2/PRUNE_ON_ACCESS: snapshotConflictHorizon: 677, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 2, nunused: 0, dead: [35, 36]; blkref #0: rel 1663/16384/16418, blk 10
!!!SignalRecoveryConflict[285260]
!!!ProcessInterrupts[286071]| MyProc->pendingRecoveryConflicts: 16
!!!ProcessRecoveryConflictInterrupts[286071]
!!!ProcessRecoveryConflictInterrupts[286071] pending: 16, reason: 4
2026-03-07 12:18:17.075 EET walsender[286071] 035_standby_logical_decoding.pl ERROR:  canceling statement due to conflict with recovery
2026-03-07 12:18:17.075 EET walsender[286071] 035_standby_logical_decoding.pl DETAIL:  User was using a logical replication slot that must be invalidated.
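
For reading the debug output: in the successful run the pending value is 16 and the reported reason is 4, which is consistent with pendingRecoveryConflicts being a bitmask with one bit per conflict reason (16 == 1 << 4), while the failed run only ever shows 0, that is, no bit set by the time ProcessInterrupts looks at it. The bitmask reading is my assumption from these numbers, not something taken from the patch. A minimal sketch of that decoding, with hypothetical helper names that do not exist in PostgreSQL:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not PostgreSQL code: return the index of the lowest
 * set bit in a pending-conflicts mask, or -1 if no bit is set.
 * Assumes one bit per conflict reason, as the "pending: 16, reason: 4"
 * debug line suggests.
 */
static int
lowest_pending_reason(uint32_t pending)
{
    for (int reason = 0; reason < 32; reason++)
    {
        if (pending & (UINT32_C(1) << reason))
            return reason;
    }
    return -1;
}

/*
 * Hypothetical helper: print one line per set bit, in the same shape as the
 * debug output above, and return how many reasons were pending.
 */
static int
dump_pending_reasons(uint32_t pending)
{
    int count = 0;

    for (int reason = 0; reason < 32; reason++)
    {
        if (pending & (UINT32_C(1) << reason))
        {
            printf("pending: %u, reason: %d\n", (unsigned) pending, reason);
            count++;
        }
    }
    return count;
}
```

Under that assumption, lowest_pending_reason(16) gives 4, matching the good run, and lowest_pending_reason(0) gives -1, matching what the failed run shows at both ProcessInterrupts calls.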

(Full logs for this failed run and a good run are attached.)

Best regards,
Alexander
