
Re: A failure in 031_recovery_conflict.pl on Debian/s390x - Mailing list pgsql-hackers

From	Thomas Munro
Subject	Re: A failure in 031_recovery_conflict.pl on Debian/s390x
Date	August 10, 2023 13:55:24
Msg-id	CA+hUKGJs8mskHt=38dFQYkucv0H44xTy=EDF0=D0sGuJms3DBw@mail.gmail.com
In response to	Re: A failure in 031_recovery_conflict.pl on Debian/s390x (Christoph Berg <myon@debian.org>)
Responses	Re: A failure in 031_recovery_conflict.pl on Debian/s390x
List	pgsql-hackers


On Thu, Aug 10, 2023 at 9:15 PM Christoph Berg <myon@debian.org> wrote:
> No XXX lines this time either, but I've seen them in logfiles that
> went through successfully.

Hmm.  Well, I think this looks like a different kind of bug then.
That patch of mine is about fixing some unsafe coding on the receiving
side of a signal.  In this case it's apparently not being sent.  So
either the Heap2/PRUNE record was able to proceed (indicating that
that CURSOR was not holding a pin as expected), or VACUUM decided not
to actually do anything to that block (conditional cleanup lock vs
transient pin changing behaviour?), or there's a bug somewhere in/near
LockBufferForCleanup(), which should have emitted that XXX message
before even calling ResolveRecoveryConflictWithBufferPin().

Do you still have the data directories around from that run, so we can
see if the expected Heap2/PRUNE was actually logged?  For example
(using meson layout here, in the build directory) that'd be something
like:

$ ./tmp_install/home/tmunro/install/bin/pg_waldump \
    testrun/recovery/031_recovery_conflict/data/t_031_recovery_conflict_standby_data/pgdata/pg_wal/000000010000000000000003

In there I see this:

rmgr: Heap2       len (rec/tot):     57/    57, tx:          0, lsn:
0/0344BB90, prev 0/0344BB68, desc: PRUNE snapshotConflictHorizon: 0,
nredirected: 0, ndead: 1, nunused: 0, redirected: [], dead: [21],
unused: [], blkref #0: rel 1663/16385/16386 blk 0
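
If the segment is large, picking that record out by eye gets tedious. Here is a rough sketch of a filter for pg_waldump text output, assuming each record arrives on a single line (the record above is only wrapped by my mail client). The function name and regular expressions are mine, not anything that ships with PostgreSQL:

```python
import re

# Fields of interest in one line of pg_waldump text output.
RECORD_RE = re.compile(
    r"rmgr:\s+(?P<rmgr>\S+).*?lsn:\s+(?P<lsn>[0-9A-Fa-f]+/[0-9A-Fa-f]+),"
    r".*?desc:\s+(?P<op>\S+)"
)
# Block reference, e.g. "rel 1663/16385/16386 blk 0".
BLKREF_RE = re.compile(r"rel\s+(?P<rel>\d+/\d+/\d+),?\s+blk\s+(?P<blk>\d+)")


def find_prune_records(lines):
    """Return LSN, relation and block for every Heap2/PRUNE record seen."""
    found = []
    for line in lines:
        m = RECORD_RE.search(line)
        if not m or m.group("rmgr") != "Heap2" or m.group("op") != "PRUNE":
            continue
        ref = BLKREF_RE.search(line)
        found.append({
            "lsn": m.group("lsn"),
            "rel": ref.group("rel") if ref else None,
            "blk": int(ref.group("blk")) if ref else None,
        })
    return found


# The record quoted above, joined back into one line.
sample = (
    "rmgr: Heap2       len (rec/tot):     57/    57, tx:          0, "
    "lsn: 0/0344BB90, prev 0/0344BB68, desc: PRUNE snapshotConflictHorizon: 0, "
    "nredirected: 0, ndead: 1, nunused: 0, redirected: [], dead: [21], "
    "unused: [], blkref #0: rel 1663/16385/16386 blk 0"
)
records = find_prune_records([sample])
```

To run it against a real segment, the pg_waldump output could be piped in and passed as find_prune_records(sys.stdin).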

That's the WAL record that's supposed to be causing
031_recovery_conflict_standby.log to talk about a conflict, starting
with this:

2023-08-10 22:47:04.564 NZST [57145] LOG:  recovery still waiting
after 10.035 ms: recovery conflict on buffer pin
2023-08-10 22:47:04.564 NZST [57145] CONTEXT:  WAL redo at 0/344BB90
for Heap2/PRUNE: snapshotConflictHorizon: 0, nredirected: 0, ndead: 1,
 nunused: 0, redirected: [], dead: [21], unused: []; blkref #0: rel
1663/16385/16386, blk 0
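
Note that pg_waldump prints the LSN as 0/0344BB90 while the server log prints 0/344BB90, so a plain string comparison between the two will miss the match. A small sketch of how the two could be compared, assuming the LOG and CONTEXT entries each sit on one line in the real log file (they are wrapped above) and that CONTEXT directly follows its LOG line; the helper names are made up for illustration:

```python
import re

CONTEXT_RE = re.compile(r"WAL redo at (?P<lsn>[0-9A-Fa-f]+/[0-9A-Fa-f]+)")


def parse_lsn(text):
    """Convert an LSN such as '0/344BB90' into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def conflict_logged_for(log_lines, wal_lsn):
    """True if a buffer pin conflict was logged for the record at wal_lsn."""
    target = parse_lsn(wal_lsn)
    previous = ""
    for line in log_lines:
        m = CONTEXT_RE.search(line)
        if (m and parse_lsn(m.group("lsn")) == target
                and "recovery conflict on buffer pin" in previous):
            return True
        previous = line
    return False


# The two log entries quoted above, joined back into single lines.
sample_log = [
    "2023-08-10 22:47:04.564 NZST [57145] LOG:  recovery still waiting "
    "after 10.035 ms: recovery conflict on buffer pin",
    "2023-08-10 22:47:04.564 NZST [57145] CONTEXT:  WAL redo at 0/344BB90 "
    "for Heap2/PRUNE: snapshotConflictHorizon: 0, nredirected: 0, ndead: 1, "
    "nunused: 0, redirected: [], dead: [21], unused: []; blkref #0: rel "
    "1663/16385/16386, blk 0",
]
logged = conflict_logged_for(sample_log, "0/0344BB90")
```

Parsing both halves as hexadecimal sidesteps the zero-padding difference between the two tools.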
