Re: pg15b3: recovery fails with wal prefetch enabled - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: pg15b3: recovery fails with wal prefetch enabled
Date
Msg-id CA+hUKGLHZtSJEtc+hvg0=mawEJHhsbyq6myN65VPMHH=C-Kf9g@mail.gmail.com
In response to Re: pg15b3: recovery fails with wal prefetch enabled  ("Jonathan S. Katz" <jkatz@postgresql.org>)
List pgsql-hackers
On Wed, Sep 7, 2022 at 1:56 AM Jonathan S. Katz <jkatz@postgresql.org> wrote:
> To close this loop, I added a section for "fixed before RC1" to Open
> Items since this is presumably the next release. We can include it there
> once committed.

Done yesterday.

To tie up a couple of loose ends from this thread:

On Thu, Sep 1, 2022 at 2:48 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> Also, pg_waldump seems to fail early with -w:
> [pryzbyj@template0 ~]$ sudo /usr/pgsql-15/bin/pg_waldump -w -R 1663/16881/2840 -F vm -p /mnt/tmp/15/data/pg_wal 00000001000012010000001C
> rmgr: Heap2       len (rec/tot):     64/   122, tx:          0, lsn: 1201/1CAF2658, prev 1201/1CAF2618, desc: VISIBLE cutoffxid 3681024856 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0 FPW, blkref #1: rel 1663/16881/2840 blk 54
> pg_waldump: error: error in WAL record at 1201/1CD90E48: invalid record length at 1201/1CD91010: wanted 24, got 0

That looks OK to me.  With or without -w, we get as far as
1201/1CD91010 and then hit zeroes.
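
To illustrate why zeroes produce "wanted 24, got 0": a record header starts
with its total length, which can never be smaller than the fixed header size
(the 24 in the message). Here is a minimal sketch of that check, not the
actual xlogreader code. The helper name is made up for illustration, and it
assumes the length is a little-endian uint32 at the start of the header.

```python
import struct

# Fixed record header size, taken from the "wanted 24" in the error above.
SIZE_OF_XLOG_RECORD = 24


def check_record_length(buf: bytes, offset: int = 0):
    """Hypothetical helper: validate the total-length field of a record header.

    Assumes the length is stored as a little-endian uint32 at `offset`.
    Returns (ok, message).
    """
    (tot_len,) = struct.unpack_from("<I", buf, offset)
    if tot_len < SIZE_OF_XLOG_RECORD:
        return False, (
            f"invalid record length: wanted {SIZE_OF_XLOG_RECORD}, got {tot_len}"
        )
    return True, f"record length {tot_len}"


# A zero-filled region, as at the end of the written part of a segment.
print(check_record_length(bytes(32)))
```

Any position where the WAL simply stops and the rest of the segment is
zero-filled would fail this way, with or without -w.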

On Thu, Sep 1, 2022 at 5:35 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> So it *looks* like it finished early (and without the expected
> error?).  But it also looks like it replayed that record, according to
> the page LSN.  So which is it?

The reason 1201/1CAF84B0 appeared on a page despite not having been
replayed (due to the bug) is just that vismap pages don't follow the
usual logging rules: they can be read in by heap records that don't
mention the vm page (and therefore carry no FPW for it).  So we can
end up reading a page from disk with a future LSN on it.
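
To make "future LSN" concrete: an LSN printed as hi/lo is a 64-bit WAL
position, and redo normally skips a change when the page's LSN is already at
or beyond the record's. Below is a rough sketch with made-up helper names,
not PostgreSQL code. The record LSN is just the VISIBLE record quoted
earlier in this message, used purely for illustration.

```python
def parse_lsn(text: str) -> int:
    """Hypothetical helper: turn 'hi/lo' hex notation into a 64-bit position."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def page_is_ahead(page_lsn: str, record_lsn: str) -> bool:
    """True if the page already carries an LSN at or beyond the record's.

    Illustrative only: in that case redo would treat the page as up to date
    and skip applying the record to it.
    """
    return parse_lsn(page_lsn) >= parse_lsn(record_lsn)


# Page LSN discussed above vs. the LSN of the quoted VISIBLE record.
print(page_is_ahead("1201/1CAF84B0", "1201/1CAF2658"))
```

A vm page that was read from disk without an FPW can therefore look "newer"
than the record currently being replayed.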


