Re: pg15b3: recovery fails with wal prefetch enabled - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: pg15b3: recovery fails with wal prefetch enabled
Date
Msg-id 20220901005239.GC31833@telsasoft.com
Whole thread Raw
In response to Re: pg15b3: recovery fails with wal prefetch enabled  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: pg15b3: recovery fails with wal prefetch enabled
List pgsql-hackers
On Thu, Sep 01, 2022 at 12:05:36PM +1200, Thomas Munro wrote:
> On Thu, Sep 1, 2022 at 2:01 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > < 2022-08-31 08:44:10.495 CDT  >LOG:  checkpoint starting: end-of-recovery immediate wait
> > < 2022-08-31 08:44:10.609 CDT  >LOG:  request to flush past end of generated WAL; request 1201/1CAF84F0, current
position1201/1CADB730
 
> > < 2022-08-31 08:44:10.609 CDT  >CONTEXT:  writing block 0 of relation base/16881/2840_vm
> > < 2022-08-31 08:44:10.609 CDT  >ERROR:  xlog flush request 1201/1CAF84F0 is not satisfied --- flushed only to
1201/1CADB730
> > < 2022-08-31 08:44:10.609 CDT  >CONTEXT:  writing block 0 of relation base/16881/2840_vm
> > < 2022-08-31 08:44:10.609 CDT  >FATAL:  checkpoint request failed
> >
> > I was able to start it with -c recovery_prefetch=no, so it seems like
> > prefetch tried to do too much.  The VM runs centos7 under qemu.
> > I'm making a copy of the data dir in cases it's needed.
> 
> Hmm, a page with an LSN set 118208 bytes past the end of WAL.  It's a
> vm fork page (which recovery prefetch should ignore completely).  Did
> you happen to get a copy before the successful recovery?  After the
> successful recovery, what LSN does that page have, and can you find
> the references to it in the WAL with eg pg_waldump -R 1663/16681/2840
> -F vm?  Have you turned FPW off (perhaps this is on ZFS?)?

Yes, I have a copy that reproduces the issue:

#1  0x00000000009a0df4 in errfinish (filename=<optimized out>, filename@entry=0xa15535 "xlog.c",
lineno=lineno@entry=2671,funcname=funcname@entry=0xa1da27 <__func__.22763> "XLogFlush") at elog.c:588
 
#2  0x000000000055f1cf in XLogFlush (record=19795985532144) at xlog.c:2668
#3  0x0000000000813b24 in FlushBuffer (buf=0x7fffdf1f8900, reln=<optimized out>) at bufmgr.c:2889
#4  0x0000000000817a5b in SyncOneBuffer (buf_id=buf_id@entry=7796, skip_recently_used=skip_recently_used@entry=false,
wb_context=wb_context@entry=0x7fffffffcdf0)at bufmgr.c:2576
 
#5  0x00000000008181c2 in BufferSync (flags=flags@entry=358) at bufmgr.c:2164
#6  0x00000000008182f5 in CheckPointBuffers (flags=flags@entry=358) at bufmgr.c:2743
#7  0x00000000005587b2 in CheckPointGuts (checkPointRedo=19795985413936, flags=flags@entry=358) at xlog.c:6855
#8  0x000000000055feb3 in CreateCheckPoint (flags=flags@entry=358) at xlog.c:6534
#9  0x00000000007aceaa in CheckpointerMain () at checkpointer.c:455
#10 0x00000000007aac52 in AuxiliaryProcessMain (auxtype=auxtype@entry=CheckpointerProcess) at auxprocess.c:153
#11 0x00000000007b0bd8 in StartChildProcess (type=<optimized out>) at postmaster.c:5430
#12 0x00000000007b388f in PostmasterMain (argc=argc@entry=7, argv=argv@entry=0xf139e0) at postmaster.c:1463
#13 0x00000000004986a6 in main (argc=7, argv=0xf139e0) at main.c:202

It's not on zfs, and FPW have never been turned off.

I should add that this instance has been pg_upgraded since v10.

BTW, base/16881 is the postgres DB )which has 43GB of logfiles imported from
CSV, plus 2GB of snapshots of pg_control_checkpoint, pg_settings,
pg_stat_bgwriter, pg_stat_database, pg_stat_wal).

postgres=# SELECT * FROM page_header(get_raw_page('pg_toast.pg_toast_2619', 'main', 0));
      lsn      | checksum | flags | lower | upper | special | pagesize | version | prune_xid  
---------------+----------+-------+-------+-------+---------+----------+---------+------------
 1201/1CDD1F98 |    -6200 |     1 |    44 |   424 |    8192 |     8192 |       4 | 3681043287
(1 fila)

postgres=# SELECT * FROM page_header(get_raw_page('pg_toast.pg_toast_2619', 'vm', 0));
      lsn      | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
---------------+----------+-------+-------+-------+---------+----------+---------+-----------
 1201/1CAF84F0 |    -6010 |     0 |    24 |  8192 |    8192 |     8192 |       4 |         0

I found this in waldump (note that you had a typoe - it's 16881).

[pryzbyj@template0 ~]$ sudo /usr/pgsql-15/bin/pg_waldump -R 1663/16881/2840 -F vm -p /mnt/tmp/15/data/pg_wal
00000001000012010000001C
rmgr: Heap2       len (rec/tot):     64/   122, tx:          0, lsn: 1201/1CAF2658, prev 1201/1CAF2618, desc: VISIBLE
cutoffxid 3681024856 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0 FPW, blkref #1: rel 1663/16881/2840 blk
54
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF3AF8, prev 1201/1CAF2788, desc: VISIBLE
cutoffxid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 0
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF3B70, prev 1201/1CAF3B38, desc: VISIBLE
cutoffxid 3671427998 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 2
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF4DC8, prev 1201/1CAF3BB0, desc: VISIBLE
cutoffxid 3672889900 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 4
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF5FB0, prev 1201/1CAF4E08, desc: VISIBLE
cutoffxid 3679743844 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 5
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF7320, prev 1201/1CAF5FF0, desc: VISIBLE
cutoffxid 3679743844 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 6
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF7398, prev 1201/1CAF7360, desc: VISIBLE
cutoffxid 3679751919 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 11
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF7410, prev 1201/1CAF73D8, desc: VISIBLE
cutoffxid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 17
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF7488, prev 1201/1CAF7450, desc: VISIBLE
cutoffxid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 19
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF7500, prev 1201/1CAF74C8, desc: VISIBLE
cutoffxid 3645406844 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 23
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF7578, prev 1201/1CAF7540, desc: VISIBLE
cutoffxid 3669978567 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 24
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF75F0, prev 1201/1CAF75B8, desc: VISIBLE
cutoffxid 0 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 25
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF7668, prev 1201/1CAF7630, desc: VISIBLE
cutoffxid 3681024856 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 26
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF76E0, prev 1201/1CAF76A8, desc: VISIBLE
cutoffxid 3671911724 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 27
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF7758, prev 1201/1CAF7720, desc: VISIBLE
cutoffxid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 34
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF77D0, prev 1201/1CAF7798, desc: VISIBLE
cutoffxid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 35
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF7EF8, prev 1201/1CAF7810, desc: VISIBLE
cutoffxid 3672408544 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 37
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF7F70, prev 1201/1CAF7F38, desc: VISIBLE
cutoffxid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 38 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF7FE8, prev 1201/1CAF7FB0, desc: VISIBLE
cutoffxid 2 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 39
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF8078, prev 1201/1CAF8040, desc: VISIBLE
cutoffxid 3678237783 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 41
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF80F0, prev 1201/1CAF80B8, desc: VISIBLE
cutoffxid 3672408544 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 42
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF8168, prev 1201/1CAF8130, desc: VISIBLE
cutoffxid 3680789266 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 43
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF81E0, prev 1201/1CAF81A8, desc: VISIBLE
cutoffxid 3667994218 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 44
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF8258, prev 1201/1CAF8220, desc: VISIBLE
cutoffxid 3680789266 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 45
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF82D0, prev 1201/1CAF8298, desc: VISIBLE
cutoffxid 3673830395 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 48
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF8348, prev 1201/1CAF8310, desc: VISIBLE
cutoffxid 0 flags 0x03, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 50
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF83C0, prev 1201/1CAF8388, desc: VISIBLE
cutoffxid 3681024856 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 51
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF8438, prev 1201/1CAF8400, desc: VISIBLE
cutoffxid 3681024856 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 52
 
rmgr: Heap2       len (rec/tot):     59/    59, tx:          0, lsn: 1201/1CAF84B0, prev 1201/1CAF8478, desc: VISIBLE
cutoffxid 3678741092 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0, blkref #1: rel 1663/16881/2840 blk 53
 
pg_waldump: error: error en registro de WAL en 1201/1CD90E48: invalid record length at 1201/1CD91010: wanted 24, got 0

I could send our WAL to you if that's desirable ..

-- 
Justin



pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: POC: Better infrastructure for automated testing of concurrency issues
Next
From: Tom Lane
Date:
Subject: Re: Solaris "sed" versus pre-v13 plpython tests