Re: [BUG] Panic due to incorrect missingContrecPtr after promotion - Mailing list pgsql-hackers

From Imseih (AWS), Sami
Subject Re: [BUG] Panic due to incorrect missingContrecPtr after promotion
Date
Msg-id 1577E3F5-E9A0-4E08-9B8D-8486572F354C@amazon.com
Whole thread Raw
In response to Re: [BUG] Panic due to incorrect missingContrecPtr after promotion  ("Imseih (AWS), Sami" <simseih@amazon.com>)
Responses Re: [BUG] Panic due to incorrect missingContrecPtr after promotion
List pgsql-hackers
> Would you mind trying the second attached to abtain detailed log on
> your testing environment? With the patch, the modified TAP test yields
> the log lines like below.

I applied the logging patch to 13.7 ( attached is the backport ) and repro'd the 
Issue.

I stripped out the relevant parts of the file. Let me know if this is
helpful.

postgresql.log.2022-08-05-17:2022-08-05 17:18:51 UTC::@:[359]:LOG:  ### [S] @0/48A3400: abort=(0/0)0/0, miss=(0/0)0/0,
SbyMode=0,SbyModeReq=1
 
postgresql.log.2022-08-05-17:2022-08-05 17:18:51 UTC::@:[359]:LOG:  ### [F] @0/10000000: abort=(0/0)0/0, miss=(0/0)0/0,
SbyMode=0,SbyModeReq=1
 
postgresql.log.2022-08-05-17:2022-08-05 17:22:21 UTC::@:[359]:LOG:  ### [S] @0/10000060: abort=(0/0)0/0, miss=(0/0)0/0,
SbyMode=1,SbyModeReq=1
 
postgresql.log.2022-08-05-18:2022-08-05 18:38:00 UTC::@:[359]:LOG:  ### [F] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0,
SbyMode=1,SbyModeReq=1
 
postgresql.log.2022-08-05-18:2022-08-05 18:38:05 UTC::@:[359]:LOG:  ### [F] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0,
SbyMode=1,SbyModeReq=1
 
postgresql.log.2022-08-05-18:2022-08-05 18:38:06 UTC::@:[359]:LOG:  ### [F] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0,
SbyMode=1,SbyModeReq=1
 
postgresql.log.2022-08-05-18:2022-08-05 18:38:11 UTC::@:[359]:LOG:  ### [F] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0,
SbyMode=1,SbyModeReq=1
 
postgresql.log.2022-08-05-18:2022-08-05 18:38:14 UTC::@:[359]:LOG:  ### [F] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0,
SbyMode=1,SbyModeReq=1
 
postgresql.log.2022-08-05-18:2022-08-05 18:38:14 UTC::@:[359]:LOG:  ### [S] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0,
SbyMode=0,SbyModeReq=1
 
postgresql.log.2022-08-05-18:2022-08-05 18:50:13 UTC::@:[359]:LOG:  ### [S] @6/B8000198: abort=(0/0)0/0, miss=(0/0)0/0,
SbyMode=0,SbyModeReq=1
 
postgresql.log.2022-08-05-18:2022-08-05 18:50:20 UTC::@:[359]:LOG:  ### [A] @6/F3FFFF20: abort=(6/F3FFFF20)0/0,
miss=(6/F4000000)0/0,SbyMode=0, SbyModeReq=1
 
postgresql.log.2022-08-05-18:2022-08-05 18:50:20 UTC::@:[359]:LOG:  ### [S] @6/F4000030: abort=(0/0)6/F3FFFF20,
miss=(0/0)6/F4000000,SbyMode=1, SbyModeReq=1
 
postgresql.log.2022-08-05-20:2022-08-05 20:38:06 UTC::@:[359]:LOG:  ### [F] @E/17A78298: abort=(0/0)6/F3FFFF20,
miss=(0/0)6/F4000000,SbyMode=1, SbyModeReq=1
 
postgresql.log.2022-08-05-20:2022-08-05 20:38:11 UTC::@:[359]:LOG:  ### [F] @E/17A78298: abort=(0/0)6/F3FFFF20,
miss=(0/0)6/F4000000,SbyMode=1, SbyModeReq=1
 
postgresql.log.2022-08-05-20:2022-08-05 20:38:13 UTC::@:[359]:LOG:  ### [F] @E/17A78298: abort=(0/0)6/F3FFFF20,
miss=(0/0)6/F4000000,SbyMode=1, SbyModeReq=1
 
postgresql.log.2022-08-05-20:2022-08-05 20:38:18 UTC::@:[359]:LOG:  ### [F] @E/17A78298: abort=(0/0)6/F3FFFF20,
miss=(0/0)6/F4000000,SbyMode=1, SbyModeReq=1
 
postgresql.log.2022-08-05-20:2022-08-05 20:38:21 UTC::@:[359]:LOG:  ### [F] @E/17A78298: abort=(0/0)6/F3FFFF20,
miss=(0/0)6/F4000000,SbyMode=1, SbyModeReq=1
 
postgresql.log.2022-08-05-20:2022-08-05 20:38:21 UTC::@:[359]:LOG:  ### [S] @E/17A78298: abort=(0/0)6/F3FFFF20,
miss=(0/0)6/F4000000,SbyMode=0, SbyModeReq=1
 
....
.......
.........

2022-08-05 20:38:21 UTC::@:[359]:LOG:  received promote request
2022-08-05 20:38:21 UTC::@:[359]:LOG:  redo done at E/17A78270
2022-08-05 20:38:21 UTC::@:[359]:LOG:  last completed transaction was at log time 2022-08-05 20:38:06.398382+00
recovering 000000020000000E00000005
000000020000000E00000005 archive /rdsdbdata/log/restore/pg-wal-archive.901.* is not yet downloaded, exiting restore
scriptfor now
 
2022-08-05 20:38:21 UTC::@:[359]:LOG:  ### [S] @E/17A78298: abort=(0/0)6/F3FFFF20, miss=(0/0)6/F4000000, SbyMode=0,
SbyModeReq=1
recovering 00000003.history
2022-08-05 20:38:21 UTC::@:[359]:LOG:  selected new timeline ID: 3
2022-08-05 20:38:21 UTC::@:[359]:LOG:  archive recovery complete
recovering 00000002.history
2022-08-05 20:38:22 UTC::@:[373]:LOG:  request to flush past end of generated WAL; request D/C7A425C0, currpos
6/F4000088
2022-08-05 20:38:22 UTC::@:[373]:CONTEXT:  writing block 0 of relation base/14301/26678
2022-08-05 20:38:22 UTC::@:[373]:ERROR:  xlog flush request D/C7A425C0 is not satisfied --- flushed only to 6/F4000088
2022-08-05 20:38:22 UTC::@:[373]:CONTEXT:  writing block 0 of relation base/14301/26678
2022-08-05 20:38:22 UTC::@:[357]:LOG:  database system is ready to accept connections
2022-08-05 20:38:23 UTC::@:[373]:LOG:  checkpoint starting: immediate force wait wal time
2022-08-05 20:38:23 UTC::@:[373]:LOG:  request to flush past end of generated WAL; request D/F5873CF0, currpos
6/F8000060
2022-08-05 20:38:23 UTC::@:[373]:PANIC:  xlog flush request D/F5873CF0 is not satisfied --- flushed only to 6/F8000060
2022-08-05 20:38:23 UTC::@:[357]:LOG:  checkpointer process (PID 373) was terminated by signal 6: Aborted
2022-08-05 20:38:23 UTC::@:[357]:LOG:  terminating any other active server processes
/etc/rds/dbbin/pgscripts/rds_wal_archive: line 19: 14546 Quit                    (core dumped) usleep 50000
2022-08-05 20:38:23 UTC::@:[375]:FATAL:  archive command failed with exit code 131
2022-08-05 20:38:23 UTC::@:[375]:DETAIL:  The failed archive command was: /etc/rds/dbbin/pgscripts/rds_wal_archive
pg_wal/00000003000000060000003D


Thanks,

Sami Imseih
Amazon Web Services



Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Cleaning up historical portability baggage
Next
From: Tom Lane
Date:
Subject: Draft back-branch release notes are up