Thread: nbtdesc.c and nbtpage.c are inconsistent withXLOG_BTREE_META_CLEANUP (11~)
nbtdesc.c and nbtpage.c are inconsistent withXLOG_BTREE_META_CLEANUP (11~)
From
Michael Paquier
Date:
Hi all, (Adding in CC relevant committer and author, Teodor and Alexander) I have been looking today at a crash of pg_waldump on one of the test builds keeping running in some internal environment. Luckily, I have been able to put my hands on a core file with 11.2 running. The backtrace is not that interesting: (gdb) bt #0 btree_desc (buf=0x0, record=0x22ce590) at nbtdesc.c:103 #1 0x000000000040419f in XLogDumpDisplayRecord (config=0x7fff1ccfd360, record=0x22ce590) at /build/mts/release/bora-13598999/vpostgres/src/postgres/src/bin/pg_waldump/pg_waldump.c:558 #2 main (argc=<optimized out>, argv=<optimized out>) at /build/mts/release/bora-13598999/vpostgres/src/postgres/src/bin/pg_waldump/pg_waldump.c:1170 (gdb) down 1 #0 btree_desc (buf=0x0, record=0x22ce590) at nbtdesc.c:103 103 nbtdesc.c: No such file or directory. (gdb) p record $1 = (XLogReaderState *) 0x22ce590 (gdb) p *record $2 = {wal_segment_size = 16777216, read_page = 0x405220 <XLogDumpReadPage>, system_identifier = 0, private_data = 0x7fff1ccfd380, ReadRecPtr = 67109592, EndRecPtr = 67109672, decoded_record = 0x22cf178, main_data = 0x0, main_data_len = 0, main_data_bufsz = 0, record_origin = 0, blocks = {{in_use = true, rnode = {spcNode = 16399, dbNode = 16386, relNode = 19907}, forknum = MAIN_FORKNUM, blkno = 0, flags = 96 '`', has_image = false, apply_image = false, bkp_image = 0x0, hole_offset = 0, hole_length = 0, bimg_len = 0, bimg_info = 0 '\000', has_data = true, data = 0x22db2c0 "\003", data_len = 32, data_bufsz = 8192}, {in_use = false, rnode = {spcNode = 0, dbNode = 0, relNode = 0}, forknum = MAIN_FORKNUM, blkno = 0, flags = 0 '\000', has_image = false, apply_image = false, bkp_image = 0x0, hole_offset = 0, hole_length = 0, bimg_len = 0, bimg_info = 0 '\000', has_data = false, data = 0x0, data_len = 0, data_bufsz = 0} <repeats 32 times>}, max_block_id = 0, readBuf = 0x22ceea0 "\230\320\a", readLen = 8192, readSegNo = 4, readOff = 0, readPageTLI = 0, latestPagePtr = 67108864, latestPageTLI = 1, currRecPtr = 67109592, currTLI = 0, currTLIValidUntil = 0, nextTLI = 0, readRecordBuf = 0x22d12b0 "L", readRecordBufSize = 40960, errormsg_buf = 0x22d0eb0 ""} (gdb) p xlrec $5 = (xl_btree_metadata *) 0x0 Anyway, after looking at the code relevant to XLOG_BTREE_META_CLEANUP, I have noticed that the meta-data associated to the first buffer is registered via XLogRegisterBufData() (this is correct because we want to associate this data to the metadata buffer). However, nbtdesc.c assumes that xl_btree_metadata is from the main record data, causing a crash because we have nothing in this case. I think that we could just patch nbtpage.c so as we fetch the metadata using XLogRecGetBlockData(), and log its data. The error comes from 3d92796, which is one year-old and has been visibly committed untested. I am surprised that we have not seen that complain yet. Attached is a patch, which looks right to me and should be back-patched down to v11. I have not taken the time to actually test it though. Thoughts? -- Michael
Attachment
Re: nbtdesc.c and nbtpage.c are inconsistent with XLOG_BTREE_META_CLEANUP(11~)
From
Peter Geoghegan
Date:
On Sun, Jun 16, 2019 at 6:31 PM Michael Paquier <michael@paquier.xyz> wrote: > I think that we could just patch nbtpage.c so as we fetch the > metadata using XLogRecGetBlockData(), and log its data. Don't you mean nbtdesc.c? > The error > comes from 3d92796, which is one year-old and has been visibly > committed untested. I am surprised that we have not seen that > complain yet. Why is that surprising? https://coverage.postgresql.org/src/backend/access/rmgrdesc/index.html -- Peter Geoghegan
Re: nbtdesc.c and nbtpage.c are inconsistent withXLOG_BTREE_META_CLEANUP (11~)
From
Michael Paquier
Date:
On Sun, Jun 16, 2019 at 06:54:57PM -0700, Peter Geoghegan wrote: > On Sun, Jun 16, 2019 at 6:31 PM Michael Paquier <michael@paquier.xyz> wrote: >> I think that we could just patch nbtpage.c so as we fetch the >> metadata using XLogRecGetBlockData(), and log its data. > > Don't you mean nbtdesc.c? Yeah I meant nbtdesc.c, sorry. This will have to wait after this week's release for a fix by the way... > Why is that surprising? > > https://coverage.postgresql.org/src/backend/access/rmgrdesc/index.html I would have supposed that more people scan WAL records using the description callbacks. -- Michael
Attachment
Re: nbtdesc.c and nbtpage.c are inconsistent with XLOG_BTREE_META_CLEANUP(11~)
From
Peter Geoghegan
Date:
On Sun, Jun 16, 2019 at 7:05 PM Michael Paquier <michael@paquier.xyz> wrote: > I would have supposed that more people scan WAL records using the > description callbacks. The WAL record in question, XLOG_BTREE_META_CLEANUP, is certainly one of the less common record types used by nbtree. I agree that this should have been tested when it went in, but I'm not surprised that the bug remained undetected for a year. Not that many people use pg_waldump. -- Peter Geoghegan
Re: nbtdesc.c and nbtpage.c are inconsistent withXLOG_BTREE_META_CLEANUP (11~)
From
Michael Paquier
Date:
On Sun, Jun 16, 2019 at 07:14:05PM -0700, Peter Geoghegan wrote: > The WAL record in question, XLOG_BTREE_META_CLEANUP, is certainly one > of the less common record types used by nbtree. I agree that this > should have been tested when it went in, but I'm not surprised that > the bug remained undetected for a year. Not that many people use > pg_waldump. Actually, a simple installcheck generates a handful of them. I have not actually run into a crash, but this causes pg_waldump to describe the record incorrectly. Committed down to 11 after cross-checking that the data inserted in the WAL record and what gets described are both consistent. _bt_restore_meta() does the right thing by the way when restoring the page. -- Michael