Re: v16dev: TRAP: failed Assert("size > SizeOfXLogRecord"), File: "xlog.c", Line: 1055, PID: 13564 - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: v16dev: TRAP: failed Assert("size > SizeOfXLogRecord"), File: "xlog.c", Line: 1055, PID: 13564
Date
Msg-id CAEze2Wg+Pbec3jaaUSF8E6Piy=bq=gXkFCFyaYwFDYTDjCMSvg@mail.gmail.com
In response to v16dev: TRAP: failed Assert("size > SizeOfXLogRecord"), File: "xlog.c", Line: 1055, PID: 13564  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: v16dev: TRAP: failed Assert("size > SizeOfXLogRecord"), File: "xlog.c", Line: 1055, PID: 13564
List pgsql-hackers
On Mon, 17 Apr 2023 at 17:53, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> I hit this assertion while pg_restoring data into a v16 instance.
> postgresql16-server-16-alpha_20230417_PGDG.rhel7.x86_64
>
> wal_level=minimal and pg_dump --single-transaction both seem to be
> required to hit the issue.
>
> $ /usr/pgsql-16/bin/postgres -D ./pg16test -c maintenance_work_mem=1GB -c max_wal_size=16GB -c wal_level=minimal -c max_wal_senders=0 -c port=5678 -c logging_collector=no &
>
> $ time sudo -u postgres /usr/pgsql-16/bin/pg_restore -d postgres -p 5678 --single-transaction --no-tablespace ./curtables
>
> TRAP: failed Assert("size > SizeOfXLogRecord"), File: "xlog.c", Line: 1055, PID: 13564
>
> Core was generated by `postgres: postgres postgres [local] COMMIT                                    '.
> Program terminated with signal 6, Aborted.
> #0  0x00007f28b8bd5387 in raise () from /lib64/libc.so.6
> Missing separate debuginfos, use: debuginfo-install postgresql16-server-16-alpha_20230417_PGDG.rhel7.x86_64
> (gdb) bt
> #0  0x00007f28b8bd5387 in raise () from /lib64/libc.so.6
> #1  0x00007f28b8bd6a78 in abort () from /lib64/libc.so.6
> #2  0x00000000009bc8c9 in ExceptionalCondition (conditionName=conditionName@entry=0xa373e1 "size > SizeOfXLogRecord", fileName=fileName@entry=0xa31b13 "xlog.c", lineNumber=lineNumber@entry=1055) at assert.c:66
> #3  0x000000000057b049 in ReserveXLogInsertLocation (PrevPtr=0x2e3d750, EndPos=<synthetic pointer>, StartPos=<synthetic pointer>, size=24) at xlog.c:1055
> #4  XLogInsertRecord (rdata=rdata@entry=0xf187a0 <hdr_rdt>, fpw_lsn=fpw_lsn@entry=0, flags=<optimized out>, num_fpi=num_fpi@entry=0, topxid_included=topxid_included@entry=false) at xlog.c:844
> #5  0x000000000058210c in XLogInsert (rmid=rmid@entry=0 '\000', info=info@entry=176 '\260') at xloginsert.c:510
> #6  0x0000000000582b09 in log_newpage_range (rel=rel@entry=0x2e1f628, forknum=forknum@entry=FSM_FORKNUM, startblk=startblk@entry=0, endblk=endblk@entry=3, page_std=page_std@entry=false) at xloginsert.c:1317


Looking at log_newpage_range, it seems like we always try to log a
record if startblk < endblk, but we don't register the PageIsNew()
buffers in the range. That means that if the last buffers in the range
are new, the last iteration of the main loop can end up with no buffers
registered at all. This happens when the number of non-new buffers in
the range is 0 (mod 32).

A change like attached should fix the issue; or alternatively we could
force log the last (new) buffer when we detect this edge case.
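
To illustrate the edge case, here is a small standalone model of the
batching loop. It is only a sketch, not the code in xloginsert.c and not
necessarily what the attached patch does: count_empty_records,
MAX_BLOCKS_PER_RECORD and skip_empty_batch are hypothetical names, and
the batch limit of 32 is assumed from the "(mod 32)" remark above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed per-record batch limit, taken from the "(mod 32)" remark. */
#define MAX_BLOCKS_PER_RECORD 32

/*
 * Hypothetical model of the batching loop.  is_new[i] tells whether
 * block i is a new page.  Returns how many records would be inserted
 * with zero registered buffers, which is the case the Assert trips on.
 * With skip_empty_batch set, a batch that collected nothing is skipped
 * instead of being logged.
 */
int
count_empty_records(const bool *is_new, int nblocks, bool skip_empty_batch)
{
    int blkno = 0;
    int empty_records = 0;

    assert(nblocks == 0 || is_new != NULL);

    while (blkno < nblocks)
    {
        int nbufs = 0;

        /* Collect a batch; new pages are skipped, never registered. */
        while (nbufs < MAX_BLOCKS_PER_RECORD && blkno < nblocks)
        {
            if (!is_new[blkno])
                nbufs++;
            blkno++;
        }

        if (nbufs == 0)
        {
            /* Only reachable once every remaining block was new. */
            if (skip_empty_batch)
                break;
            empty_records++;
        }
    }

    return empty_records;
}
```

In this model, a range of three new pages (like startblk=0, endblk=3 in
the backtrace) yields one empty record without the guard and none with
it. The same holds for 32 non-new pages followed by one new page.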

Kind regards,

Matthias van de Meent

Attachment
