Antonin Houska <ah@cybertec.at> wrote:
> Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> > * Running `make installcheck` against the patchset, I occasionally get
> > an error on the restart:
> >
> > TRAP: FailedAssertion("BufferIsValid(buffers[nbuffers].buffer)",
> > File: "undorecordset.c", Line: 1098, PID: 6055)
> >
> > From what I see XLogReadBufferForRedoExtended finds an invalid buffer and
> > returns BLK_NOTFOUND. The commentary says:
> >
> > If the block was not found, then it must be discarded later in
> > the WAL.
> >
> > and continues with skip = false, but fails to get a page from an invalid
> > buffer a few lines later. It seems that the skip flag is supposed to be used
> > in this situation; should it also guard the BufferGetPage part?
>
> I saw this a few times too, but can't reproduce it now. It's also not clear
> to me how XLogReadBufferForRedoExtended() can return BLK_NOTFOUND, as the
> whole undo log segment is created at once, even if only part of it is needed -
> see allocate_empty_undo_segment().

I was eventually able to reproduce the problem. The root cause was that WAL
records were created even for temporary / unlogged undo, so only empty pages
could be found during replay. I've fixed that and also set up a regular test
for the BLK_NOTFOUND value. That required a few more fixes to UndoReplay().
Attached is a new version.
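
To illustrate the two points above, here is a rough standalone sketch. It is
not code from the patch: every Sketch* / sketch_* name is made up for this
example, and only the BLK_* suffixes mirror the real XLogRedoAction values.
It assumes that only permanent undo needs WAL, and that replay must not touch
the page when the block was not found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real definitions live in the PostgreSQL tree. */
typedef enum
{
	SKETCH_PERMANENT,
	SKETCH_UNLOGGED,
	SKETCH_TEMP
} SketchPersistence;

typedef enum
{
	SKETCH_BLK_NEEDS_REDO,
	SKETCH_BLK_DONE,
	SKETCH_BLK_RESTORED,
	SKETCH_BLK_NOTFOUND
} SketchRedoAction;

typedef struct
{
	bool		valid;			/* models BufferIsValid() */
	char	   *page;			/* models what BufferGetPage() would return */
} SketchBuffer;

/* Assumption: WAL is emitted only for permanent undo, never temp/unlogged. */
static bool
sketch_undo_needs_wal(SketchPersistence persistence)
{
	return persistence == SKETCH_PERMANENT;
}

/*
 * Return the page to work on during replay, or NULL if the block must be
 * skipped because it was not found (it is discarded later in the WAL).
 */
static char *
sketch_page_for_replay(SketchRedoAction action, const SketchBuffer *buf)
{
	if (action == SKETCH_BLK_NOTFOUND)
		return NULL;			/* guard: never read a page from an invalid buffer */

	/* For every other action the buffer is expected to be valid. */
	assert(buf->valid);
	return buf->page;
}
```

The caller would check for NULL before applying any change to the page, which
is the guard that the skip flag was meant to provide.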
--
Antonin Houska
Web: https://www.cybertec-postgresql.com