Re: corruption of WAL page header is never reported - Mailing list pgsql-hackers

From Ranier Vilela
Subject Re: corruption of WAL page header is never reported
Date
Msg-id CAEudQArCvk=0ukdabGn038XjxhSrTDW794f8_QXCA-Qxd_dmrA@mail.gmail.com
In response to corruption of WAL page header is never reported  (Yugo NAGATA <nagata@sraoss.co.jp>)
Responses Re: corruption of WAL page header is never reported  (Yugo NAGATA <nagata@sraoss.co.jp>)
List pgsql-hackers
On Sat, Jul 17, 2021 at 16:57, Yugo NAGATA <nagata@sraoss.co.jp> wrote:
Hello,

I found that corruption of a WAL page header detected during recovery is never
reported in the log messages. If a WAL page header is broken, this is detected in
XLogReaderValidatePageHeader, called from XLogPageRead, but the error message
is always reset and never reported.

        if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
        {
                /* reset any error XLogReaderValidatePageHeader() might have set */
                xlogreader->errormsg_buf[0] = '\0';
                goto next_record_is_invalid;
        }

Since commit 06687198018, we call XLogReaderValidatePageHeader here so that
we can check a page header and retry immediately if it is invalid, but the error
message is reset immediately and never reported. I guess the error message is
reset because we might get the right WAL after some retries.
However, I think it is better to report the error for each check in order to let
users know the actual issues found in the WAL.

I attached a patch to fix it in this way.

I think that, to keep the same behavior as before, it is necessary to always run:

/* reset any error XLogReaderValidatePageHeader() might have set */
xlogreader->errormsg_buf[0] = '\0';

no?
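
To make the point concrete, here is a minimal standalone sketch of "report the message, then always reset it". It is not PostgreSQL code: MockReader, mock_validate_page_header, check_page_header and the magic value 0xD10D are hypothetical stand-ins chosen only for illustration, and the "report" is just a copy into a caller-supplied buffer rather than a real log call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the reader state; only the error buffer
 * discussed in this thread is modeled. */
typedef struct MockReader
{
    char errormsg_buf[256];
} MockReader;

/* Hypothetical validator: on a bad magic number it stores a message in
 * errormsg_buf and returns false. 0xD10D is an arbitrary "valid" value. */
static bool
mock_validate_page_header(MockReader *reader, unsigned magic)
{
    assert(reader != NULL);
    if (magic != 0xD10D)
    {
        snprintf(reader->errormsg_buf, sizeof(reader->errormsg_buf),
                 "invalid magic number %04X in page header", magic);
        return false;
    }
    return true;
}

/* On failure: copy the message out (the "report" step), then reset the
 * buffer unconditionally so the caller sees the same state as before.
 * On success: nothing is reported and the buffer is left untouched. */
static bool
check_page_header(MockReader *reader, unsigned magic,
                  char *reported, size_t reported_len)
{
    bool ok = mock_validate_page_header(reader, magic);

    reported[0] = '\0';
    if (!ok)
    {
        if (reader->errormsg_buf[0] != '\0')
            snprintf(reported, reported_len, "%s", reader->errormsg_buf);

        /* reset any error the validator might have set */
        reader->errormsg_buf[0] = '\0';
    }
    return ok;
}
```

In this sketch the reset sits outside the "is there a message" test, so the buffer is cleared on every failed validation whether or not anything was reported.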

regards,
Ranier Vilela
