Re: corruption of WAL page header is never reported - Mailing list pgsql-hackers

From Yugo NAGATA
Subject Re: corruption of WAL page header is never reported
Date
Msg-id 20210719160039.23486c8b79d2e89a3a21a978@sraoss.co.jp
In response to Re: corruption of WAL page header is never reported  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: corruption of WAL page header is never reported  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
On Mon, 19 Jul 2021 15:14:41 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

> Hello.
> 
> At Sun, 18 Jul 2021 04:55:05 +0900, Yugo NAGATA <nagata@sraoss.co.jp> wrote in 
> > Hello,
> > 
> > I found that any corruption of WAL page header found during recovery is never
> > reported in log messages. If wal page header is broken, it is detected in
> > XLogReaderValidatePageHeader called from  XLogPageRead, but the error messages
> > are always reset and never reported.
> 
> Good catch!  Currently recovery stops showing no reason if it is
> stopped by page-header errors.
> 
> > I attached a patch to fix it in this way.
> 
> However, it is a kind of a roof-over-a-roof.  What we should do is
> just omitting the check in XLogPageRead while in standby mode.

Your patch doesn't fix the issue that the error message is never reported in
standby mode. When a WAL page header is broken, the standby would silently keep
retrying forever.

I think we have to let users know about corruption of a WAL page header even in
standby mode, don't we? By the way, corruption of a WAL record header is always
reported. (See how XLogReadRecord calls ValidXLogRecordHeader.)
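
To illustrate the behaviour I mean, here is a minimal standalone sketch. It is
not the PostgreSQL code: the Reader struct, PAGE_MAGIC, the message text and
both functions are hypothetical stand-ins for the xlogreader state,
XLogReaderValidatePageHeader and XLogPageRead. It only shows how a validation
message can be lost when it is reset before anyone logs it, and what reporting
it before the reset would look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define PAGE_MAGIC 0xD10D       /* hypothetical page magic, for illustration */

/* Hypothetical reader state; loosely modelled on an error-message buffer. */
typedef struct
{
    char errormsg[128];         /* last validation error, empty if none */
    char last_logged[128];      /* last message actually written to the log */
    int  reported;              /* number of messages written to the log */
} Reader;

/* Validate a page header; on failure, store a message in the reader. */
static bool
validate_page_header(Reader *r, unsigned magic)
{
    assert(r != NULL);

    if (magic != PAGE_MAGIC)
    {
        snprintf(r->errormsg, sizeof(r->errormsg),
                 "invalid magic number %04X in page header", magic);
        return false;
    }
    r->errormsg[0] = '\0';
    return true;
}

/*
 * Simulated page read. With report_before_reset = false the message is
 * cleared without being logged, so the caller retries silently. With
 * report_before_reset = true the message is logged first, then cleared.
 */
static bool
page_read(Reader *r, unsigned magic, bool report_before_reset)
{
    if (validate_page_header(r, magic))
        return true;

    if (report_before_reset && strlen(r->errormsg) > 0)
    {
        fprintf(stderr, "LOG:  %s\n", r->errormsg);
        snprintf(r->last_logged, sizeof(r->last_logged), "%s", r->errormsg);
        r->reported++;
    }

    r->errormsg[0] = '\0';      /* reset so that the caller can retry */
    return false;
}
```

Calling page_read() with a wrong magic and report_before_reset = false fails
and leaves nothing behind, which is the silent retry I described. With
report_before_reset = true the same failure is logged once before the buffer
is cleared, so the user can at least see why the standby is retrying.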


Regards,
Yugo Nagata


-- 
Yugo NAGATA <nagata@sraoss.co.jp>


