Re: corruption of WAL page header is never reported - Mailing list pgsql-hackers

From Yugo NAGATA
Subject Re: corruption of WAL page header is never reported
Date
Msg-id 20210718232716.4a22d1fd673c0f633f68e5ac@sraoss.co.jp
In response to Re: corruption of WAL page header is never reported  (Ranier Vilela <ranier.vf@gmail.com>)
List pgsql-hackers
On Sat, 17 Jul 2021 18:40:02 -0300
Ranier Vilela <ranier.vf@gmail.com> wrote:

> On Sat, Jul 17, 2021 at 16:57, Yugo NAGATA <nagata@sraoss.co.jp> wrote:
> 
> > Hello,
> >
> > I found that corruption of a WAL page header detected during recovery is
> > never reported in log messages. If a WAL page header is broken, it is
> > detected in XLogReaderValidatePageHeader, called from XLogPageRead, but the
> > error messages are always reset and never reported.
> >
> >         if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
> >         {
> >                /* reset any error XLogReaderValidatePageHeader() might have set */
> >                xlogreader->errormsg_buf[0] = '\0';
> >                goto next_record_is_invalid;
> >         }
> >
> > Since commit 06687198018, we call XLogReaderValidatePageHeader here so that
> > we can check a page header and retry immediately if it is invalid, but the
> > error message is reset immediately and not reported. I guess the error
> > message is reset because we might get the right WAL after some retries.
> > However, I think it is better to report the error for each check in order
> > to let users know the actual issues found in the WAL.
> >
> > I attached a patch to fix it in this way.
> >
> I think that, to keep the same behavior as before, it is necessary to always run:
> 
> /* reset any error XLogReaderValidatePageHeader() might have set */
> xlogreader->errormsg_buf[0] = '\0';
> 
> Isn't it?

If we are not in StandbyMode, the check is not retried and an error is returned
immediately. So I think we don't have to display an error message in such cases,
nor reset it. Instead, it would be better to leave the error message
handling to the caller of XLogReadRecord.
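
To make the intended control flow concrete, here is a minimal standalone sketch of the behavior described above, not the actual patch. MockReader, mock_validate_page_header, mock_page_read, and MOCK_PAGE_MAGIC are hypothetical stand-ins rather than PostgreSQL code, and the magic value and message text are arbitrary. It assumes that in standby mode the message is logged and then reset because the read will be retried, while outside standby mode the buffer is left untouched for the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Arbitrary value for this sketch; not the real WAL page magic. */
#define MOCK_PAGE_MAGIC 0xABCDu

/* Hypothetical stand-in for the reader state; only the error buffer is modeled. */
typedef struct MockReader
{
    char errormsg_buf[128];
} MockReader;

/* Hypothetical validator: on a bad magic it records a message and returns false. */
static bool
mock_validate_page_header(MockReader *reader, unsigned magic)
{
    assert(reader != NULL);
    if (magic != MOCK_PAGE_MAGIC)
    {
        snprintf(reader->errormsg_buf, sizeof(reader->errormsg_buf),
                 "invalid magic number %04X in page header", magic);
        return false;
    }
    return true;
}

/*
 * Returns true if the page header is valid.  On failure:
 *  - standby mode: log the message, count it in *reported, then reset the
 *    buffer, since the read is assumed to be retried;
 *  - otherwise: leave errormsg_buf alone so the caller can report it.
 */
static bool
mock_page_read(MockReader *reader, unsigned magic, bool standby_mode,
               int *reported)
{
    if (mock_validate_page_header(reader, magic))
        return true;

    if (standby_mode)
    {
        if (reader->errormsg_buf[0] != '\0')
        {
            fprintf(stderr, "LOG: %s\n", reader->errormsg_buf);
            (*reported)++;
        }
        reader->errormsg_buf[0] = '\0';
    }
    return false;
}
```

In this sketch a failed read in standby mode leaves an empty buffer behind after logging once, whereas a failed read outside standby mode leaves the message in place for whoever called the reader.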

Regards,
Yugo Nagata

-- 
Yugo NAGATA <nagata@sraoss.co.jp>


