Re: corruption of WAL page header is never reported - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: corruption of WAL page header is never reported |
Date | |
Msg-id | 20210902.131716.1067877126401208860.horikyota.ntt@gmail.com Whole thread Raw |
In response to | Re: corruption of WAL page header is never reported (Fujii Masao <masao.fujii@oss.nttdata.com>) |
Responses |
Re: corruption of WAL page header is never reported
Re: corruption of WAL page header is never reported |
List | pgsql-hackers |
At Thu, 2 Sep 2021 12:19:25 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > On 2021/07/19 18:52, Yugo NAGATA wrote: > > Well, I think my first patch was not wrong. The difference with the > > latest > > patch is just whether we perform the additional check when we are not > > in > > standby mode or not. The behavior is basically the same although which > > function > > detects and prints the page-header error in cases of crash recovery is > > different. > > Yes, so which patch do you think is better? I like your version > because there seems no reason why XLogPageRead() should skip > XLogReaderValidatePageHeader() when not in standby mode. Did you read the comment just above? xlog.c:12523 > * Check the page header immediately, so that we can retry immediately if > * it's not valid. This may seem unnecessary, because XLogReadRecord() > * validates the page header anyway, and would propagate the failure up to > * ReadRecord(), which would retry. However, there's a corner case with > * continuation records, if a record is split across two pages such that So when not in standby mode, the same check is performed by xlogreader which has the responsibility to validate the binary data read by XLogPageRead. The page-header validation is a compromise to save a specific case. > Also I'm tempted to move ereport() and reset of errmsg_buf to > under next_record_is_invalid as follows. That is, in standby mode > whenever we find an invalid record and retry reading WAL page > in XLogPageRead(), we report the error message and reset it. > For now in XLogPageRead(), only XLogReaderValidatePageHeader() > sets errmsg_buf, but in the future other code or function doing that > may be added. For that case, the following change seems more elegant. > Thought? I don't think it is good choice to conflate read-failure and header validation failure from the view of modularity. > * shouldn't be a big deal from a performance point of view. > */ > if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf)) > - { > - /* reset any error XLogReaderValidatePageHeader() might have set */ > - xlogreader->errormsg_buf[0] = '\0'; > goto next_record_is_invalid; > - } > return readLen; > @@ -12517,7 +12513,17 @@ next_record_is_invalid: > /* In standby-mode, keep trying */ > if (StandbyMode) > + { > + if (xlogreader->errormsg_buf[0]) > + { > + ereport(emode_for_corrupt_record(emode, EndRecPtr), > + (errmsg_internal("%s", xlogreader->errormsg_buf))); > + > + /* reset any error XLogReaderValidatePageHeader() might have set */ > + xlogreader->errormsg_buf[0] = '\0'; > + } > goto retry; > + } > else > return -1; > } regards. -- Kyotaro Horiguchi NTT Open Source Software Center
pgsql-hackers by date: