Re: 回复: Why is XLOG_FPI_FOR_HINT always need backups? - Mailing list pgsql-hackers
| From | Kyotaro Horiguchi |
|---|---|
| Subject | Re: 回复: Why is XLOG_FPI_FOR_HINT always need backups? |
| Date | |
| Msg-id | 20210716.163130.393769596852922603.horikyota.ntt@gmail.com Whole thread Raw |
| In response to | Re: 回复: Why is XLOG_FPI_FOR_HINT always need backups? (Fujii Masao <masao.fujii@oss.nttdata.com>) |
| Responses |
Re: 回复: Why is XLOG_FPI_FOR_HINT always need backups?
|
| List | pgsql-hackers |
At Thu, 15 Jul 2021 22:50:08 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
> On 2021/07/07 16:11, Kyotaro Horiguchi wrote:
> > The doc of wal_log_hints says that "*even* for non-critical
> > modifications of so-called hint bits", which seems to me implies it is
> > following full_page_writes (and I think it is nonsense otherwise, as
> > you suspect).
> > XLogSaveBufferForHint sets REGBUF_FORCE_IMAGE since 2c03216d83116 when
> > the symbol was introduced. As my understanding XLogInsert did not have
> > an ability to enforce FPIs before the commit. The code comment above
> > is older than that commit. So it seems to me a thinko that
> > XLogSaveBufferForHint sets REGBUF_FORCE_IMAGE.
> > I think the attached fixes that thinko.
>
> With the patch, I got the following error during crash recovery.
> I guess this happened because XLOG_FPI_FOR_HINT record had
> no backup blocks even though the replay logic for XLOG_FPI_FOR_HINT
> assumes it contains backup blocks.
>
> FATAL: unexpected XLogReadBufferForRedo result when restoring backup
> block
> CONTEXT: WAL redo at 0/169C600 for XLOG/FPI_FOR_HINT: ; blkref #0: rel
> 1663/13236/16385, blk 0
Sorry, I missed that the XLogReadBufferForRedo is expected to return
BLK_RESTORED. And XLogReadBufferForRedo errors out when it tries to
read nonexistent page without having an FPI (this happens for FSM
pages). Rather than teaching XLogReadBufferExtended to behave
differrently for the case, I choosed to avoid trying to load the page
when the corresponding FPI block is missing in XLOG_FPI_FOR_HINT, as
if the record itself did not exist at all.
Since differently from XLOG_FPI, XLOG_FPI_FOR_HINT has only one page
reference at most, but in the attached the decision whether to read
the page or not is made for each block.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
From a6ce75300d0d0e0234ff0292fda0fe8c2a168ada Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Date: Wed, 7 Jul 2021 15:34:41 +0900
Subject: [PATCH v2] Make FPI_FOR_HINT follow standard FPI emitting policy
Commit 2c03216d831160bedd72d45f712601b6f7d03f1c changed
XLogSaveBufferForHint to enforce FPI but the caller didn't intend to
do so. Restore the intended behavior that FPI_FOR_HINT follows
full_page_writes.
---
src/backend/access/transam/xlog.c | 20 ++++++++++++++++----
src/backend/access/transam/xloginsert.c | 5 +----
2 files changed, 17 insertions(+), 8 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index edb15fe58d..dcd71c01fb 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10126,7 +10126,10 @@ xlog_redo(XLogReaderState *record)
uint8 info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
XLogRecPtr lsn = record->EndRecPtr;
- /* in XLOG rmgr, backup blocks are only used by XLOG_FPI records */
+ /*
+ * in XLOG rmgr, backup blocks are only used by XLOG_FPI and
+ * XLOG_FPI_FOR_HINT records.
+ */
Assert(info == XLOG_FPI || info == XLOG_FPI_FOR_HINT ||
!XLogRecHasAnyBlockRefs(record));
@@ -10345,14 +10348,23 @@ xlog_redo(XLogReaderState *record)
*
* XLOG_FPI_FOR_HINT records are generated when a page needs to be
* WAL- logged because of a hint bit update. They are only generated
- * when checksums are enabled. There is no difference in handling
- * XLOG_FPI and XLOG_FPI_FOR_HINT records, they use a different info
- * code just to distinguish them for statistics purposes.
+ * when checksums and/or wal_log_hints are enabled. The only difference
+ * in handling XLOG_FPI and XLOG_FPI_FOR_HINT records is the latter is
+ * allowed to be missing actual block image. In that case the record is
+ * effectively a NOP record and should not even try to read the page
+ * from disk.
*/
for (uint8 block_id = 0; block_id <= record->max_block_id; block_id++)
{
Buffer buffer;
+ if (!XLogRecHasBlockImage(record, block_id))
+ {
+ if (info == XLOG_FPI)
+ elog(ERROR, "missing full page image in XLOG_FPI record");
+ continue;
+ }
+
if (XLogReadBufferForRedo(record, block_id, &buffer) != BLK_RESTORED)
elog(ERROR, "unexpected XLogReadBufferForRedo result when restoring backup block");
UnlockReleaseBuffer(buffer);
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 3d2c9c3e8c..e596a0470a 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -287,8 +287,6 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
{
registered_buffer *regbuf;
- /* This is currently only used to WAL-log a full-page image of a page */
- Assert(flags & REGBUF_FORCE_IMAGE);
Assert(begininsert_called);
if (block_id >= max_registered_block_id)
@@ -995,7 +993,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
if (lsn <= RedoRecPtr)
{
- int flags;
+ int flags = 0;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
RelFileNode rnode;
@@ -1022,7 +1020,6 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
XLogBeginInsert();
- flags = REGBUF_FORCE_IMAGE;
if (buffer_std)
flags |= REGBUF_STANDARD;
--
2.27.0
pgsql-hackers by date: