Re: Corruption during WAL replay - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Corruption during WAL replay
Date
Msg-id 20220316.141432.2298656526174566963.horikyota.ntt@gmail.com
In response to Re: Corruption during WAL replay  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Corruption during WAL replay  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
At Tue, 15 Mar 2022 12:44:49 -0400, Robert Haas <robertmhaas@gmail.com> wrote in 
> On Wed, Jan 26, 2022 at 3:25 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> > The attached is the fixed version and it surely works with the repro.
> 
> Hi,
> 
> I spent the morning working on this patch and came up with the
> attached version. I wrote substantial comments in RelationTruncate(),
> where I tried to make it more clear exactly what the bug is here, and
> also in storage/proc.h, where I tried to clarify both the use of the
> DELAY_CHKPT_* flags in general terms. If nobody is too sad about this
> version, I plan to commit it.

Thanks for picking this up and for your time.  The additional comments
seem to describe the flags more clearly.

storage.c:
+     * Make sure that a concurrent checkpoint can't complete while truncation
+     * is in progress.
+     *
+     * The truncation operation might drop buffers that the checkpoint
+     * otherwise would have flushed. If it does, then it's essential that
+     * the files actually get truncated on disk before the checkpoint record
+     * is written. Otherwise, if replay begins from that checkpoint, the
+     * to-be-truncated buffers might still exist on disk but have older
+     * contents than expected, which can cause replay to fail. It's OK for
+     * the buffers to not exist on disk at all, but not for them to have the
+     * wrong contents.

FWIW, this seems to slightly conflate a buffer with its contents.  I
can read it correctly, so I don't mind if it reads naturally enough.
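
To make the ordering hazard described in the quoted comment concrete,
here is a toy model in C. It is only a sketch under stated assumptions:
ToyState, toy_checkpoint() and toy_stale_block_after_crash() are
invented for illustration and are not PostgreSQL code, and the single
delay_checkpoint boolean merely stands in for the DELAY_CHKPT_* flags.
The model has one block on "disk" and one dirty buffer, and it assumes
a crash right after the checkpoint attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: one block on "disk" plus one shared buffer for it. */
typedef struct ToyState
{
    bool block_on_disk;     /* does the block still exist in the file? */
    int  disk_version;      /* version of the contents stored on disk */
    bool buffer_dirty;      /* is there an unwritten buffer for the block? */
    int  buffer_version;    /* newer contents that exist only in memory */
    bool delay_checkpoint;  /* hypothetical stand-in for a DELAY_CHKPT_* flag */
    bool checkpoint_done;   /* has a checkpoint record been written? */
} ToyState;

/* Flush dirty buffers and "write the checkpoint record", unless delayed. */
static void
toy_checkpoint(ToyState *s)
{
    assert(s != NULL);

    if (s->delay_checkpoint)
        return;             /* checkpoint must wait for the truncation */

    if (s->buffer_dirty)
    {
        s->disk_version = s->buffer_version;
        s->buffer_dirty = false;
    }
    s->checkpoint_done = true;
}

/*
 * Returns true if recovery would start from a checkpoint while the
 * to-be-truncated block still exists on disk with older contents.
 */
static bool
toy_stale_block_after_crash(bool use_delay_flag)
{
    ToyState s = {
        .block_on_disk = true,
        .disk_version = 1,
        .buffer_dirty = true,
        .buffer_version = 2,
        .delay_checkpoint = use_delay_flag,
        .checkpoint_done = false
    };

    /* Truncation, step 1: drop the dirty buffer without writing it out. */
    s.buffer_dirty = false;

    /* A concurrent checkpoint tries to complete before step 2. */
    toy_checkpoint(&s);

    /* Assumed crash here: step 2 (truncating the file on disk) never runs. */

    return s.checkpoint_done && s.block_on_disk &&
           s.disk_version < s.buffer_version;
}
```

In this model, toy_stale_block_after_crash(false) reports the hazard,
because the checkpoint completes while the block is still on disk with
version 1. toy_stale_block_after_crash(true) does not, because the
checkpoint never completes; the model assumes recovery would then start
from an earlier checkpoint and rebuild the block from WAL.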

Otherwise, all the added/revised comments look fine. Thanks for the
work.

> I think it should be back-patched, too, but that looks like a bit of a
> pain. I think every back-branch will require different adjustments.

I'll try that.  If you are already working on it, please let me know.
(Though it is more than likely already too late for that.)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


