Home > mailing lists

Re: Corruption during WAL replay - Mailing list pgsql-hackers

From	Kyotaro Horiguchi
Subject	Re: Corruption during WAL replay
Date	March 24, 2020 12:18:12
Msg-id	20200324.181812.758439179869796321.horikyota.ntt@gmail.com Whole thread Raw
In response to	Corruption during WAL replay (Teja Mupparti <tejeswarm@hotmail.com>)
Responses	Re: Corruption during WAL replay (Andres Freund <andres@anarazel.de>)
List	pgsql-hackers

Tree view

Thanks for working on this.

At Mon, 23 Mar 2020 20:56:59 +0000, Teja Mupparti <tejeswarm@hotmail.com> wrote in
> This is my *first* attempt to submit a Postgres patch, please let me know if I missed any process or format of the
patch

Welcome! The format looks fine to me. It would be better if it had a
commit message that explains what the patch does. (in the format that
git format-patch emits.)

> The original bug reporting-email and the relevant discussion is here
...
> The crux of the fix is, in the current code, engine drops the buffer and then truncates the file, but a crash before
thetruncate and after the buffer-drop is causing the corruption. Patch reverses the order i.e. truncate the file and
dropthe buffer later.

BufferAlloc doesn't wait for the BM_IO_IN_PROGRESS for a valid buffer.

I'm not sure it's acceptable to remember all to-be-deleted buffers
while truncation.

+ /*START_CRIT_SECTION();*/

Is this a point of argument? It is not needed if we choose the
strategy (c) in [1], since the protocol is aiming to allow server to
continue running after truncation failure.

[1]: https://www.postgresql.org/message-id/20191207001232.klidxnm756wqxvwx%40alap3.anarazel.de

However, note that md truncates a "file" a non-atomic way. mdtruncate
truncates multiple files from the last segment toward the
beginning. If mdtruncate successfully truncated the first several
segments then failed, retaining all buffers triggers assertion failure
in mdwrite while buffer flush.

Some typos found:

+ * a backround task might flush them to the disk right after we

s/backround/background/

+ * saved list of buffers that were marked as BM_IO_IN_PRGRESS just
s/BM_IO_IN_PRGRESS/BM_IO_IN_PROGRESS/

+ * as BM_IO_IN_PROGRES. Though the buffers are marked for IO, they
s/BM_IO_IN_PROGRES/BM_IO_IN_PROGRESS/

+ * being dicarded).
s/dicarded/discarded/

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

pgsql-hackers by date:

From: Masahiko Sawada
Date: 24 March 2020, 12:07:18
Subject: Re: error context for vacuum to include block number

From: Amit Kapila
Date: 24 March 2020, 12:18:51
Subject: Re: error context for vacuum to include block number

Re: Corruption during WAL replay - Mailing list pgsql-hackers

Previous

Next