Re: Corruption during WAL replay - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Corruption during WAL replay
Date
Msg-id 3193652.1648186725@sss.pgh.pa.us
In response to Re: Corruption during WAL replay  (Andres Freund <andres@anarazel.de>)
Responses Re: Corruption during WAL replay  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> Ah, and that's finally also the explanation why I couldn't reproduce the
> failure in a different directory, with an otherwise identically configured
> PG: The length of the path to the tablespace influences the size of the
> XLOG_TBLSPC_CREATE record.

Ooooohhh ... yeah, that could explain a lot of cross-animal variation.

> Not sure what to do here... I guess we can just change the value we overwrite
> the page with and hope to not hit this again? But that feels deeply deeply
> unsatisfying.

AFAICS, this strategy of whacking a predetermined chunk of the page with
a predetermined value is going to fail 1-out-of-64K times.  We have to
change the test so that it's guaranteed to produce an invalid checksum.
Inverting just the checksum field, without doing anything else, would
do that ... but that feels pretty unsatisfying too.
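
To illustrate the argument above, here is a minimal sketch, not PostgreSQL's
actual code. It assumes a 16-bit checksum stored at byte offset 8 of an 8192-byte
page and computed with that field treated as zero. toy_checksum() is a
hypothetical stand-in built on CRC-32, not the real page checksum algorithm, and
all function names are invented for the example. Overwriting a fixed chunk with a
fixed value leaves the stored checksum in place, so it can still match the new
contents by chance (roughly 1 in 2^16 for a well-mixed checksum), whereas
inverting the stored field can never match.

```python
import os
import struct
import zlib

PAGE_SIZE = 8192
CHECKSUM_OFFSET = 8  # assumption: 16-bit checksum field follows an 8-byte LSN


def toy_checksum(page: bytes) -> int:
    """Hypothetical 16-bit checksum, computed with the checksum field zeroed."""
    zeroed = page[:CHECKSUM_OFFSET] + b"\x00\x00" + page[CHECKSUM_OFFSET + 2:]
    crc = zlib.crc32(zeroed)
    return (crc ^ (crc >> 16)) & 0xFFFF


def stored_checksum(page: bytes) -> int:
    """Read the 16-bit value currently stored in the checksum field."""
    (value,) = struct.unpack_from("<H", page, CHECKSUM_OFFSET)
    return value


def set_checksum_field(page: bytes, value: int) -> bytes:
    """Return a copy of the page with the checksum field set to value."""
    return (page[:CHECKSUM_OFFSET] + struct.pack("<H", value)
            + page[CHECKSUM_OFFSET + 2:])


def stamp(page: bytes) -> bytes:
    """Return a copy of the page carrying a valid checksum."""
    return set_checksum_field(page, toy_checksum(page))


def verifies(page: bytes) -> bool:
    """True if the stored checksum matches the recomputed one."""
    return stored_checksum(page) == toy_checksum(page)


def overwrite_chunk(page: bytes, offset: int, filler: bytes) -> bytes:
    """Fixed-value corruption: may still verify if the checksums collide."""
    return page[:offset] + filler + page[offset + len(filler):]


def invert_checksum_field(page: bytes) -> bytes:
    """Flip every bit of the stored checksum; the page contents that feed
    the checksum are unchanged, so the stored value can no longer match."""
    return set_checksum_field(page, stored_checksum(page) ^ 0xFFFF)


if __name__ == "__main__":
    good = stamp(os.urandom(PAGE_SIZE))
    print("original verifies:", verifies(good))
    print("inverted verifies:", verifies(invert_checksum_field(good)))
```

The guarantee rests only on the fact that c XOR 0xFFFF differs from c for every
16-bit c, combined with the assumption that the field itself is excluded from
the computation, so it does not depend on which checksum function is used.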

            regards, tom lane


