Re: Corruption during WAL replay - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Corruption during WAL replay
Msg-id 20220325053445.4clfc7of3y5yvesy@alap3.anarazel.de
In response to Re: Corruption during WAL replay  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2022-03-25 01:23:00 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I do see that the LSN that ends up on the page is the same across a few runs
> > of the test on serinus. Which presumably differs between different
> > animals. Surprised that it's this predictable - but I guess the run is short
> > enough that there's no variation due to autovacuum, checkpoints etc.
> 
> Uh-huh.  I'm not surprised that it's repeatable on a given animal.
> What remains to be explained:
> 
> 1. Why'd it start failing now?  I'm guessing that ce95c5437 *was* the
> culprit after all, by slightly changing the amount of catalog data
> written during initdb, and thus moving the initial LSN.

Yep, verified that (see mail I just sent).


> 2. Why just these two animals?  If initial LSN is the critical thing,
> then the results of "locale -a" would affect it, so platform
> dependence is hardly surprising ... but I'd have thought that all
> the animals on that host would use the same initial set of
> collations.

I think it's the animal's name that makes the difference, due to the
tablespace path length thing. And while I was confused for a second by

petalura
pogona
serinus
dragonet

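failing despite their different name lengths, it still makes sense: we MAXALIGN
the start of records, so paths that differ by only a few bytes can still round
up to the same record start. Which explains why flaviventris, with its longer
name, didn't fail the same way.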
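
To make the alignment point concrete, here is a minimal sketch. It assumes
8-byte maximum alignment (the real value is platform-dependent) and uses its
own rounding helper rather than PostgreSQL's headers. The names
sketch_maxalign and sketch_aligned_name_len are hypothetical, and using the
bare animal name as the length is only an illustration: the real record
contains the full tablespace path plus record headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: 8-byte maximum alignment; the real value depends on the platform. */
#define SKETCH_MAXIMUM_ALIGNOF 8

/* Round len up to the next multiple of SKETCH_MAXIMUM_ALIGNOF. */
static size_t
sketch_maxalign(size_t len)
{
    size_t mask = (size_t) SKETCH_MAXIMUM_ALIGNOF - 1;

    return (len + mask) & ~mask;
}

/*
 * Hypothetical helper: aligned size of a payload that is as long as the
 * given name. Names whose lengths fall in the same 8-byte bucket give the
 * same result, so the following record would start at the same offset.
 */
static size_t
sketch_aligned_name_len(const char *name)
{
    assert(name != NULL);
    return sketch_maxalign(strlen(name));
}
```

Under these assumptions, lengths of 6, 7 and 8 bytes all round up to the same
boundary, while a 12-byte name lands on the next one.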


> As for a fix, would damaging more of the page help?  I guess
> it'd just move around the one-in-64K chance of failure.

As I wrote in the other email, I think spreading the changes out wider might
help. But it's still not great. However:

> Maybe we have to intentionally corrupt (e.g. invert) the
> checksum field specifically.

seems like it'd do the trick? Even a single bit change of the checksum ought
to do, as long as we don't set it to 0.
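
A rough sketch of what I mean, in C just to show the bit logic. It assumes a
16-bit checksum field. The function name sketch_corrupt_checksum is
hypothetical and this is not the code the test would actually use:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return a value that differs from the given 16-bit checksum in exactly one
 * bit and is never 0. Normally the lowest bit is flipped; if that would
 * produce 0 (i.e. the input was 1), the next bit is flipped instead.
 */
static uint16_t
sketch_corrupt_checksum(uint16_t checksum)
{
    uint16_t damaged = (uint16_t) (checksum ^ 0x0001);

    if (damaged == 0)
        damaged = (uint16_t) (checksum ^ 0x0002);

    /* A single bit flip always changes the value; 0 was ruled out above. */
    assert(damaged != checksum && damaged != 0);
    return damaged;
}
```

Since the stored value is guaranteed to differ from the original, the
mismatch no longer depends on what the rest of the page happens to hash to.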

Greetings,

Andres Freund


