Re: Corruption during WAL replay - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Corruption during WAL replay
Msg-id 3237467.1648216174@sss.pgh.pa.us
In response to Re: Corruption during WAL replay  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers

Andres Freund <andres@anarazel.de> writes:
> On 2022-03-25 01:38:45 -0400, Tom Lane wrote:
>> AFAICS, this strategy of whacking a predetermined chunk of the page with
>> a predetermined value is going to fail 1-out-of-64K times.

> Yea. I suspect that the way the modifications and checksumming are done makes
> the chance actually higher than 1/64k. But even if it actually is 1/64k, it's
> not great to wait for (#animals * #catalog-changes) to approach a decent
> percentage of 1/64k.

Exactly.

> I was curious whether there have been similar issues in the past. Querying
> the buildfarm logs suggests not, at least not in the pg_checksums test.

That test has only been there since 2018 (b34e84f16).  We've probably
accumulated a couple hundred initial-catalog-contents changes since
then, so maybe this failure arrived right on schedule :-(.

> We really ought to find a way to get to wider checksums :/

That'll just reduce the probability of failure, not eliminate it.
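
For illustration only, the arithmetic behind this exchange can be sketched
as below. It assumes an idealized model that is not stated in the thread:
each distinct initial-catalog state is an independent trial, and a
deliberately corrupted page still passes an n-bit checksum with probability
2^-n. As noted above, the real odds may well be worse than that. The
function name and the figure of 200 trials (standing in for "a couple
hundred") are hypothetical.

```python
import math


def prob_at_least_one_failure(trials: int, checksum_bits: int = 16) -> float:
    """Chance that at least one of `trials` independent corruption tests
    goes undetected, assuming each one slips past the checksum with
    probability 2**-checksum_bits (an idealized assumption)."""
    p_single = 2.0 ** -checksum_bits
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(trials * math.log1p(-p_single))


if __name__ == "__main__":
    # "A couple hundred" catalog changes, taken here as 200 (an assumption).
    for bits in (16, 32, 64):
        p = prob_at_least_one_failure(200, bits)
        print(f"{bits}-bit checksum, 200 trials: {p:.3e}")
```

Under this model a wider checksum shrinks the number but never takes it to
zero, which is the point made above.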

            regards, tom lane
