Re: better page-level checksums - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: better page-level checksums
Date
Msg-id CAH2-WznHpcYHE=Rz2-KQu+QXaVAvnDU0R-qqeXVZRWnp=QHUbQ@mail.gmail.com
Whole thread Raw
In response to Re: better page-level checksums  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, Jun 14, 2022 at 7:21 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jun 14, 2022 at 9:56 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > Technically we don't already do that today, with the 16-bit checksums
> > that are stored in PageHeaderData.pd_checksum. But we do something
> > equivalent: low-level tools can still infer that checksums must not be
> > enabled on the page (really the cluster) indirectly in the event of a
> > 0 checksum. A 0 value can reasonably be interpreted as a page from a
> > cluster without checksums (barring page corruption). This is basically
> > reasonable because our implementation of checksums is guaranteed to
> > not generate 0 as a valid checksum value.
>
> I don't think that 'pg_checksums -d' zeroes the checksum values on the
> pages in the cluster.

Obviously there are limitations on when and how we can infer something
about the whole cluster based on one single page image -- it all
depends on the context. I'm only arguing that we ought to make this
kind of analysis as easy as we reasonably can. I just don't see any
downside to having a status bit per checksum or encryption algorithm
at the page level, and plenty of upside (especially if the event of
bugs).

This seems like the absolute bare minimum to me, and I'm genuinely
surprised that there is even a question about whether or not we should
do that much.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: better page-level checksums
Next
From: Amit Langote
Date:
Subject: Re: Replica Identity check of partition table on subscriber