On Mon, Dec 12, 2022 at 12:40 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Sun, Dec 11, 2022 at 09:18:42PM +0100, Magnus Hagander wrote:
> > It would be less of a concern yes, but I think it still would be a
> > concern. If you have a large amount of corruption you could quickly
> > get to millions of rows to keep track of, which would definitely be
> > a problem in shared memory as well, wouldn't it?
> Yes. I have discussed this item with Bertrand off-list and I share the
> same concern. This would lead to a lot of extra workload on a large
> seqscan of a corrupted relation when the stats are written (shutdown
> delay), while bloating shared memory with potentially millions of
> items, even if variable lists are handled through a dshash and DSM.
> > But perhaps we could keep a list of "the last 100 checksum failures"
> > or something like that?
> Applying a threshold is one solution. Now, a second thing I have seen
> in the past is that some disk partitions were busted but not others,
> and the current database-level counters are not enough to make a
> difference when it comes to spotting patterns in this area. A list of
> the last N failures may be able to show some pattern, but that would
> be like analyzing things with a lot of noise, without a clear
> conclusion.
> Anyway, the workload caused by the threshold number had better be
> measured before being decided on (a large set of relation files with a
> full range of corrupted blocks, much better if these are in the OS
> cache when scanned), which does not change the need for a benchmark.
> What about just adding a counter tracking the number of checksum
> failures for relfilenodes in a new structure related to them (note
> that I did not write PgStat_StatTabEntry)?
>
> If we do that, it is then possible to cross-check the failures with
> tablespaces, which would point to disk areas that are more sensitive
> to corruption.
If that's the concern, then perhaps the level we should be tracking things at is the tablespace? We don't have any stats per tablespace today, I believe, but that doesn't mean we couldn't create them.