Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)
Date
Msg-id CAH2-Wzk=iABDv_q-jrC9PaeC2sRamvy=KyNTNArWNnW7DCWdEQ@mail.gmail.com
In response to Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)  (Stephen Frost <sfrost@snowman.net>)
Responses Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)  (Michael Banck <michael.banck@credativ.de>)
Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On Wed, Jan 6, 2021 at 12:30 PM Stephen Frost <sfrost@snowman.net> wrote:
> As already mentioned, it's also, at least today, far
> simpler to disable checksums than to enable them, which is something
> else to consider when thinking about what the default should be.

That is a valid concern. I just don't think that it's good enough on
its own, given the overwhelming downside of enabling checksums under
the WAL architecture that we have today.

> That the major cloud providers all have checksums enabled (at least by
> default, though I wonder if they would even let you turn them off..),
> even when we don't have them on by default, strikes me as pretty telling
> that this is something that we should have on by default.

Please provide supporting evidence. I know that EBS itself uses
checksums at the block device level, so I'm sure that RDS "uses
checksums" in some sense. But does RDS use --data-checksums during
initdb?

> Certainly there's a different risk profile between the two and there may
> be times when someone is fine with running without fsync, or fine
> running without checksums, but those are, in my view, exceptions made
> once you understand exactly what risk you're willing to accept, and not
> what the default or typical deployment should be.

Okay, I'll bite. Here is the important difference: Enabling checksums
doesn't actually make data corruption less likely, it just makes it
easier to detect. Disabling fsync, by contrast, will reliably produce
corruption before too long in almost any installation. It may
occasionally be appropriate to disable fsync in a very controlled
environment, but that is rare, and it is not much faster than
disabling synchronous commit in any case.
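
A minimal sketch of the detect-versus-prevent distinction, in Python.
It is an illustration only: zlib.crc32 stands in for the page checksum
(it is not the algorithm PostgreSQL uses), and write_page, verify_page
and flip_bit are hypothetical helper names, not PostgreSQL functions.

```python
import zlib

PAGE_SIZE = 8192  # PostgreSQL's default block size, used here for flavor


def write_page(payload: bytes) -> tuple[bytes, int]:
    """Pad payload to a full page and return (page, checksum)."""
    if len(payload) > PAGE_SIZE:
        raise ValueError("payload does not fit in one page")
    page = payload.ljust(PAGE_SIZE, b"\x00")
    # Assumption: CRC-32 is a stand-in for the real page checksum.
    return page, zlib.crc32(page)


def verify_page(page: bytes, stored_checksum: int) -> bool:
    """Return True if the page still matches its stored checksum."""
    return zlib.crc32(page) == stored_checksum


def flip_bit(page: bytes, byte_offset: int, bit: int) -> bytes:
    """Simulate storage-level corruption by flipping a single bit."""
    damaged = bytearray(page)
    damaged[byte_offset] ^= 1 << bit
    return bytes(damaged)


page, checksum = write_page(b"tuple data")
damaged = flip_bit(page, 100, 3)

intact_ok = verify_page(page, checksum)      # checksum matches
damaged_ok = verify_page(damaged, checksum)  # mismatch is reported

print(intact_ok, damaged_ok)
```

The checksum did nothing to stop the bit flip. It only lets a later
read notice that the page no longer matches what was written.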

We added page-level checksums in 9.3. Can you imagine a counterfactual
history in which Postgres had page checksums since the 1990s, but only
added the fsync feature in 9.3? Please answer this non-rhetorical
question.

-- 
Peter Geoghegan


