Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)
Date	January 6, 2021 20:56:16
Msg-id	CAH2-Wzk=iABDv_q-jrC9PaeC2sRamvy=KyNTNArWNnW7DCWdEQ@mail.gmail.com
In response to	Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) (Stephen Frost <sfrost@snowman.net>)
Responses	Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)
List	pgsql-hackers

On Wed, Jan 6, 2021 at 12:30 PM Stephen Frost <sfrost@snowman.net> wrote:
> As already mentioned, it's also, at least today, far
> simpler to disable checksums than to enable them, which is something
> else to consider when thinking about what the default should be.

That is a valid concern. I just don't think that it's good enough on
its own, considering the overwhelming downside of enabling checksums
under the WAL architecture that we have today.

> That the major cloud providers all have checksums enabled (at least by
> default, though I wonder if they would even let you turn them off..),
> even when we don't have them on by default, strikes me as pretty telling
> that this is something that we should have on by default.

Please provide supporting evidence. I know that EBS itself uses
checksums at the block device level, so I'm sure that RDS "uses
checksums" in some sense. But does RDS use --data-checksums during
initdb?

> Certainly there's a different risk profile between the two and there may
> be times when someone is fine with running without fsync, or fine
> running without checksums, but those are, in my view, exceptions made
> once you understand exactly what risk you're willing to accept, and not
> what the default or typical deployment should be.

Okay, I'll bite. Here is the important difference: Enabling checksums
doesn't actually make data corruption less likely, it just makes it
easier to detect. Whereas disabling fsync will reliably produce
corruption before too long in almost any installation. It may
occasionally be appropriate to disable fsync in a very controlled
environment, but it's rare, and not much faster than disabling
synchronous commits in any case. It barely ever happens.

We added page-level checksums in 9.3. Can you imagine a counterfactual
history in which Postgres had page checksums since the 1990s, but only
added the fsync feature in 9.3? Please answer this non-rhetorical
question.

-- 
Peter Geoghegan
