Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)
Date
Msg-id 20210106203032.GR27507@tamriel.snowman.net
In response to Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers

Greetings,

* Peter Geoghegan (pg@bowt.ie) wrote:
> On Wed, Jan 6, 2021 at 12:03 PM Stephen Frost <sfrost@snowman.net> wrote:
> > Do you really believe it to be wrong?  Do we stop performing the correct
> > write calls in the correct order to the kernel with fsync being off?  If
> > the kernel actually handles all of our write calls correctly and we
> > cleanly shut down and the kernel cleanly shuts down and sync's the disks
> > before a reboot, will there be corruption from running with fsync off?
>
> This is a total straw man. Everyone understands the technical issues
> with fsync perfectly well, and everyone understands that everyone
> understands the issue, so spare me the "I'm just a humble country
> lawyer" style explanation.
>
> What you seem to be arguing is that the differences between disabling
> checksums and disabling fsync is basically quantitative, and so making
> a qualitative distinction between those two things is meaningless, and
> that it logically follows that disagreeing with you is essentially
> irresponsible. This is a tactic that would be an embarrassment to a
> high school debate team. It's below you.

I can agree that a qualitative distinction between them is useful, but
we're talking about a default here, not about whether we should even
have these options or capabilities, whether we should force them upon
everyone, or whether one is somehow better or worse than the other.  As
already mentioned, it's also, at least today, far simpler to disable
checksums than to enable them, which is something else to consider when
thinking about what the default should be.

That the major cloud providers all have checksums enabled (at least by
default, though I wonder if they would even let you turn them off..),
even when we don't have them on by default, strikes me as pretty telling
that this is something that we should have on by default.

Certainly there's a different risk profile between the two and there may
be times when someone is fine with running without fsync, or fine
running without checksums, but those are, in my view, exceptions made
once you understand exactly what risk you're willing to accept, and not
what the default or typical deployment should be.

Thanks,

Stephen
