Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)
Date	January 8, 2021 01:03:59
Msg-id	CAH2-Wzk2+M_=MuUGHJnWxCSfFxNt-3mqt02KTL92qLuqKtyxng@mail.gmail.com
In response to	Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) (Stephen Frost <sfrost@snowman.net>)
Responses	Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)
List	pgsql-hackers

On Thu, Jan 7, 2021 at 1:14 PM Stephen Frost <sfrost@snowman.net> wrote:
> I expected there'd be some disagreement on this, but I do continue to
> feel that it's sensible to enable checksums by default.  I also don't
> think there's anything particularly wrong with such a difference of
> opinion, though it likely means that we're going to continue on with the
> status quo- where, certainly, very many deployments enable it even
> though the upstream default is to have it disabled.

I agree with all that.

> This certainly
> isn't the only place that's done, though we've been working to improve
> that situation with things like trying to get rid of 'trust' being used
> in our default pg_hba.conf.

That seems like an easier case to make to me.

> Short answer is 'yes', as mentioned down-thread and having checksums was
> a pre-requisite to deploying PG in RDS (or so folks very involved in RDS
> have told me previously- and I'll also note that it was 9.3 that was
> first deployed as part of RDS).  I don't think there's any question that
> they're using --data-checksums and that it is, in fact, the actual
> original PG checksum code (or at least was at 9.3, though I've further
> heard comments that they actively try to minimize the delta between RDS
> and PG).

I accept that.

> Nope, the risk from not having fsync was clearly understood, and still
> is, to be a larger risk than not having checksums.  That doesn't mean
> there's no risk to not having checksums or that we simply shouldn't
> consider checksums to be worthwhile or that we shouldn't have them on by
> default.  I outlined them together in that they're both there to address
> the risk that "something doesn't go right", but, as I said previously
> and again above, the level of risk between the two isn't the same.  That
> doesn't mean we shouldn't consider that checksums *do* address a risk
> and consider enabling them by default- even with the performance impact
> that they have today.

Fair.

> Much of this line of discussion seems to be, incorrectly, focused on my
> mere mention of viewing the use of fsync and checksums as mechanism for
> addressing certain risks, but that doesn't seem to be a terribly
> fruitful direction to be going in.  I'm not suggesting that we should go
> turn off fsync by default simply because we don't have checksums on by
> default, which seems to be the implication.

I admit that I saw red. This was a direct result of your bogus
argument, which greatly overstated the case in favor of enabling
checksums by default. I regret my role in that now, though. It would
be good to debate the actual issue, but that isn't what I saw.
Everyone knows the principles behind checksums and how they're useful
-- it doesn't need to be a part of the discussion.

I think that it should be possible to make a much better case in favor
of enabling checksums by default. On further reflection I actually
don't think that the real-world VACUUM overhead is anything like 15x,
though the details are complex. I might be willing to help with this
analysis, but since you only seem to want to discuss the question in a
narrow way (e.g. "I agree that improving compression performance would
be good but I don't see that as relevant to the question of what our
defaults should be"), I have to wonder if it's worth the trouble.

-- 
Peter Geoghegan
