Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) |
Date | |
Msg-id | CAH2-Wzk2+M_=MuUGHJnWxCSfFxNt-3mqt02KTL92qLuqKtyxng@mail.gmail.com |
In response to | Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) (Stephen Frost <sfrost@snowman.net>) |
Responses |
Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)
|
List | pgsql-hackers |

On Thu, Jan 7, 2021 at 1:14 PM Stephen Frost <sfrost@snowman.net> wrote:
> I expected there'd be some disagreement on this, but I do continue to
> feel that it's sensible to enable checksums by default. I also don't
> think there's anything particularly wrong with such a difference of
> opinion, though it likely means that we're going to continue on with the
> status quo- where, certainly, very many deployments enable it even
> though the upstream default is to have it disabled.

I agree with all that.

> This certainly
> isn't the only place that's done, though we've been working to improve
> that situation with things like trying to get rid of 'trust' being used
> in our default pg_hba.conf.

That seems like an easier case to make to me.

> Short answer is 'yes', as mentioned down-thread and having checksums was
> a pre-requisite to deploying PG in RDS (or so folks very involved in RDS
> have told me previously- and I'll also note that it was 9.3 that was
> first deployed as part of RDS). I don't think there's any question that
> they're using --data-checksums and that it is, in fact, the actual
> original PG checksum code (or at least was at 9.3, though I've further
> heard comments that they actively try to minimize the delta between RDS
> and PG).

I accept that.

> Nope, the risk from not having fsync was clearly understood, and still
> is, to be a larger risk than not having checksums. That doesn't mean
> there's no risk to not having checksums or that we simply shouldn't
> consider checksums to be worthwhile or that we shouldn't have them on by
> default. I outlined them together in that they're both there to address
> the risk that "something doesn't go right", but, as I said previously
> and again above, the level of risk between the two isn't the same. That
> doesn't mean we shouldn't consider that checksums *do* address a risk
> and consider enabling them by default- even with the performance impact
> that they have today.

Fair.

> Much of this line of discussion seems to be, incorrectly, focused on my
> mere mention of viewing the use of fsync and checksums as mechanism for
> addressing certain risks, but that doesn't seem to be a terribly
> fruitful direction to be going in. I'm not suggesting that we should go
> turn off fsync by default simply because we don't have checksums on by
> default, which seems to be the implication.

I admit that I saw red. This was a direct result of your bogus
argument, which greatly overstated the case in favor of enabling
checksums by default. I regret my role in that now, though. It would
be good to debate the actual issue, but that isn't what I saw.
Everyone knows the principles behind checksums and how they're useful
-- it doesn't need to be a part of the discussion.

I think that it should be possible to make a much better case in favor
of enabling checksums by default. On further reflection I actually
don't think that the real-world VACUUM overhead is anything like 15x,
though the details are complex. I might be willing to help with this
analysis, but since you only seem to want to discuss the question in a
narrow way (e.g. "I agree that improving compression performance would
be good but I don't see that as relevant to the question of what our
defaults should be"), I have to wonder if it's worth the trouble.

--
Peter Geoghegan