Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) - Mailing list pgsql-hackers
From | Stephen Frost |
---|---|
Subject | Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) |
Date | |
Msg-id | 20210108000816.GU27507@tamriel.snowman.net |
In response to | Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) (Peter Geoghegan <pg@bowt.ie>) |
Responses | RE: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) |
List | pgsql-hackers |
Greetings,

* Peter Geoghegan (pg@bowt.ie) wrote:
> On Thu, Jan 7, 2021 at 1:14 PM Stephen Frost <sfrost@snowman.net> wrote:
> > Much of this line of discussion seems to be, incorrectly, focused on my
> > mere mention of viewing the use of fsync and checksums as mechanism for
> > addressing certain risks, but that doesn't seem to be a terribly
> > fruitful direction to be going in. I'm not suggesting that we should go
> > turn off fsync by default simply because we don't have checksums on by
> > default, which seems to be the implication.
>
> I admit that I saw red. This was a direct result of your bogus
> argument, which greatly overstated the case in favor of enabling
> checksums by default. I regret my role in that now, though. It would
> be good to debate the actual issue, but that isn't what I saw.
> Everyone knows the principles behind checksums and how they're useful
> -- it doesn't need to be a part of the discussion.

I hadn't intended to make an argument that enabling checksums was
equivalent to enabling or disabling fsync; I said it was 'akin', by
which I meant it was similar in character: as I said previously, a way
for PG to hedge against certain external-to-PG risks (though,
unfortunately, our checksums aren't able to actually mitigate any of
those risks but merely to detect them; there is certainly value in that
too).

I also now regret not being clearer as to what I meant with that
comment.

> I think that it should be possible to make a much better case in favor
> of enabling checksums by default. On further reflection I actually
> don't think that the real-world VACUUM overhead is anything like 15x,
> though the details are complex. I might be willing to help with this
> analysis, but since you only seem to want to discuss the question in a
> narrow way (e.g. "I agree that improving compression performance would
> be good but I don't see that as relevant to the question of what our
> defaults should be"), I have to wonder if it's worth the trouble.

What I was attempting to get at with that comment is that, while I
don't feel it's relevant, I wouldn't object to both being enabled by
default, and if those changes combined help to get others on board with
having checksums enabled by default, then such an approach would also
get my vote.

I also doubt that VACUUM performance would be impacted as heavily in
real-world workloads, but I again point out that VACUUM, in our default
configuration, is going to be run with the brakes on, since it's run by
autovacuum with a non-zero vacuum cost delay. While I've advocated for
having that cost delay reduced (or the cost limit increased) in the
past, I wouldn't support eliminating the delays entirely, as that would
then impact foreground activity, which is certainly where performance
is more important.

I appreciate that VACUUM run by an administrator directly doesn't have
the brakes on, but that is then much more likely to impact foreground
activity and is generally discouraged because of that; instead, it's
generally recommended to configure autovacuum to be more aggressive
while still having a delay.

Once you're past the point where you want delays to be introduced
during VACUUM runs, I'd certainly think it's gone past the point where
our standard defaults would be appropriate in a number of ways, and a
user could then consider whether they want to disable checksums and
accept the risk associated with doing so in favor of making VACUUM go
faster, or not.

Thanks,

Stephen
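A rough illustration of the throttling point made in the message above: under cost-based vacuum delay, autovacuum sleeps each time it has accumulated a budget of cost units, so its page rate has a ceiling set by the delay settings alone. The sketch below computes that ceiling. The function name and the simplified model (every page charged at the page-miss cost, sleep time dominating) are assumptions made for illustration, not anything taken from the thread; the sample values are the PostgreSQL 13 defaults (vacuum_cost_limit = 200, autovacuum_vacuum_cost_delay = 2ms, vacuum_cost_page_miss = 10).

```python
def throttled_page_rate(cost_limit, cost_delay_ms, cost_per_page):
    """Upper bound on pages/second for a cost-throttled vacuum.

    Simplified, hypothetical model: the worker processes pages until it
    has accumulated `cost_limit` cost units, then sleeps `cost_delay_ms`.
    Time spent doing the actual work is ignored, so this is a ceiling,
    not a prediction.
    """
    if cost_delay_ms <= 0:
        # A delay of zero disables cost-based throttling entirely.
        return float("inf")
    pages_per_round = cost_limit / cost_per_page
    rounds_per_second = 1000.0 / cost_delay_ms
    return pages_per_round * rounds_per_second


# PostgreSQL 13 defaults, assuming every page is read from disk (a "miss").
rate = throttled_page_rate(cost_limit=200, cost_delay_ms=2, cost_per_page=10)
mib_per_second = rate * 8192 / (1024 * 1024)  # default 8 kB block size
print(f"{rate:.0f} pages/s, about {mib_per_second:.1f} MiB/s")
```

With those defaults the ceiling is 10,000 pages per second, roughly 78 MiB/s at the default 8 kB block size. A real autovacuum run is slower than this, since the model ignores the time spent on the work itself and the extra cost charged for dirtying pages.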