Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) - Mailing list pgsql-hackers

From Andres Freund
Subject Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)
Date
Msg-id 37c1f3d9-eaa1-4cad-b87e-e811e3b07ef3@www.fastmail.com
In response to Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)  (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)  (David Steele <david@pgmasters.net>)
List pgsql-hackers
Hi,

On Fri, Jan 8, 2021, at 01:53, Laurenz Albe wrote:
> On Thu, 2021-01-07 at 16:14 -0500, Stephen Frost wrote:
> > I expected there'd be some disagreement on this, but I do continue to
> > feel that it's sensible to enable checksums by default.
> 
> +1

I don't disagree with this in principle, but if you want that, you need to work on making checksum overhead far smaller.
That's doable. Afterwards it makes sense to have this discussion.
 

> I think the problem here (apart from the original line of argumentation)
> is that there are two kinds of PostgreSQL installations:
> 
> - installations done on dubious hardware with minimal tuning
>   (the "cheap crowd")
> 
> - installations done on good hardware, where people make an effort to
>   properly configure the database (the "serious crowd")
> 
> I am aware that this is an oversimplification for the sake of the argument.
> 
> The voices against checksums on by default are probably thinking of
> the serious crowd.
> 
> If checksums were enabled by default, the cheap crowd would benefit
> from the early warnings that something has gone wrong.
> 
> The serious crowd are more likely to choose a non-default setting
> to avoid paying the price for a feature that they don't need.

I don't really buy this argument. That way we're going to have an ever-growing set of things that need to be tuned to
have a database that's usable in an even halfway busy setup. That's unavoidable in some cases, but it's a significant
cost across use cases.
 

Increasing the overhead in the default config from one version to the next isn't great - it makes people more hesitant
to upgrade. It's also not a cost you're going to find all that quickly, and it's a cost that's really hard to pin down.
 


Andres



