Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help) - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)
Msg-id 20210106180159.GM27507@tamriel.snowman.net
In response to Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)  (Andres Freund <andres@anarazel.de>)
Responses Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2021-01-06 12:02:40 -0500, Stephen Frost wrote:
> > * Andres Freund (andres@anarazel.de) wrote:
> > > On 2021-01-04 19:11:43 +0100, Michael Banck wrote:
> > > > On Saturday, 2021-01-02 at 10:47 -0500, Stephen Frost wrote:
> > > > > I agree with this, but I'd also like to propose, again, as has been
> > > > > discussed a few times, making it the default too.
> > >
> > > FWIW, I am quite doubtful we're there performance-wise. Besides the WAL
> > > logging overhead, the copy we do via PageSetChecksumCopy() shows up
> > > quite significantly in profiles here. Together with the checksums
> > > computation that's *halving* write throughput on fast drives in my aio
> > > branch.
> >
> > Our defaults are not going to win any performance trophies and so I
> > don't see the value in stressing over it here.
>
> Meh^3. There's a difference between defaults that are about resource
> usage (e.g. shared_buffers) and defaults that aren't.

fsync isn't about resource usage.

> > > > This looks much better from the WAL size perspective, there's now almost
> > > > no additional WAL. However, that is because pgbench doesn't do TOAST, so
> > > > in a real-world example it might still be quite a bit larger. Also, the vacuum
> > > > runtime is still 15x longer.
> > >
> > > That's obviously an issue.
> >
> > It'd certainly be nice to figure out a way to improve the VACUUM run but
> > I don't think the impact on the time to run VACUUM is really a good
> > reason to not move forward with changing the default.
>
> Vacuum performance is one of *THE* major complaints about
> postgres. Making it run slower by a lot obviously exacerbates that
> problem significantly. I think it'd be prohibitively expensive if it
> were 1.5x, not to even speak of 15x.

We already make vacuum, when run by autovacuum, relatively slow,
quite intentionally.  If someone's having trouble with vacuum run times
they're going to be adjusting the configuration anyway.

> > imv, enabling page checksums is akin to having fsync enabled by default.
> > Does it impact performance?  Yes, surely quite a lot, but it's also the
> > safe and sane choice when it comes to defaults.
>
> Oh for crying out loud.

Not sure what you're hoping to gain from such comments, but it doesn't
do anything to change my opinion.

Thanks,

Stephen
