Re: Enable data checksums by default - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Enable data checksums by default
Msg-id CABUevEzpZO7_fu9ihPZH2wQykSBPOne0oAC9EqZLLB_u=xwEnw@mail.gmail.com
In response to Re: Enable data checksums by default  (Christoph Berg <myon@debian.org>)
List pgsql-hackers


On Mon, Apr 1, 2019 at 10:17 AM Christoph Berg <myon@debian.org> wrote:
> Re: Tomas Vondra 2019-03-30 <20190330192543.GH4719@development>
> > I have not investigated the exact reasons, but my hypothesis is that
> > it's about the amount of WAL generated during the initial CREATE INDEX
> > (because it probably ends up setting the hint bits), which puts
> > additional pressure on the storage.
> >
> > Unfortunately, this additional cost is unlikely to go away :-(
>
> If WAL volume is a problem, would wal_compression help?
>
> > Now, maybe we want to enable checksums by default anyway, but we should
> > not pretend the only cost related to checksums is CPU usage.
>
> Thanks for doing these tests. The point I'm trying to make is, why do
> we run without data checksums by default? For example, we do checksum
> the WAL all the time, and there's not even an option to disable it,
> even if that might make things faster. Why don't we enable data
> checksums by default as well?

I think one of the often-overlooked original reasons was that with data checksums enabled we need to WAL-log hint bits, same as when wal_log_hints is set.

Of course, looking at it today, you have to do that in order to use pg_rewind as well, so a lot of people who want to run any form of HA setup will have that turned on anyway. I think that has turned out to be a much weaker reason than it was originally thought to be.
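
To spell that argument out, here is a minimal sketch in Python. The helper names are hypothetical and this is not PostgreSQL code. It only encodes the documented pg_rewind requirement that the target cluster has either data checksums or wal_log_hints enabled (with full_page_writes on), and it deliberately ignores the other costs of checksums, such as CPU.

```python
# Illustrative sketch only: these helpers are hypothetical and are not part
# of PostgreSQL. They encode the rule from the pg_rewind documentation.

def hint_bits_wal_logged(data_checksums: bool, wal_log_hints: bool) -> bool:
    """Hint-bit-only page changes are WAL-logged if either setting is on."""
    return data_checksums or wal_log_hints


def pg_rewind_usable(data_checksums: bool, wal_log_hints: bool,
                     full_page_writes: bool = True) -> bool:
    """pg_rewind needs hint bits WAL-logged on the target, plus full_page_writes."""
    return full_page_writes and hint_bits_wal_logged(data_checksums, wal_log_hints)


def checksums_add_hint_logging(wal_log_hints: bool) -> bool:
    """True if enabling checksums would newly introduce hint-bit WAL logging."""
    return not wal_log_hints
```

For a cluster that already sets wal_log_hints = on so that pg_rewind works, checksums_add_hint_logging(True) is False: the hint-bit logging cost is already being paid, which is why this looks like a weaker objection now.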
