Re: [HACKERS] Checksums by default? - Mailing list pgsql-hackers

From Ants Aasma
Subject Re: [HACKERS] Checksums by default?
Date
Msg-id CA+CSw_vc6vMajszzDMpLOHxG=2a16SD+pKt0Us1uqUmsWnOnsg@mail.gmail.com
In response to [HACKERS] Checksums by default?  (Magnus Hagander <magnus@hagander.net>)
List pgsql-hackers
On Wed, Jan 25, 2017 at 8:18 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Also, it's not as if there are no other ways of checking whether your
> disks are failing.  SMART, for example, is supposed to tell you about
> incipient hardware failures before PostgreSQL ever sees a bit flip.
> Surely an average user would love to get a heads-up that their
> hardware is failing even when that hardware is not being used to power
> PostgreSQL, yet many people don't bother to configure SMART (or
> similar proprietary systems provided by individual vendors).

You really can't rely on SMART to tell you about hardware failures. 1
in 4 drives fail completely with no SMART indication at all [1]. And for
the 1 in 1000 annual checksum failure rate, the other indicators, apart
from system restarts, showed only a weak correlation [2]. And that is
before counting filesystem and other OS bugs, which SMART knows nothing
about.
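
To put the 1 in 1000 figure in fleet terms, here is a rough
back-of-the-envelope sketch. It assumes the rate applies per drive per
year and that drives fail independently; both are simplifications made
for this illustration and are not claims taken from [2]. The function
name and the fleet sizes are made up for the example.

```python
# Rough sketch: chance that at least one drive in a fleet sees a
# checksum failure within a given period.
# Assumptions (illustrative only): the 1-in-1000 rate is per drive per
# year, it is constant, and drives are independent of each other.

CHECKSUM_FAILURE_RATE = 1 / 1000  # per drive per year, figure quoted above


def p_at_least_one(rate_per_drive_year: float, drives: int, years: float = 1.0) -> float:
    """Return P(at least one failure) = 1 - (1 - rate)^(drives * years)."""
    return 1.0 - (1.0 - rate_per_drive_year) ** (drives * years)


if __name__ == "__main__":
    for fleet in (1, 10, 100, 1000):
        p = p_at_least_one(CHECKSUM_FAILURE_RATE, fleet)
        print(f"{fleet:>5} drives: {p:.1%} chance of at least one failure per year")
```

Under these assumptions a single drive is unlikely to be hit in any
given year, but the chance grows quickly with fleet size, which is the
situation anyone supporting many installations is in.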

My view may be biased by mostly seeing the cases where things have
already gone wrong, but I recommend that support clients turn checksums
on unless write IO is known to be an issue. Especially because I know
that if it turns out to be a problem I can go in and quickly hack
together a tool to help them turn it off. I do agree that to change the
PostgreSQL default, at least some tool to turn checksums off online
should be included.

[1] https://www.backblaze.com/blog/what-smart-stats-indicate-hard-drive-failures/
[2] https://www.usenix.org/legacy/event/fast08/tech/full_papers/bairavasundaram/bairavasundaram.pdf

Regards,
Ants Aasma


