Plug-pull testing worked, diskchecker.pl failed - Mailing list pgsql-general

From Chris Angelico
Subject Plug-pull testing worked, diskchecker.pl failed
Date
Msg-id CAPTjJmrDuEjvBo00XvLMLOQ5vdXfFo33K1Gmvzvj0KiiahnjEQ@mail.gmail.com
Responses Re: Plug-pull testing worked, diskchecker.pl failed  (Jeff Janes <jeff.janes@gmail.com>)
Re: Plug-pull testing worked, diskchecker.pl failed  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-general
After reading the comments last week about SSDs, I did some testing of
the ones we have at work. Each of my test boxes (three with SSDs, one
with an HDD) was subjected to multiple stand-alone plug-pull tests, using
pgbench to provide load. So far, there have been no instances of
PostgreSQL data corruption, but diskchecker.pl reported huge numbers
of errors.

What exactly does this mean? Is Postgres doing something that
diskchecker isn't, and is thus safe? Could data corruption occur but
I've just never pulled the power out at the precise microsecond when
it would cause problems? Or is it that we would lose entire
transactions, but never experience corruption that the postmaster
can't repair?

Interestingly, disabling write caching with 'hdparm -W 0 /dev/sda' (as
per the LiveJournal blog[1]) reduced the SSD's error rates without
eliminating failures entirely, while on the HDD there were no
problems at all with write caching off.
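
For anyone wanting to reason about what "errors" means here, this is a
minimal Python sketch of the general kind of check involved: write a
block, call fsync, record the block as acknowledged only once fsync
returns, and after the power cut compare what is on disk with what was
acknowledged. It is not diskchecker.pl's actual code; the names
durable_write and verify are hypothetical, and it assumes a Unix-like
system where os.pwrite is available.

```python
import os


def durable_write(path, offset, payload):
    """Write payload at offset; return only after fsync reports success."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.pwrite(fd, payload, offset)
        # A drive with a volatile write cache may report success here
        # before the data has reached stable media.
        os.fsync(fd)
    finally:
        os.close(fd)


def verify(path, acknowledged):
    """Return the offsets whose on-disk bytes differ from what was acknowledged.

    `acknowledged` maps offset -> payload for every write whose fsync returned.
    """
    bad = []
    with open(path, "rb") as f:
        for offset, payload in sorted(acknowledged.items()):
            f.seek(offset)
            if f.read(len(payload)) != payload:
                bad.append(offset)
    return bad
```

In a real plug-pull test the record of acknowledged writes has to be
kept somewhere that survives the power cut (for example on a second
machine), and verify is run after the box comes back up. Any offset it
returns is a write the drive claimed was durable but lost.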

ChrisA

