Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS - Mailing list pgsql-hackers

From Christophe Pettus
Subject Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date
Msg-id 17132BB5-3EDC-46BB-B485-4E0685B0C619@thebuild.com
In response to Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS  (Craig Ringer <craig@2ndquadrant.com>)
Responses Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS  (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-hackers
> On Apr 7, 2018, at 20:27, Craig Ringer <craig@2ndQuadrant.com> wrote:
>
> Right now I think we're at option (4): If you see anything that smells like a write error in your kernel logs,
> hard-kill postgres with -m immediate (do NOT let it do a shutdown checkpoint). If it did a checkpoint since the logs,
> fake up a backup label to force redo to start from the last checkpoint before the error. Otherwise, it's safe to just
> let it start up again and do redo again.

Before we spiral down into despair and excessive alcohol consumption, this is basically the same situation as a
checksum failure or some other kind of uncorrected media-level error.  The bad part is that we have to find out from the
kernel logs rather than from PostgreSQL directly.  But this does not strike me as otherwise significantly different
from, say, an infrequently-accessed disk block reporting an uncorrectable error when we finally get around to reading
it.
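
As a rough illustration of "finding out from the kernel logs", here is a minimal Python sketch that filters log lines for anything that smells like a write error. The two patterns and the function name `suspicious_lines` are assumptions for illustration only; actual kernel message wording varies by kernel version, filesystem, and driver, so treat this as a starting point rather than a reliable detector.

```python
import re

# Assumed patterns, not an exhaustive or authoritative list: kernel message
# wording differs across kernel versions, filesystems, and storage drivers.
WRITE_ERROR_PATTERNS = [
    re.compile(r"I/O error", re.IGNORECASE),
    re.compile(r"lost (async|sync) page write", re.IGNORECASE),
]


def suspicious_lines(log_lines):
    """Return the log lines that match any of the assumed write-error patterns."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if any(pattern.search(line) for pattern in WRITE_ERROR_PATTERNS)
    ]
```

The function takes any iterable of strings, so it can be fed an open log file or the captured output of a log-reading command. A match only means a human should look at the storage stack; it says nothing about which PostgreSQL files, if any, were affected.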

--
-- Christophe Pettus
   xof@thebuild.com


