Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date
Msg-id c60ac320-7390-80c4-d012-5b149d59aa40@2ndquadrant.com
In response to Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS  (Andres Freund <andres@anarazel.de>)
Responses Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers

On 04/09/2018 09:37 PM, Andres Freund wrote:
> 
> 
> On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos <ailiop@altatus.com> wrote:
> 
>> I honestly do not expect that keeping around the failed pages will
>> be an acceptable change for most kernels, and as such the
>> recommendation will probably be to coordinate in userspace for the
>> fsync().
> 
> Why is that required? You could very well just keep per-inode
> information around about fatal failures that occurred. Report errors
> until that bit is explicitly cleared. Yes, that keeps some memory
> around until unmount if nobody clears it. But it's orders of
> magnitude less, and results in usable semantics.
>

Isn't the expectation that when an fsync() call fails, the next one will
retry writing the pages in the hope that it succeeds?

Of course, it's also possible to do what you suggested, and simply mark
the inode as failed. In that case the next fsync() can't possibly retry
the writes (e.g. after freeing some space on a thin-provisioned system),
but we'd get a reliable failure mode.
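
To make the trade-off concrete, here is a toy model of the two behaviours
in Python. It is only a sketch and not how any real kernel implements
writeback: ToyInode, FsyncError, the "retry" / "sticky" policy names and
the device_ok flag are all made up for illustration, and it assumes the
"sticky" variant drops the failed pages and keeps only the error bit.

```python
from dataclasses import dataclass, field


class FsyncError(OSError):
    """Stands in for fsync() reporting a writeback failure in this toy model."""


@dataclass
class ToyInode:
    policy: str                                   # "retry" or "sticky" (hypothetical names)
    dirty: list = field(default_factory=list)     # pages waiting for writeback
    durable: list = field(default_factory=list)   # pages that reached storage
    failed: bool = False                          # per-inode error bit ("sticky" only)

    def write(self, page):
        self.dirty.append(page)

    def fsync(self, device_ok):
        if self.policy == "sticky" and self.failed:
            # The bit is set: keep reporting the failure until it is cleared.
            raise FsyncError("earlier writeback failure not yet cleared")
        if device_ok:
            self.durable.extend(self.dirty)
            self.dirty.clear()
            return
        if self.policy == "sticky":
            # Assumption: the pages are dropped and only the bit is kept.
            self.dirty.clear()
            self.failed = True
        # Under "retry" the dirty pages stay around for the next attempt.
        raise FsyncError("writeback failed")

    def clear_error(self):
        self.failed = False


def run(policy):
    """Write one page, then fsync twice: the device fails once, then recovers."""
    inode = ToyInode(policy)
    inode.write("page-1")
    outcomes = []
    for device_ok in (False, True):
        try:
            inode.fsync(device_ok)
            outcomes.append("ok")
        except FsyncError:
            outcomes.append("error")
    return outcomes, inode.durable
```

In this model run("retry") gets the page to storage on the second call,
while run("sticky") keeps failing until clear_error() is called and the
page never becomes durable. That is the trade-off above: no second chance
for the write, but the caller can't miss the failure.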

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

