Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS - Mailing list pgsql-hackers

From	Bruce Momjian
Subject	Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date	April 24, 2018 03:09:23
Msg-id	20180424000923.GC12787@momjian.us
In response to	Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS (Andres Freund <andres@anarazel.de>)
Responses	Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
List	pgsql-hackers

On Mon, Apr 23, 2018 at 01:14:48PM -0700, Andres Freund wrote:
> Hi,
> 
> On 2018-03-28 10:23:46 +0800, Craig Ringer wrote:
> > TL;DR: Pg should PANIC on fsync() EIO return. Retrying fsync() is not OK at
> > least on Linux. When fsync() returns success it means "all writes since the
> > last fsync have hit disk" but we assume it means "all writes since the last
> > SUCCESSFUL fsync have hit disk".
> 
> > But then we retried the checkpoint, which retried the fsync(). The retry
> > succeeded, because the prior fsync() *cleared the AS_EIO bad page flag*.
> 
> Random other thing we should look at: Some filesystems (nfs yes, xfs
> ext4 no) flush writes at close(2). We check close() return code, just
> log it... So close() counts as an fsync for such filesystems().

Well, that's interesting.  You might remember that NFS does not reserve
space for writes the way local file systems such as ext4 and xfs do.
For that reason, we might be able to catch the out-of-space error on
close() and exit sooner for NFS.
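
To make the idea concrete, here is a minimal sketch in Python (chosen
for brevity; its os.fsync() and os.close() wrap the system calls
directly). It is not PostgreSQL code. The names write_durably and
DurabilityError are hypothetical, and it assumes the caller wants any
error from fsync() or close() treated as "data may be lost" rather than
logged and retried.

```python
import os


class DurabilityError(Exception):
    """Hypothetical error: the data may not have reached stable storage."""


def write_durably(path: str, data: bytes) -> None:
    """Write data to path, treating fsync() and close() errors as fatal."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)
    except OSError as exc:
        # Do not retry fsync(): as discussed upthread, a later fsync() can
        # report success even though the earlier dirty pages were dropped.
        try:
            os.close(fd)
        except OSError:
            pass
        raise DurabilityError(f"write/fsync failed for {path}") from exc

    try:
        # On filesystems that flush at close(2), such as NFS per the
        # discussion above, this is where an out-of-space error can surface.
        os.close(fd)
    except OSError as exc:
        raise DurabilityError(f"close failed for {path}") from exc
```

A caller would let DurabilityError propagate and stop, instead of
logging it and carrying on with a later retry.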

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
