Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS - Mailing list pgsql-hackers

From: Bruce Momjian
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Msg-id: 20180408023337.GA21781@momjian.us
In response to: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
List: pgsql-hackers
On Sun, Apr  8, 2018 at 02:16:07PM +1200, Thomas Munro wrote:
> So, what can we actually do about this new Linux behaviour?
> 
> Idea 1:
> 
> * whenever you open a file, either tell the checkpointer so it can
> open it too (and wait for it to tell you that it has done so, because
> it's not safe to write() until then), or send it a copy of the file
> descriptor via IPC (since duplicated file descriptors share the same
> f_wb_err)
> 
> * if the checkpointer can't take any more file descriptors (how would
> that limit even work in the IPC case?), then it somehow needs to tell
> you that so that you know that you're responsible for fsyncing that
> file yourself, both on close (due to fd cache recycling) and also when
> the checkpointer tells you to
> 
> Maybe it could be made to work, but sheesh, that seems horrible.  Is
> there some simpler idea along these lines that could make sure that
> fsync() is only ever called on file descriptors that were opened
> before all unflushed writes, or file descriptors cloned from such file
> descriptors?
> 
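
For illustration, the "send it a copy of the file descriptor via IPC" variant of Idea 1 can be sketched as follows. This is a minimal single-process sketch in Python (3.9 or later, Unix only), not PostgreSQL code: the function name, the file name, and the "backend"/"checkpointer" socket roles are all hypothetical. It only shows that a descriptor received over a Unix socket refers to the same open file as the sender's descriptor, which is observable here through the shared file offset.

```python
import os
import socket
import tempfile


def pass_fd_demo():
    """Pass an open fd over a Unix socket; return the offset seen by the receiver."""
    # Hypothetical stand-ins for a backend and the checkpointer.
    backend_sock, checkpointer_sock = socket.socketpair(
        socket.AF_UNIX, socket.SOCK_STREAM
    )
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "relation_segment")  # hypothetical file name
        backend_fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            # Hand the descriptor over BEFORE the first write().
            socket.send_fds(backend_sock, [b"fd"], [backend_fd])
            _msg, fds, _flags, _addr = socket.recv_fds(checkpointer_sock, 16, 1)
            checkpointer_fd = fds[0]
            try:
                os.write(backend_fd, b"x" * 8192)
                # The received fd shares the sender's open file, so it sees
                # the offset advanced by the write above.
                shared_offset = os.lseek(checkpointer_fd, 0, os.SEEK_CUR)
                # The receiver can now fsync through its own descriptor.
                os.fsync(checkpointer_fd)
                return shared_offset
            finally:
                os.close(checkpointer_fd)
        finally:
            os.close(backend_fd)
            backend_sock.close()
            checkpointer_sock.close()
```

The sketch says nothing about the hard parts raised above, such as descriptor limits in the receiving process or what happens when the hand-off is refused.
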
> Idea 2:
> 
> Give up, complain that this implementation is defective and
> unworkable, both on POSIX-compliance grounds and on POLA grounds, and
> campaign to get it fixed more fundamentally (actual details left to
> the experts, no point in speculating here, but we've seen a few
> approaches that work on other operating systems including keeping
> buffers dirty and marking the whole filesystem broken/read-only).
> 
> Idea 3:
> 
> Give up on buffered IO and develop an O_SYNC | O_DIRECT based system ASAP.
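
As a rough idea of what an O_SYNC | O_DIRECT write looks like at the system-call level, here is a minimal Python sketch. The 4096-byte block size, the function name, and the fallback to plain O_SYNC when the filesystem rejects O_DIRECT with EINVAL are assumptions made for the example, not a proposal for how PostgreSQL would do it.

```python
import errno
import mmap
import os

BLOCK_SIZE = 4096  # assumed alignment; real devices and filesystems may differ


def write_block_durably(path, payload):
    """Write one zero-padded block with O_SYNC, using O_DIRECT when accepted.

    Returns (bytes_written, used_direct).
    """
    if len(payload) > BLOCK_SIZE:
        raise ValueError("payload larger than one block")
    base_flags = os.O_WRONLY | os.O_CREAT | os.O_SYNC
    direct = getattr(os, "O_DIRECT", 0)  # not defined on every platform
    # An anonymous mmap is page-aligned and zero-filled, which satisfies the
    # buffer alignment that O_DIRECT typically requires.
    with mmap.mmap(-1, BLOCK_SIZE) as buf:
        buf[: len(payload)] = payload
        for extra in (direct, 0):
            try:
                fd = os.open(path, base_flags | extra, 0o600)
            except OSError as exc:
                if extra and exc.errno == errno.EINVAL:
                    continue  # filesystem refuses O_DIRECT; retry without it
                raise
            try:
                written = os.write(fd, buf)
                return written, bool(extra)
            except OSError as exc:
                if extra and exc.errno == errno.EINVAL:
                    continue  # alignment rejected; retry without O_DIRECT
                raise
            finally:
                os.close(fd)
    raise RuntimeError("unreachable: the final attempt returns or raises")
```

With O_SYNC the write() call itself reports the I/O failure, so there is no later fsync() whose error could be missed on a different descriptor.
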

Idea 4 would be for people to assume their database is corrupt if their
server logs report any I/O error on the file systems Postgres uses.
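
A crude version of that check could be scripted. The sketch below is Python, and the patterns are illustrative assumptions about what kernel and Postgres log lines tend to contain ("Input/output error" is the usual strerror text for EIO); they are not an exhaustive or authoritative list, and the function name is hypothetical.

```python
import re

# Illustrative patterns only; adjust to what your kernel and server actually log.
SUSPECT_PATTERNS = [
    re.compile(r"I/O error", re.IGNORECASE),
    re.compile(r"Input/output error"),
    re.compile(r"lost (?:async|sync) page write"),
    re.compile(r"could not fsync file"),
]


def find_io_errors(log_lines):
    """Return the log lines that match any of the suspect patterns."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if any(pattern.search(line) for pattern in SUSPECT_PATTERNS)
    ]
```

Any non-empty result would, under Idea 4, be treated as grounds to suspect the data files on that file system.
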

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

