Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Msg-id 20180417212917.GB13097@momjian.us
In response to Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
List pgsql-hackers
On Mon, Apr  9, 2018 at 03:42:35PM +0200, Tomas Vondra wrote:
> On 04/09/2018 12:29 AM, Bruce Momjian wrote:
> > 
> > A crazy idea would be to have a daemon that checks the logs and
> > stops Postgres when something seems wrong.
> > 
> 
> That doesn't seem like a very practical way. It's better than nothing,
> of course, but I wonder how would that work with containers (where I
> think you may not have access to the kernel log at all). Also, I'm
> pretty sure the messages do change based on kernel version (and possibly
> filesystem) so parsing it reliably seems rather difficult. And we
> probably don't want to PANIC after I/O error on an unrelated device, so
> we'd need to understand which devices are related to PostgreSQL.

My more-considered crazy idea is to have a postgresql.conf setting like
archive_command that allows the administrator to specify a command that
will be run _after_ fsync but before the checkpoint is marked as
complete.  While we can have write flush errors before fsync and never
see the errors during fsync, we will not have write flush errors _after_
fsync that are associated with previous writes.

The script should check for I/O or space-exhaustion errors and return a
nonzero exit status if it finds any, in which case we can either stop, or
stop and run crash recovery.  We could have an exit of 1 do the former,
and an exit of 2 do the latter.
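
As a rough illustration only, here is a minimal Python sketch of what such
a script might look like.  Everything in it is an assumption: no such
postgresql.conf setting exists, the regular expressions are placeholders
(real kernel messages vary by kernel version and filesystem, as noted
above), and which kind of error maps to exit 1 versus exit 2 is my own
guess.  The device filter is there so that an error on an unrelated
device is ignored.

```python
import re

# Placeholder patterns (assumptions); real kernel log messages differ by
# kernel version and filesystem, so these would need tuning per system.
IO_ERROR = re.compile(r"I/O error", re.IGNORECASE)
NO_SPACE = re.compile(r"No space left on device", re.IGNORECASE)

EXIT_OK = 0             # checkpoint may be marked complete
EXIT_STOP = 1           # stop the server
EXIT_STOP_RECOVER = 2   # stop and run crash recovery


def classify(log_lines, devices):
    """Return an exit status for log lines that mention one of `devices`."""
    status = EXIT_OK
    for line in log_lines:
        if not any(dev in line for dev in devices):
            continue  # error on a device PostgreSQL does not use
        if IO_ERROR.search(line):
            return EXIT_STOP_RECOVER  # assumed mapping: I/O error -> 2
        if NO_SPACE.search(line):
            status = EXIT_STOP        # assumed mapping: out of space -> 1
    return status
```

A wrapper would feed classify() the kernel log lines written since the
previous checkpoint, plus the list of devices holding the data directory
and WAL, and pass the result to sys.exit().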

Also, if we are relying on WAL, we have to make sure WAL is actually
safe with fsync, and I am betting only the O_DIRECT methods actually
are safe:

    #wal_sync_method = fsync                # the default is the first option
                                            # supported by the operating system:
                                            #   open_datasync
                                     -->    #   fdatasync (default on Linux)
                                     -->    #   fsync
                                     -->    #   fsync_writethrough
                                            #   open_sync

I am betting the marked wal_sync_method methods are not safe since there
is time between the write and fsync.
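
To make the window concrete, here is a small Python sketch contrasting
the two styles.  It is illustrative only and is not PostgreSQL's code:
the function names are made up, and it uses O_DSYNC rather than O_DIRECT
because O_DIRECT needs aligned buffers.  It shows where the gap is; it
does not prove that either path is safe.

```python
import os


def write_then_fsync(path, data):
    """Two steps: data can sit in the page cache until fsync() runs."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(fd, data)  # may return before the data reaches storage
        # Window: the kernel may attempt writeback (and fail) here.
        os.fsync(fd)        # the caller hopes any such error shows up here
    finally:
        os.close(fd)


def write_synchronously(path, data):
    """One step: with O_DSYNC the write() itself waits for the data."""
    # getattr() falls back to a plain write on platforms without O_DSYNC.
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, data)  # an I/O error would be reported by this call
    finally:
        os.close(fd)
```

In the second form there is no separate later call for the error to be
reported to, which is why I suspect the open_* methods behave better.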

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

