Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date
Msg-id CAMsr+YGHyJ61J6NithMFum63N_BVWAfWUhSEONuu6C_awiSvQg@mail.gmail.com
Whole thread Raw
In response to Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS  (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-hackers
On 10 April 2018 at 20:15, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 10 April 2018 at 14:10, Michael Paquier <michael@paquier.xyz> wrote:
>
>> Well, I think that there is place for improving reporting of failure
>> in file_utils.c for frontends, or at worst have an exit() for any kind
>> of critical failures equivalent to a PANIC.
>
> Yup.
>
> In the mean time, speaking of PANIC, here's the first cut patch to
> make Pg panic on fsync() failures. I need to do some closer review and
> testing, but it's presented here for anyone interested.
>
> I intentionally left some failures as ERROR not PANIC, where the
> entire operation is done as a unit, and an ERROR will cause us to
> retry the whole thing.
>
> For example, when we fsync() a temp file before we move it into place,
> there's no point panicing on failure, because we'll discard the temp
> file on ERROR and retry the whole thing.
>
> I've verified that it works as expected with some modifications to the
> test tool I've been using (pushed).
>
> The main downside is that if we panic in redo, we don't try again. We
> throw our toys and shut down. But arguably if we get the same I/O
> error again in redo, that's the right thing to do anyway, and quite
> likely safer than continuing to ERROR on checkpoints indefinitely.
>
> Patch attached.
>
> To be clear, this patch only deals with the issue of us retrying
> fsyncs when it turns out to be unsafe. This does NOT address any of
> the issues where we won't find out about writeback errors at all.

Thinking about this some more, it'll definitely need a GUC to force it
to continue despite a potential hazard. Otherwise we go backwards from
the status quo if we're in a position where uptime is vital and
correctness problems can be tolerated or repaired later. Kind of like
zero_damaged_pages, we'll need some sort of
continue_after_fsync_errors .

Without that, we'll panic once, enter redo, and if the problem
persists we'll panic in redo and exit the startup process. That's not
going to help users.

I'll amend the patch accordingly as time permits.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-hackers by date:

Previous
From: Yuriy Zhuravlev
Date:
Subject: Re: Setting rpath on llvmjit.so?
Next
From: Amit Langote
Date:
Subject: Re: Boolean partitions syntax