Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS - Mailing list pgsql-hackers

From	Craig Ringer
Subject	Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date	March 29, 2018 11:25:51
Msg-id	CAMsr+YEa4tv1UCBRQHzA1ycfdvryHFYJ1LhaJJNbjStO3=M9Hg@mail.gmail.com
In response to	Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses	Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS (Gasper Zejn <zejn@owca.info>)
List	pgsql-hackers

On 29 March 2018 at 13:06, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

> On Thu, Mar 29, 2018 at 6:00 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > The retries are the source of the problem; the first fsync() can return EIO,
> > and also *clears the error* causing a 2nd fsync (of the same data) to return
> > success.
>
> What I'm failing to grok here is how that error flag even matters,
> whether it's a single bit or a counter as described in that patch. If
> write back failed, *the page is still dirty*. So all future calls to
> fsync() need to try to flush it again, and (presumably) fail
> again (unless it happens to succeed this time around).

You'd think so. But it doesn't appear to work that way. You can see for yourself by mapping the device-mapper "error" target over part of a volume.

I wrote a test case here.

https://github.com/ringerc/scrapcode/blob/master/testcases/fsync-error-clear.c
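To make the retry hazard concrete, here is a minimal sketch. It is not the linked test case. The helper name write_and_sync_once and the policy of reporting the first fsync() failure without retrying are assumptions for illustration only. It shows a caller that never issues a second fsync() after a failure, since a retry may return success without the data having reached storage.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the linked test case): write buf to path
 * and call fsync() exactly once.
 *
 * Returns 0 on success, or a negative errno value on failure.
 *
 * It deliberately does NOT retry fsync(). If the kernel clears the error
 * state once it has been reported, a second fsync() can return 0 even
 * though the earlier write-back failed.
 */
int write_and_sync_once(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -errno;

    /* write() may be partial or interrupted, so loop until done. */
    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            int err = errno;
            close(fd);
            return -err;
        }
        p += n;
        left -= (size_t) n;
    }

    /* Single fsync(): the first error is the only reliable report. */
    if (fsync(fd) != 0) {
        int err = errno;
        close(fd);
        return -err;
    }

    if (close(fd) != 0)
        return -errno;
    return 0;
}
```

On a non-zero return the caller has to assume the written data may be gone and rewrite it from its own copy, rather than calling fsync() again and trusting a later success.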

I don't pretend the kernel behaviour is sane. And it's possible I've made an error in my analysis. But since I've observed this in the wild, and seen it in a test case, I strongly suspect that what I've described is just what's happening, brain-dead or no.

Presumably the kernel marks the page clean when it dispatches it to the I/O subsystem and doesn't dirty it again on I/O error? I haven't dug that deep on the kernel side. See the Stack Overflow post for details on what I found in kernel code analysis.

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
