Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS - Mailing list pgsql-hackers

From: Craig Ringer
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date: March 29, 2018 16:15:10
Msg-id: CAMsr+YH8JP-UdsGt0dLMcDRx6WQ78BZA7kMgimu8+ZuB_uzyFQ@mail.gmail.com
In response to: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-hackers

On 29 March 2018 at 20:07, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

> On Thu, Mar 29, 2018 at 6:58 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> > On 28 March 2018 at 11:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>
> >> Craig Ringer <craig@2ndquadrant.com> writes:
> >> > TL;DR: Pg should PANIC on fsync() EIO return.
> >>
> >> Surely you jest.
> >
> > No. I'm quite serious. Worse, we quite possibly have to do it for ENOSPC as
> > well to avoid similar lost-page-write issues.

> I found your discussion with kernel hacker Jeff Layton at
> https://lwn.net/Articles/718734/ in which he said: "The stackoverflow
> writeup seems to want a scheme where pages stay dirty after a
> writeback failure so that we can try to fsync them again. Note that
> that has never been the case in Linux after hard writeback failures,
> AFAIK, so programs should definitely not assume that behavior."
>
> The article above that says the same thing a couple of different ways,
> ie that writeback failure leaves you with pages that are neither
> written to disk successfully nor marked dirty.
>
> If I'm reading various articles correctly, the situation was even
> worse before his errseq_t stuff landed. That fixed cases of
> completely unreported writeback failures due to sharing of PG_error
> for both writeback and read errors with certain filesystems, but it
> doesn't address the clean pages problem.
>
> Yeah, I see why you want to PANIC.

In more ways than one ;)
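
To make the "clean pages problem" concrete, here is a toy model in Python. It is only a sketch of the behaviour described in the quoted text, not real kernel logic: `ToyPageCache` and everything in it are hypothetical names made up for illustration. It assumes the failed page is dropped from the dirty set and the error is reported exactly once.

```python
import errno


class ToyPageCache:
    """Toy model of the behaviour described above; NOT real kernel logic."""

    def __init__(self, disk_fails=False):
        self.disk_fails = disk_fails  # simulate a failing device
        self.dirty = {}               # page number -> data awaiting writeback
        self.disk = {}                # page number -> data that reached "disk"
        self.pending_error = None     # error to report at the next fsync

    def write(self, page, data):
        # Buffered write: the data only lands in the dirty set.
        self.dirty[page] = data

    def fsync(self):
        # Attempt writeback of every dirty page.
        for page, data in list(self.dirty.items()):
            if self.disk_fails:
                self.pending_error = errno.EIO
            else:
                self.disk[page] = data
            # Either way the page stops being dirty. On failure it is
            # forgotten even though it never reached disk.
            del self.dirty[page]
        # Report a pending error once, then clear it.
        if self.pending_error is not None:
            err, self.pending_error = self.pending_error, None
            raise OSError(err, "simulated writeback failure")


cache = ToyPageCache(disk_fails=True)
cache.write(1, b"checkpoint data")

try:
    cache.fsync()
    first_result = "ok"
except OSError:
    first_result = "EIO"

# The device "recovers" and the application naively retries.
cache.disk_fails = False
cache.fsync()
retry_result = "ok"

data_on_disk = 1 in cache.disk
print(first_result, retry_result, data_on_disk)
```

In this model the retry reports success while page 1 never reached the simulated disk, which is why retrying fsync() after a failure cannot be treated as proof of durability.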

> > I'm not seeking to defend what the kernel seems to be doing. Rather, saying
> > that we might see similar behaviour on other platforms, crazy or not. I
> > haven't looked past linux yet, though.
>
> I see no reason to think that any other operating system would behave
> that way without strong evidence... This is openly acknowledged to be
> "a mess" and "a surprise" in the Filesystem Summit article. I am not
> really qualified to comment, but from a cursory glance at FreeBSD's
> vfs_bio.c I think it's doing what you'd hope for... see the code near
> the comment "Failed write, redirty."

OK, that's reassuring, but it doesn't help us on the platform the great majority of users deploy on :(

"If on Linux, PANIC"

Hrm.
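
As a rough sketch of the policy under discussion (treat an fsync() failure with EIO, and possibly ENOSPC, as fatal rather than retrying), something like the following Python illustrates the shape. This is not PostgreSQL code: `fsync_or_panic`, `PanicError` and `FATAL_ERRNOS` are hypothetical names, and raising an exception merely stands in for a PANIC followed by crash recovery. Only `os.fsync` and the `errno` constants are real.

```python
import errno
import os

# Errors after which dirty data may already have been dropped (assumption
# taken from the discussion above: EIO for certain, ENOSPC possibly).
FATAL_ERRNOS = {errno.EIO, errno.ENOSPC}


class PanicError(RuntimeError):
    """Stand-in for a PANIC: stop and recover from the log, do not retry."""


def fsync_or_panic(fd, fsync=os.fsync):
    """Call fsync on fd; escalate EIO/ENOSPC instead of allowing a retry.

    The fsync parameter exists only so a failure can be simulated.
    """
    try:
        fsync(fd)
    except OSError as exc:
        if exc.errno in FATAL_ERRNOS:
            raise PanicError(
                f"fsync failed with {errno.errorcode[exc.errno]}; "
                "cached writes can no longer be trusted"
            ) from exc
        raise  # other errors are left to the caller


def _simulated_eio(fd):
    # Hypothetical failing fsync used to exercise the fatal path.
    raise OSError(errno.EIO, "simulated I/O error")


try:
    fsync_or_panic(0, fsync=_simulated_eio)
    outcome = "continued"
except PanicError:
    outcome = "panicked"
print(outcome)
```

A "PANIC only on Linux" variant could wrap the escalation in a `sys.platform.startswith("linux")` check, at the cost of assuming every other platform redirties failed pages.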

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
