Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date
Msg-id CAMsr+YG-aP_LQ8muweGEyroOggEx5wVD2sFup9VmQGrkD0ZAAA@mail.gmail.com
Whole thread Raw
In response to Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS  (Craig Ringer <craig@2ndquadrant.com>)
Responses Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
List pgsql-hackers
On 4 April 2018 at 22:00, Craig Ringer <craig@2ndquadrant.com> wrote:
 
It's the error reporting issues around closing and reopening files with outstanding buffered I/O that's really going to hurt us here. I'll be expanding my test case to cover that shortly.


Also, just to be clear, this is not in any way confined to xfs and/or lvm as I originally thought it might be. 

Nor is ext3/ext4's errors=remount-ro protective. data_err=abort doesn't help either (so what does it do?).

What bewilders me is that running with data=journal doesn't seem to be safe either. WTF? 

[26438.846111] EXT4-fs (dm-0): mounted filesystem with journalled data mode. Opts: errors=remount-ro,data_err=abort,data=journal
[26454.125319] EXT4-fs warning (device dm-0): ext4_end_bio:323: I/O error 10 writing to inode 12 (offset 0 size 0 starting block 59393)
[26454.125326] Buffer I/O error on device dm-0, logical block 59393
[26454.125337] Buffer I/O error on device dm-0, logical block 59394
[26454.125343] Buffer I/O error on device dm-0, logical block 59395
[26454.125350] Buffer I/O error on device dm-0, logical block 59396

and splat, there goes your data anyway.

It's possible that this is in some way related to using the device-mapper "error" target and a loopback device in testing. But I don't really see how.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

Previous
From: Teodor Sigaev
Date:
Subject: Re: json(b)_to_tsvector with numeric values
Next
From: Stephen Frost
Date:
Subject: Re: Rewrite of pg_dump TAP tests