Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS - Mailing list pgsql-hackers

From	Craig Ringer
Subject	Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date	April 9, 2018 04:35:06
Msg-id	CAMsr+YGFqgMWdfM2kOMCRyMofYe8zwsEuHZ3vc+rzSZpPty0Eg@mail.gmail.com
In response to	Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS (Bruce Momjian <bruce@momjian.us>)
List	pgsql-hackers

On 9 April 2018 at 06:29, Bruce Momjian <bruce@momjian.us> wrote:

> I think the big problem is that we don't have any way of stopping
> Postgres at the time the kernel reports the errors to the kernel log, so
> we are then returning potentially incorrect results and committing
> transactions that might be wrong or lost.

Right.

Specifically, we need a way to ask the kernel at checkpoint time "was everything written to [this set of files] flushed successfully since the last time I asked, no matter who did the writing and no matter how the writes were flushed?"

If the result is "no", we PANIC and redo from WAL. If the hardware/volume is screwed, the user can fail over to a standby, do PITR, etc.

But we don't have any way to ask that reliably at present.
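
To make the shape of that checkpoint-time question concrete, here is a rough Python sketch of the control flow only: flush every file in the set, and treat any failure as fatal. The names checkpoint_flush and CheckpointFlushError are hypothetical, and the injectable fsync parameter exists only so a failure can be simulated. Note that calling fsync() on a freshly opened descriptor is merely a stand-in here; as said above, it is not a reliable way to learn about earlier writeback failures, which is exactly the missing piece.

```python
import os


class CheckpointFlushError(Exception):
    """Hypothetical stand-in for PANIC: stop and run crash recovery."""


def checkpoint_flush(paths, fsync=os.fsync):
    """Flush every file in `paths`; raise if any flush reports an error.

    `fsync` is injectable purely for demonstration. This is not how
    PostgreSQL implements checkpoints, only an outline of the idea.
    """
    failed = []
    for path in paths:
        fd = os.open(path, os.O_RDWR)
        try:
            fsync(fd)
        except OSError as exc:
            # Remember which file failed and why; keep checking the rest.
            failed.append((path, exc.errno))
        finally:
            os.close(fd)
    if failed:
        # Do not retry and carry on: a later fsync() succeeding would not
        # prove the earlier writes reached stable storage.
        raise CheckpointFlushError(failed)
```

A caller would invoke checkpoint_flush(list_of_relation_files) at checkpoint time and, on CheckpointFlushError, abort and replay WAL rather than marking the checkpoint complete.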

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
