
Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS - Mailing list pgsql-hackers

From	Gasper Zejn
Subject	Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date	April 4, 2018 23:23:58
Msg-id	7fc5f714-1864-f928-851b-87fc4218c72a@owca.info
In response to	Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS (Bruce Momjian <bruce@momjian.us>)
List	pgsql-hackers



On 04. 04. 2018 15:49, Bruce Momjian wrote:
> I can understand why kernel developers don't want to keep failed sync
> buffers in memory, and once they are gone we lose reporting of their
> failure.  Also, if the kernel is going to not retry the syncs, how long
> should it keep reporting the sync failure?  To the first fsync that
> happens after the failure?  How long should it continue to record the
> failure?  What if no fsync() ever happens, which is likely for
> non-Postgres workloads?  I think once they decided to discard failed
> syncs and not retry them, the fsync behavior we are complaining about
> was almost required.
Ideally the kernel would keep its data for as little time as possible.
With fsync, it doesn't really know which process is interested in
knowing about a write error; it just assumes the caller will know how to
deal with it. The most unfortunate issue is that there's no way to get
information about a write error.
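
To make the point concrete, here is a minimal sketch of what "the caller
will know how to deal with it" looks like from userspace. It is only an
illustration in Python, not how PostgreSQL does it; the names
write_durably and try_write are made up for this example. The assumption,
following the discussion in this thread, is that an error returned by
fsync may be the only report the process ever gets, so the caller does not
retry fsync and hope for a clean result.

```python
import os


def write_durably(path: str, payload: bytes) -> None:
    """Write payload to path and fsync it; raises OSError on any failure.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(payload)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # write() normally only copies into the page cache; a write-back
        # error is reported to this process here, as an OSError.
        os.fsync(fd)
    finally:
        os.close(fd)


def try_write(path: str, payload: bytes) -> bool:
    """Return True if the data was written and synced, False otherwise.

    On failure the caller should assume the data is NOT on disk and
    rewrite it from its own copy, not call fsync again and trust a
    later success (assumption based on the behaviour discussed here).
    """
    try:
        write_durably(path, payload)
    except OSError as exc:
        print(f"durable write to {path!r} failed: {exc}")
        return False
    return True
```

A caller would treat a False result as "the data never reached the disk",
for example by rewriting the file from its own copy or by aborting.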

Thinking aloud - couldn't/shouldn't a write error also be a file system
event reported by inotify? Admittedly that's only a thing on Linux, but
still.


Kind regards,
Gasper
