Re: pgsql-server/src backend/storage/buffer/bufmgr ... - Mailing list pgsql-committers

From Tom Lane
Subject Re: pgsql-server/src backend/storage/buffer/bufmgr ...
Date
Msg-id 14205.1075146232@sss.pgh.pa.us
In response to Re: pgsql-server/src backend/storage/buffer/bufmgr ...  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: pgsql-server/src backend/storage/buffer/bufmgr ...
Re: pgsql-server/src backend/storage/buffer/bufmgr ...
List pgsql-committers
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> As I've said before, I think we need to find a way to stop using sync()
>> altogether --- we have to move to fsync or O_SYNC and variants.  sync
>> has simply got the wrong API.

> If sync fails (kernel to disk write fails) we have a hardware failure,
> and we don't pretend to recover from that,

Not necessarily --- it could be out-of-disk-space, on at least some
filesystems.  More to the point, the important thing is not to commit a
checkpoint record to WAL indicating that everything is good, when
everything is not good.  As long as we don't checkpoint we have some
hope of recovering automatically via WAL replay.
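
To illustrate the ordering I mean, here is a rough sketch, not code from
the tree. The names flush_all, checkpoint_if_durable and
write_checkpoint_record are made up for illustration; the only real calls
are fsync() and errno. The point is just that the checkpoint record gets
written only after every flush has been confirmed.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper, not PostgreSQL code: fsync every descriptor in
 * fds[].  Returns 0 if all fsync() calls succeeded, otherwise the errno
 * of the first failure (for example ENOSPC or EIO).
 */
int flush_all(const int *fds, size_t nfds)
{
    for (size_t i = 0; i < nfds; i++) {
        if (fsync(fds[i]) != 0)
            return errno;
    }
    return 0;
}

/*
 * Hypothetical checkpoint driver: the callback that would append the
 * checkpoint record is invoked only when every flush was confirmed.
 * On failure no checkpoint is written, so WAL replay from the previous
 * checkpoint remains possible.
 */
int checkpoint_if_durable(const int *fds, size_t nfds,
                          int (*write_checkpoint_record)(void))
{
    int err = flush_all(fds, nfds);

    if (err != 0)
        return err;             /* do not claim "everything is good" */
    return write_checkpoint_record();
}
```

With sync() there is nothing to put in the place of the err check, which
is the whole problem.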

> One idea I floated around was to
> open/write/fsync/close a temporary file after sync in the hope that it
> would happen after the sync completes because the fsync would be at the
> end of the disk flush queue.

"In the hope"?  We already have a guess-and-hope approach to this, and
it will never be any better as long as we use sync(), because sync() is
fundamentally the wrong operation.  It doesn't tell you when the I/O is
done, and it doesn't tell you whether the I/O was done successfully, and
there is no possibility of working around that fundamental lack of
information except to stop using it.
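
For comparison, a minimal sketch of what fsync() offers that sync() cannot.
The helper name write_durably is an assumption made up for this example;
open(), write(), fsync() and close() are the standard POSIX calls. sync()
is declared as "void sync(void)" and POSIX only requires it to schedule the
writes, whereas fsync() blocks for one descriptor and reports failure
through its return value and errno.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write buf to path and confirm it reached storage.
 * Returns 0 on success, otherwise an errno value describing the failure.
 */
int write_durably(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return errno;

    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing; retry */
            int err = errno;
            close(fd);
            return err;
        }
        p += n;
        len -= (size_t) n;
    }

    /* Unlike sync(), this tells us both when and whether the I/O completed. */
    if (fsync(fd) != 0) {
        int err = errno;
        close(fd);
        return err;
    }
    if (close(fd) != 0)
        return errno;
    return 0;
}
```

A caller can act on the result, for example by refusing to proceed when
write_durably() returns ENOSPC. No equivalent check can be written around
sync().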

            regards, tom lane
