Re: O_DSYNC broken on MacOS X? - Mailing list pgsql-hackers
From: Darren Duncan
Subject: Re: O_DSYNC broken on MacOS X?
Msg-id: 4CA54F79.6030508@darrenduncan.net
In response to: Re: O_DSYNC broken on MacOS X? (Greg Smith <greg@2ndquadrant.com>)
Responses: Re: O_DSYNC broken on MacOS X?
List: pgsql-hackers
Greg Smith wrote:
> You didn't quote the next part of that, which says "fsync() is not
> sufficient to guarantee that your data is on stable storage and on MacOS X
> we provide a fcntl(), called F_FULLFSYNC, to ask the drive to flush all
> buffered data to stable storage." That's exactly what turning on
> fsync_writethrough does in PostgreSQL. See
> http://archives.postgresql.org/pgsql-hackers/2005-04/msg00390.php as the
> first post on this topic that ultimately led to that behavior being
> implemented.
>
> From the perspective of the database, whether or not the behavior is
> standards compliant isn't the issue. Whether pages make it to physical
> disk or not when fsync is called, or when O_DSYNC writes are done on
> platforms that support them, is the important part. If the OS doesn't do
> that, it is doing nothing useful from the perspective of the database's
> expectations. And that's not true on Darwin unless you specify
> F_FULLFSYNC, which doesn't happen by default in PostgreSQL. It only does
> that when you switch wal_sync_method=fsync_writethrough.

Greg Smith also wrote:
> The main downside to switching the default on either OS X or Windows is
> that developers using those platforms for test deployments will suffer
> greatly from a performance drop for data they don't really care about. As
> those two in particular are much more likely to be client development
> platforms, too, that's a scary thing to consider.

I think that, bottom line, Postgres should default to whatever the safest and most reliable behavior is on each platform, because data integrity is the most important thing: ensuring that a commit that returns has actually written its data to disk. If performance is worse, then so what? Code that does nothing has the best performance of all, and is also generally useless.

Whenever there is a tradeoff to be made between reliability and speed, users should have to explicitly choose the less reliable option, which would demonstrate that they know what they're doing. Let testers explicitly choose a faster and less reliable option for data they don't care about; otherwise, by default, users who don't know better should get the safest option, for data they likely do care about. That is a DBMS priority.

This matter reminds me of a discussion on the SQLite list years ago about whether pragma synchronous=normal or synchronous=full should be the default, and thankfully 'full' won.

-- Darren Duncan
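[Editor's note: for readers unfamiliar with the Darwin-specific call discussed above, here is a minimal C sketch, not taken from either post, of the difference between a plain fsync() and the F_FULLFSYNC fcntl that fsync_writethrough enables. The file name, payload, and error handling are illustrative only.]

```c
/*
 * Minimal sketch of the flush semantics discussed in the thread.
 * On Darwin, fsync() pushes data to the drive but the drive may
 * only hold it in its volatile write cache; the F_FULLFSYNC fcntl
 * asks the drive itself to flush to stable storage. This is roughly
 * what wal_sync_method=fsync_writethrough arranges in PostgreSQL.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("wal_segment_demo", O_WRONLY | O_CREAT, 0600);
    if (fd < 0) { perror("open"); return 1; }

    const char buf[] = "commit record";
    if (write(fd, buf, sizeof(buf) - 1) < 0) { perror("write"); return 1; }

    /* Pushes data toward the disk, but the drive may merely cache it. */
    if (fsync(fd) < 0) { perror("fsync"); return 1; }

#ifdef F_FULLFSYNC
    /* Darwin only: request a flush of the drive's own write cache. */
    if (fcntl(fd, F_FULLFSYNC, 0) < 0) perror("fcntl(F_FULLFSYNC)");
#endif

    close(fd);
    return 0;
}
```

On the PostgreSQL side, opting into this behavior is a one-line configuration change, wal_sync_method = fsync_writethrough, which is the setting whose default the thread is debating.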