Hi,
On Thursday, February 16, 2012 06:18:23 PM Dan Scales wrote:
> When running Postgres on a single ext3 filesystem on Linux, we find that
> the attached simple patch gives significant performance benefit (7-8% in
> numbers below). The patch adds a new option for wal_sync_method, which
> is "open_direct". With this option, the WAL is always opened with
> O_DIRECT (but not O_SYNC or O_DSYNC). For Linux, the use of only
> O_DIRECT should be correct. All WAL logs are fully allocated before
> being used, and the WAL buffers are 8K-aligned, so all direct writes are
> guaranteed to complete before returning. (See
> http://lwn.net/Articles/348739/)
I don't think that behaviour is safe in the face of write caches in the IO
path. Linux takes care to issue flush/barrier instructions when necessary if
you issue an fsync/fdatasync, but to my knowledge it does not when only
O_DIRECT is used (that would suck performance-wise).
I think that behaviour is safe if you have no externally visible write caching
enabled, but that's not exactly easy knowledge to obtain or to document.
Why should there otherwise be any performance difference between
O_DIRECT|O_SYNC and O_DIRECT in the WAL write case? There is no metadata that
needs to be written, and I have a hard time imagining that the check for
whether there is metadata is that expensive.
I guess a more interesting case would be comparing O_DIRECT|O_SYNC with
O_DIRECT + fdatasync() or even O_DIRECT +
sync_file_range(SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE |
SYNC_FILE_RANGE_WAIT_AFTER)
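
To make the variants concrete, here is a rough sketch of what I mean. It is
only an illustration, not PostgreSQL code: write_block(), its parameters and
BLOCK_SIZE are names I made up, the 8K size is taken from the alignment
mentioned above, and the fallback for filesystems that reject O_DIRECT is
only there so the sketch runs everywhere.

```c
#define _GNU_SOURCE             /* needed for O_DIRECT on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 8192         /* assumed block size (hypothetical constant) */

/*
 * Hypothetical helper: write one aligned block at offset 0 of "path".
 *
 * extra_open_flags: 0 for plain O_DIRECT, or O_SYNC / O_DSYNC.
 * call_fdatasync:   nonzero to call fdatasync() after the write.
 *
 * Returns 0 on success, -1 on failure.
 */
int
write_block(const char *path, int extra_open_flags, int call_fdatasync)
{
    void   *buf = NULL;
    int     fd;
    int     rc = -1;

    /* O_DIRECT needs a suitably aligned buffer. */
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0)
        return -1;
    assert((uintptr_t) buf % BLOCK_SIZE == 0);
    memset(buf, 'x', BLOCK_SIZE);

    fd = open(path, O_WRONLY | O_CREAT | O_DIRECT | extra_open_flags, 0600);
    if (fd < 0 && errno == EINVAL)
    {
        /* Some filesystems reject O_DIRECT; retry without it for the demo. */
        fd = open(path, O_WRONLY | O_CREAT | extra_open_flags, 0600);
    }

    if (fd >= 0)
    {
        if (pwrite(fd, buf, BLOCK_SIZE, 0) == (ssize_t) BLOCK_SIZE &&
            (!call_fdatasync || fdatasync(fd) == 0))
            rc = 0;
        if (close(fd) != 0)
            rc = -1;
    }

    free(buf);
    return rc;
}
```

write_block(path, O_SYNC, 0) corresponds to O_DIRECT|O_SYNC,
write_block(path, 0, 1) to O_DIRECT + fdatasync(), and write_block(path, 0, 0)
to the plain O_DIRECT case I am doubting above. A real comparison would of
course preallocate the file, as is done for WAL segments, and time many writes
in a loop.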
Any special reason you did that comparison on ext3? Especially with
data=ordered, its behaviour regarding syncs is pretty insane performance-wise.
Ext4 would be a bit more interesting...
Andres