From: Andres Freund [mailto:andres@anarazel.de]
> Indeed. My past experience with open_datasync on linux shows it to be slower
> by roughly an order of magnitude. Even if that would turn out not to be
> the case anymore, I'm *extremely* hesitant to make such a change.
Thanks for the quick feedback. An order of magnitude is surprising. Can you share the environment (Linux distro
version, kernel version, filesystem, mount options, workload, etc.)? Can you think of anything that explains the
degradation? I think it is reasonable that open_datasync is faster than fdatasync because:
* Short transactions like those in pgbench require fewer system calls: write() alone vs. write()+fdatasync().
* fdatasync() probably has to scan the page cache for dirty pages.
The above differences should be invisible on slow disks, but they will show up on faster storage. I guess that's why
Robert said open_datasync was much faster on NVRAM.
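
To make the system call difference concrete, here is a minimal sketch of the two commit paths. It is not PostgreSQL's actual WAL code; the helper names (open_wal_file, commit_with_fdatasync, commit_with_open_datasync) and the BLOCK_SIZE constant are made up for illustration, and it assumes a Linux/POSIX environment where O_DSYNC and fdatasync() are available.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose fdatasync() and O_DSYNC */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192   /* illustrative size of one flushed WAL block */

/* Hypothetical helper: open a file the way each sync method would.
 * use_o_dsync != 0 mimics open_datasync; 0 mimics fdatasync. */
static int open_wal_file(const char *path, int use_o_dsync)
{
    int flags = O_WRONLY | O_CREAT | O_TRUNC;

    if (use_o_dsync)
        flags |= O_DSYNC;   /* each write() returns only after the data is durable */
    return open(path, flags, 0600);
}

/* fdatasync style: two system calls per commit.
 * Returns the number of system calls issued, or -1 on error. */
static int commit_with_fdatasync(int fd, const char *buf, size_t len)
{
    if (write(fd, buf, len) != (ssize_t) len)
        return -1;
    if (fdatasync(fd) != 0)
        return -1;
    return 2;
}

/* open_datasync style: fd must have been opened with O_DSYNC,
 * so a single write() both transfers and flushes the data.
 * Returns the number of system calls issued, or -1 on error. */
static int commit_with_open_datasync(int fd, const char *buf, size_t len)
{
    if (write(fd, buf, len) != (ssize_t) len)
        return -1;
    return 1;
}
```

The sketch only shows the call pattern per commit. Whether the saved system call translates into a measurable gain depends on the kernel, filesystem, and storage, which is exactly what pg_test_fsync is meant to measure.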
The manual says that pg_test_fsync is a tool for selecting the wal_sync_method value, and it indicates open_datasync is
better. Why is fdatasync the default value only on Linux? As a user, I don't understand why PostgreSQL special-cases
Linux. If the current behavior of choosing fdatasync by default is due to some deficiency in old kernels and/or
filesystems, I think we can change the default so that most users don't have to change wal_sync_method.
Regards
Takayuki Tsunakawa