Martijn van Oosterhout <kleptog@svana.org> writes:
> My problem is that journalling works on a per-file basis, i.e., the data for a
> file is written before that file's metadata. However, the fsync is used for
> the WAL segments and if you can't guarantee the WAL will hit the disk before
> the data segments (different files), you're stuffed I think.
> Or maybe WAL is not that sensitive to that kind of reordering. Maybe it only
> depends on the WAL being consistent.
The entire *point* of WAL is that WAL entries must hit disk before any
of the data-file changes they describe (that's why it's called write
AHEAD log). Without this you can't use WAL replay to ensure the data
files are brought to a fully consistent state. So yes, we do have to
have cross-file write ordering guarantees. fsync is a pretty blunt tool
for enforcing cross-file write ordering, but it's the only one
available...
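
A minimal sketch of that ordering, for illustration only: the helper
name write_ahead, the file paths, and the record/page formats are all
hypothetical, and this is not how the backend actually does it. It just
shows the sequence "write WAL, fsync WAL, then write the data file",
with fsync serving as the barrier between the two files.

```python
import os


def write_ahead(wal_path, data_path, record, page):
    """Hypothetical helper: log the change, fsync the log, then touch the data file.

    Returns the list of steps in the order they were performed.
    """
    steps = []

    # 1. Append the WAL record that describes the upcoming data-file change.
    wal_fd = os.open(wal_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(wal_fd, record)
        steps.append("wal_write")

        # 2. Ask the OS to push the WAL record to stable storage. Nothing
        #    below runs until fsync returns, which is what gives us the
        #    cross-file ordering.
        os.fsync(wal_fd)
        steps.append("wal_fsync")
    finally:
        os.close(wal_fd)

    # 3. Only now modify the data file. No fsync is needed here for
    #    ordering: if this write is lost in a crash, the WAL record
    #    already on disk is enough to redo it.
    data_fd = os.open(data_path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(data_fd, page)
        steps.append("data_write")
    finally:
        os.close(data_fd)

    return steps
```

Note that the sketch assumes fsync really does not return until the
data is on stable storage; a drive with a volatile write cache can
break that assumption regardless of what the code does.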
regards, tom lane