On Sat, Jul 27, 2019 at 2:27 PM Andres Freund <andres@anarazel.de> wrote:
> Note that neither of those mean that it's not a good idea to
> posix_fallocate() and *then* write zeroes, when initializing. For
> several filesystems that's more likely to result in more optimally sized
> filesystem extents, reducing fragmentation. And without an intervening
> f[data]sync, there's not much additional metadata journalling. Although
> that's less of an issue on some newer filesystems, IIRC (due to delayed
> allocation).

Interesting. One way to bring back posix_fallocate() without
upsetting people on some filesystems out there would be to turn the
new wal_init_zero GUC into a choice:

* write: current default, and current behaviour for 'on'
* pwrite_hole: write just the final byte, current behaviour for 'off'
* posix_fallocate: like that 2013 patch that was reverted
* posix_fallocate_and_write: do both as you said, to try to solve
  that problem you mentioned that led to the revert
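
To make the four modes concrete, here is a rough standalone sketch in
plain POSIX C. It is not PostgreSQL code and not a patch: InitMethod,
write_zeroes() and init_file() are names invented for illustration,
the 8192-byte buffer is an arbitrary choice, and error handling is
minimal.

```c
#define _POSIX_C_SOURCE 200809L /* for pwrite() and posix_fallocate() */
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical values for an enum-style wal_init_zero (names from this mail). */
typedef enum
{
    INIT_WRITE,                     /* write zeroes over the whole file */
    INIT_PWRITE_HOLE,               /* write only the final byte */
    INIT_POSIX_FALLOCATE,           /* reserve space, write nothing */
    INIT_POSIX_FALLOCATE_AND_WRITE  /* reserve space, then write zeroes */
} InitMethod;

/* Fill bytes [0, size) of fd with zeroes, one buffer at a time. */
static int
write_zeroes(int fd, off_t size)
{
    char        buf[8192];
    off_t       off = 0;

    memset(buf, 0, sizeof(buf));
    while (off < size)
    {
        size_t      want = sizeof(buf);
        ssize_t     n;

        if ((off_t) want > size - off)
            want = (size_t) (size - off);
        n = pwrite(fd, buf, want, off);
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted, retry the same range */
        if (n <= 0)
        {
            if (n == 0)
                errno = ENOSPC; /* assumption: a zero-length write means no space */
            return -1;
        }
        off += n;
    }
    return 0;
}

/* Initialize a new file to 'size' bytes; returns 0 on success, -1 with errno set. */
static int
init_file(int fd, off_t size, InitMethod method)
{
    int         rc;

    if (size <= 0)
    {
        errno = EINVAL;
        return -1;
    }

    switch (method)
    {
        case INIT_WRITE:
            return write_zeroes(fd, size);

        case INIT_PWRITE_HOLE:
            /* One zero byte at the end; the rest may remain a hole. */
            return pwrite(fd, "", 1, size - 1) == 1 ? 0 : -1;

        case INIT_POSIX_FALLOCATE:
        case INIT_POSIX_FALLOCATE_AND_WRITE:
            /* posix_fallocate() returns an error number; it does not set errno. */
            rc = posix_fallocate(fd, 0, size);
            if (rc != 0)
            {
                errno = rc;
                return -1;
            }
            if (method == INIT_POSIX_FALLOCATE)
                return 0;
            /* No fsync in between: allocate first, then overwrite with zeroes. */
            return write_zeroes(fd, size);
    }

    errno = EINVAL;
    return -1;
}
```

In the posix_fallocate_and_write case the space is reserved first and
then overwritten with zeroes, with no f[data]sync in between, which is
the ordering described in the quoted text.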

I suppose there'd be a parallel GUC undo_init_zero, or some more
general GUC, called something like file_init_zero, for any fixed-size
preallocated files like that (for example if someone were to decide
to do the same for SLRU files instead of growing them
block-by-block).

--
Thomas Munro
https://enterprisedb.com