On Thu, Sep 30, 2021 at 12:49:36PM +0900, Michael Paquier wrote:
> On Wed, Sep 29, 2021 at 07:43:41PM -0500, Justin Pryzby wrote:
> > Forking this thread in which Thomas implemented syncfs for the startup process
> > (61752afb2).
> >
> > https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BSG9jSW3ekwib0cSdC0yD-jReJ21X4bZAmqxoWTLTc2A%40mail.gmail.com
> >
> > Is there any reason that initdb/pg_basebackup/pg_checksums/pg_rewind shouldn't
> > use syncfs() ?
>
> That makes sense.
>
> > do_syncfs() is in src/backend/ so would need to be duplicated^Wimplemented in
> > common.
>
> The fd handling in the backend makes things tricky if trying to plug
> in a common interface, so I'd rather do that as this is frontend-only
> code.
>
> > They can't use the GUC, so need to add a cmdline option or look at an
> > environment variable.
>
> fsync_pgdata() is going to manipulate many inodes anyway, because
> that's a code path designed to do so. If we know that syncfs() is
> just going to be better, I'd rather just call it by default if
> available and not add new switches to all the frontend tools in need
> of flushing the data folder, switches that are not documented in your
> patch.

It is a draft/POC, after all.

The argument against using syncfs by default is that it could be worse than
recursive fsync if a tiny 200MB postgres instance lives on a shared filesystem
along with other, larger applications (maybe a larger postgres instance),
since syncfs() flushes everything dirty on that filesystem, not only the data
directory.

There's also an argument that syncfs might be unreliable in the case of a write
error.  (But I agreed with Thomas' earlier assessment: that claim carries little
weight, since fsync() itself wasn't reliable for 20-some years.)

I didn't pursue this patch, as it's easier for me to use /bin/sync -f.  Someone
should adopt it if interested.
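
For anyone picking this up, here is a minimal sketch of the two strategies
being compared, assuming Linux with glibc (syncfs() is not portable).  The
names flush_data_dir(), flush_with_syncfs() and fsync_one() are made up for
illustration; this is neither the code from the draft patch nor the backend's
do_syncfs(), and unlike real code it does not chase symlinks to other
filesystems.

```c
#define _GNU_SOURCE				/* exposes syncfs() and nftw() on glibc */
#include <assert.h>
#include <fcntl.h>
#include <ftw.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * nftw() callback (hypothetical helper): fsync one regular file or
 * directory.  Symlinks and special files are skipped.
 */
static int
fsync_one(const char *path, const struct stat *sb, int type, struct FTW *ftw)
{
	int			fd;
	int			rc;

	(void) ftw;
	if (type == FTW_F && !S_ISREG(sb->st_mode))
		return 0;				/* FIFO, socket, device: nothing to flush */
	if (type != FTW_F && type != FTW_D && type != FTW_DP)
		return 0;				/* symlink or unreadable entry */

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	rc = fsync(fd);
	close(fd);
	return rc;
}

/*
 * One system call, but it flushes the whole filesystem containing "dir",
 * including data that belongs to unrelated applications.
 */
static int
flush_with_syncfs(const char *dir)
{
	int			fd = open(dir, O_RDONLY | O_DIRECTORY);
	int			rc;

	if (fd < 0)
		return -1;
	rc = syncfs(fd);
	close(fd);
	return rc;
}

/*
 * Flush "dir" either with syncfs() or by walking the tree and calling
 * fsync() on each file and directory.  Returns 0 on success, nonzero on
 * failure.
 */
int
flush_data_dir(const char *dir, bool use_syncfs)
{
	assert(dir != NULL);
	if (use_syncfs)
		return flush_with_syncfs(dir);
	/* FTW_PHYS: do not follow symlinks; FTW_DEPTH: files before their dir */
	return nftw(dir, fsync_one, 64, FTW_PHYS | FTW_DEPTH);
}
```

A frontend tool would choose the second argument from whatever switch or
environment variable ends up being used, e.g. flush_data_dir(pgdata, true).
The recursive branch costs one open/fsync/close per inode, which is what
syncfs avoids; the syncfs branch costs whatever else happens to be dirty on
the same filesystem.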
--
Justin