On Wed, Oct 14, 2020 at 02:48:18PM +1300, Thomas Munro wrote:
> On Wed, Oct 14, 2020 at 12:53 AM Michael Banck
> <michael.banck@credativ.de> wrote:
>> One question about this: Did you consider the case of a basebackup being
>> copied/restored somewhere and the restore/PITR being started? Shouldn't
>> Postgres then sync the whole data directory first in order to assure
>> durability, or do we consider this to be on the tool that does the
>> copying? Or is this not needed somehow?
>
> To go with precise fsyncs, we'd have to say that it's the job of the
> creator of the secondary copy. Unfortunately that's not a terribly
> convenient thing to do (or at least the details vary).

Yeah, it is safer to assume that it is the responsibility of the backup tool to ensure durability, because a host could be unplugged just after taking a backup, and having Postgres do this work at the beginning of archive recovery would not help in most cases.
Let's document that assumption in the pg_basebackup docs and in the docs on creating a replica from a file system copy, with a reference to initdb's datadir sync option.
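To make "the tool that does the copying syncs the result" concrete, here is a minimal sketch of what such a tool might do after the copy finishes. It is only an illustration under a few assumptions: a POSIX system where a directory can be opened read-only and fsync'd, and hypothetical helper names (fsync_path, sync_tree) that do not come from Postgres or any existing backup tool. It does not follow symlinks, so tablespaces reached through links would need their own pass, and it does not sync the parent of the top-level directory.

```python
import os


def fsync_path(path: str) -> None:
    """Flush one file or directory to stable storage (POSIX assumption)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def sync_tree(root: str) -> int:
    """Hypothetical helper: fsync every regular file and directory under root.

    Walks bottom-up so that each directory is synced after its contents.
    Symlinks and special files are skipped. Returns the number of paths synced.
    """
    synced = 0
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Only regular files; skip symlinks, FIFOs, sockets and the like.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            fsync_path(full)
            synced += 1
        # Sync the directory itself so its entries are durable too.
        fsync_path(dirpath)
        synced += 1
    return synced
```

A copy script would call sync_tree() on the restored data directory before reporting success, which keeps the cost on the backup side rather than at the start of recovery.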
IMO this comes back to the point where we usually should not care much how long a backup takes as long as it is done right. Users care much more about how long a restore takes until consistency is reached. And this is in line with things that have been done via bc34223b or 96a7128.
--
Michael