> TL;DR: if you're removing files from a directory that you've got an
> active readdir() running through, you might not actually get all of the
> *existing* files. Given that PG is happy to remove files from PGDATA
> while a backup is running, in theory this could lead to a backup utility
> like pgbackrest or pg_basebackup not actually backing up all the files.
>
> Now, pgbackrest runs the readdir() very quickly to build a manifest of
> all of the files to backup, minimizing the window for this to possibly
> happen, but pg_basebackup keeps a readdir() open during the entire
> backup, making this more possible.
Hmm, this sounds pretty bad, and I agree that a workaround should be put in place. But where is pg_basebackup looping around readdir()? I couldn't find it. There's a call to readdir() in FindStreamingStart(), but that doesn't seem to match what you describe.
It’s the server side that does it, in basebackup.c, when it’s building the tarball for the data directory and each tablespace and sending it to the client. It’s not done by src/bin/pg_basebackup. Sorry for not being clear. Technically this goes beyond just pg_basebackup: it could potentially affect anything using BASE_BACKUP from the replication protocol (in addition to other backup tools which operate against the data directory with readdir(), of course).
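
To make the difference between the two patterns concrete, here is a rough sketch of the "read the directory quickly, then do the slow work" approach. This is not the basebackup.c code and not how pgbackrest is actually implemented; `Manifest`, `build_manifest()`, `process_manifest()` and `free_manifest()` are made-up names, and skipping files that vanish with ENOENT is an assumption for illustration. It only narrows the window in which a concurrent unlink can confuse readdir(); it does not close it.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical manifest: a growable array of entry names. */
typedef struct Manifest
{
    char  **names;
    size_t  count;
    size_t  capacity;
} Manifest;

void
free_manifest(Manifest *m)
{
    for (size_t i = 0; i < m->count; i++)
        free(m->names[i]);
    free(m->names);
    m->names = NULL;
    m->count = 0;
    m->capacity = 0;
}

/*
 * Read all entries of dirpath in one tight loop and close the directory
 * before any slow work starts.  Returns 0 on success, -1 on error.
 */
int
build_manifest(const char *dirpath, Manifest *m)
{
    DIR           *dir;
    struct dirent *de;
    int            saved_errno = 0;

    assert(dirpath != NULL && m != NULL);
    m->names = NULL;
    m->count = 0;
    m->capacity = 0;

    dir = opendir(dirpath);
    if (dir == NULL)
        return -1;

    for (;;)
    {
        size_t len;
        char  *copy;

        errno = 0;              /* lets us tell end-of-directory from error */
        de = readdir(dir);
        if (de == NULL)
        {
            saved_errno = errno;
            break;
        }
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        if (m->count == m->capacity)
        {
            size_t  newcap = m->capacity ? m->capacity * 2 : 16;
            char  **tmp = realloc(m->names, newcap * sizeof(char *));

            if (tmp == NULL)
            {
                saved_errno = ENOMEM;
                break;
            }
            m->names = tmp;
            m->capacity = newcap;
        }
        len = strlen(de->d_name) + 1;
        copy = malloc(len);
        if (copy == NULL)
        {
            saved_errno = ENOMEM;
            break;
        }
        memcpy(copy, de->d_name, len);
        m->names[m->count++] = copy;
    }
    closedir(dir);

    if (saved_errno != 0)
    {
        free_manifest(m);
        errno = saved_errno;
        return -1;
    }
    return 0;
}

/*
 * Slow phase, run with no directory handle open.  Files removed since the
 * manifest was built are counted in *vanished and skipped.  Returns the
 * number of files opened successfully, or -1 on an unexpected error.
 */
long
process_manifest(const char *dirpath, const Manifest *m, size_t *vanished)
{
    char path[4096];
    long done = 0;

    *vanished = 0;
    for (size_t i = 0; i < m->count; i++)
    {
        FILE *fp;
        int   n = snprintf(path, sizeof(path), "%s/%s", dirpath, m->names[i]);

        if (n < 0 || (size_t) n >= sizeof(path))
            return -1;
        fp = fopen(path, "rb");
        if (fp == NULL)
        {
            if (errno == ENOENT)
            {
                (*vanished)++;
                continue;
            }
            return -1;
        }
        /* A real tool would copy the file contents here. */
        fclose(fp);
        done++;
    }
    return done;
}
```

The pattern that worries me is the opposite one: the copy step sitting inside the readdir() loop, so the directory stream stays open for as long as the whole backup takes.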