Re: Checking for missing heap/index files - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Checking for missing heap/index files
Date
Msg-id CAOuzzgp3CQJvWugOB2txYsdUOmgWNrL4GoNNZG1nL6Ytbu9h-Q@mail.gmail.com
In response to Re: Checking for missing heap/index files  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: Checking for missing heap/index files
List pgsql-hackers
Greetings,

On Fri, Jun 17, 2022 at 14:32 Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> On 2022-Jun-09, Stephen Frost wrote:
>
> > TL;DR: if you're removing files from a directory that you've got an
> > active readdir() running through, you might not actually get all of the
> > *existing* files.  Given that PG is happy to remove files from PGDATA
> > while a backup is running, in theory this could lead to a backup utility
> > like pgbackrest or pg_basebackup not actually backing up all the files.
> >
> > Now, pgbackrest runs the readdir() very quickly to build a manifest of
> > all of the files to backup, minimizing the window for this to possibly
> > happen, but pg_basebackup keeps a readdir() open during the entire
> > backup, making this more possible.
>
> Hmm, this sounds pretty bad, and I agree that a workaround should be put
> in place.  But where is pg_basebackup looping around readdir()?  I
> couldn't find it.  There's a call to readdir() in FindStreamingStart(),
> but that doesn't seem to match what you describe.

It's the server side that does it, in basebackup.c, when it's building the tarball for the data directory and each tablespace and sending it to the client. It's not done by src/bin/pg_basebackup. Sorry for not being clear.

Technically this goes beyond just pg_basebackup: it could potentially impact anything using BASE_BACKUP from the replication protocol (in addition to other backup tools that operate against the data directory with readdir(), of course).
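
To make the "short readdir() window" idea concrete, here's a rough sketch of reading all of the entry names in one tight loop and closing the directory before any file contents are touched. This is only an illustration, not what basebackup.c or pgbackrest actually do; DirSnapshot, snapshot_dir() and free_snapshot() are names made up for this example. Note that it only narrows the window in which a concurrent unlink() can cause an existing file to be skipped; it doesn't close it.

```c
#include <dirent.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for this sketch (not a PostgreSQL structure). */
typedef struct DirSnapshot
{
    char  **names;   /* malloc'd copies of the entry names */
    size_t  count;   /* number of valid entries in names[] */
} DirSnapshot;

/* Release everything held by a snapshot and reset it to empty. */
void
free_snapshot(DirSnapshot *snap)
{
    for (size_t i = 0; i < snap->count; i++)
        free(snap->names[i]);
    free(snap->names);
    snap->names = NULL;
    snap->count = 0;
}

/*
 * Collect all entry names of 'path' (except "." and "..") in one pass and
 * close the directory right away, so the readdir() scan is as short as
 * possible.  Returns 0 on success, -1 on failure with errno set.
 */
int
snapshot_dir(const char *path, DirSnapshot *snap)
{
    DIR    *dir;
    size_t  cap = 0;
    int     saved_errno = 0;

    snap->names = NULL;
    snap->count = 0;

    dir = opendir(path);
    if (dir == NULL)
        return -1;

    for (;;)
    {
        struct dirent *de;
        size_t  len;
        char   *copy;

        errno = 0;          /* readdir() returns NULL on both EOF and error */
        de = readdir(dir);
        if (de == NULL)
        {
            saved_errno = errno;
            break;
        }
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        if (snap->count == cap)
        {
            size_t  newcap = (cap == 0) ? 64 : cap * 2;
            char  **tmp = realloc(snap->names, newcap * sizeof(char *));

            if (tmp == NULL)
            {
                saved_errno = ENOMEM;
                break;
            }
            snap->names = tmp;
            cap = newcap;
        }

        /* Copy the name: the dirent buffer may be reused by the next call. */
        len = strlen(de->d_name) + 1;
        copy = malloc(len);
        if (copy == NULL)
        {
            saved_errno = ENOMEM;
            break;
        }
        memcpy(copy, de->d_name, len);
        snap->names[snap->count++] = copy;
    }

    closedir(dir);

    if (saved_errno != 0)
    {
        free_snapshot(snap);
        errno = saved_errno;
        return -1;
    }
    return 0;
}
```

A caller would run snapshot_dir() on each directory, then walk names[] to open and copy the files, treating ENOENT on open as "removed after the scan" rather than as an error.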

Thanks,

Stephen
