Re: Add system identifier to backup manifest - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Add system identifier to backup manifest
Date
Msg-id CA+Tgmoa8okr1YCFcCOGgvOTASiXsJgx-oqrCS41Wir=oC8c-mA@mail.gmail.com
Whole thread Raw
In response to Re: Add system identifier to backup manifest  (Amul Sul <sulamul@gmail.com>)
Responses Re: Add system identifier to backup manifest
List pgsql-hackers
On Thu, Feb 1, 2024 at 2:18 AM Amul Sul <sulamul@gmail.com> wrote:
> I intended to minimize the out param of parse_manifest_file(), which currently
> returns manifest_files_hash and manifest_wal_range, and I need two more --
> manifest versions and the system identifier.

Sure, but you could do context.ht = manifest_data->files instead of
context.manifest = manifest_data. The question isn't whether you
should return the whole manifest_data from parse_manifest_file -- I
agree with that decision -- but rather whether you should feed the
whole thing through into the context, or just the file hash.

> Yeah, we can do that, but I think it is a bit inefficient to have strcmp()
> check for the pg_control file on each verify_backup_file() call, despite, we
> know that path.  Also, I think, we need additional handling to ensure that the
> system identifier has been verified in verify_backup_file(), what if the
> pg_control file itself missing from the backup -- might be a rare case, but
> possible.
>
> For now, we can do the system identifier validation after
> verify_backup_directory().

Yes, that's another option, but I don't think it's as good.

Suppose you do it that way. Then what will happen when the file is
altogether missing or inaccessible? I think verify_backup_file() will
complain, and then you'll have to do something ugly to avoid having
verify_system_identifier() emit the same complaint all over again.
Remember, unless you find some complicated way of passing data around,
it won't know whether verify_backup_file() emitted a warning or not --
it can redo the stat() and see what happens, but it's not absolutely
guaranteed to be the same as what happened before. Making sure that
you always emit any given complaint once rather than twice or zero
times is going to be tricky.

It seems more natural to me to just accept the (small) cost of a
strcmp() in verify_backup_file(). If the initial stat() fails, it
emits whatever complaint is appropriate and returns and the logic to
check the system identifier is never reached. If it succeeds, you can
proceed to try to open the file and do what you need to do.

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Where can I find the doxyfile?
Next
From: vignesh C
Date:
Subject: Re: pgbench - adding pl/pgsql versions of tests