On Thu, Jan 25, 2024 at 2:52 AM Amul Sul <sulamul@gmail.com> wrote:
> Thank you for the review-comments, updated version attached.
I generally agree with 0001. I spent a long time thinking about your
decision to make verifier_context contain a pointer to manifest_data
instead of, as it does currently, a pointer to manifest_files_hash. I
don't think that's a horrible idea, but it also doesn't seem to be
used anywhere currently. One advantage of the current approach is that
we know that none of the code downstream of verify_backup_directory()
or verify_backup_checksums() actually cares about anything other than
the manifest_files_hash. That's kind of nice. If we didn't change this
as you have done here, then we would need to continue passing the WAL
ranges to parse_required_walI() and the system identifier would have
to be passed explicitly to the code that checks the system identifier,
but that's not such a bad thing, either. It makes it clear which
functions are using which information.
But before you go change anything there, exactly when should 0002 be
checking the system identifier in the control file? What happens now
is that we first walk over the directory tree and make sure we have
the files (verify_backup_directory) and then go through and verify
checksums in a second pass (verify_backup_checksums). We do this
because it lets us report problems that can be detected cheaply --
like missing files -- relatively quickly, and problems that are more
expensive to detect -- like mismatching checksums -- only after we've
reported all the cheap-to-detect problems. At what stage should we
verify the control file? I don't really like verifying it first, as
you've done, because I think the error message change in
004_options.pl is a clear regression. When the whole directory is
missing, it's much more pleasant to complain about the directory being
missing than some file inside the directory being missing.
What I'd be inclined to suggest is that you have verify_backup_file()
notice when the file it's being asked to verify is the control file,
and have it check the system identifier at that stage. I think if you
do that, then the error message change in 004_options.pl goes away.
Now, to do that, you'd need to have the whole manifest_data available
from the context, not just the manifest_files_hash, so that you can
see the expected system identifier. And, interestingly, if you take
this approach, then it appears to me that 0001 is correct as-is and
doesn't need any changes.
Thoughts?
--
Robert Haas
EDB: http://www.enterprisedb.com