Re: pgsql: Add TAP tests for pg_verify_checksums - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: pgsql: Add TAP tests for pg_verify_checksums
Date
Msg-id 20181020033955.GA2553@paquier.xyz
Whole thread Raw
In response to Re: pgsql: Add TAP tests for pg_verify_checksums  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pgsql: Add TAP tests for pg_verify_checksums  (Andres Freund <andres@anarazel.de>)
Re: pgsql: Add TAP tests for pg_verify_checksums  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On Fri, Oct 19, 2018 at 06:43:18PM -0400, Tom Lane wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>> I don't think just reverting it is really acceptable.
>
> +several.  I do not mind somebody writing and installing a better fix.
> I do object to turning the buildfarm red again.

I did not expect this thread to turn into a war zone.  Anyway, there are
a couple of things I agree with on this thread:
- I agree with Andres point here:
https://postgr.es/m/20181019171747.4uithw2sjkt6msne@alap3.anarazel.de
A blacklist is fundamentally more difficult to maintain as there are
way more things added in a data folder which do not have data checksums
than things which have checksums.  So using a blacklist approach looks
unmaintainable in the long term.  Future patches related to enabling
online checksum verification make me worry if we keep the code like
that.  I can also easily imagine that anybody willing to use the
pluggable storage API would like to put new files in tablespace-related
data folders, relying on "base/" being the default system tablespace
- I agree with Stephen's point that we should decide if a file has
checksums or not in a single place, and that we should use the same
logic for base backups and pg_verify_checksums.
- I agree with not doing a simple revert to not turn the buildfarm red
again.  This is annoying for animal maintainers.  Andrew has done a very
nice work in disabling manually those tests temporarily.
- The base backup logic deciding if a file has checksums looks broken to
me: it misses files generated by EXEC_BACKEND, and any instance of
Postgres using an extension with custom files and data checksums has its
backups broken.  cstore_fdw has been mentioned above, and I recall that
Heroku for example enables data checksums.  If you combine both, it
basically means that such an instance cannot take base backups anymore
while it was able to do so with pre-10 with default options.  That's not
cool.

So what I think we ought to do is the following:
- Start a new thread, this one about TAP tests is not adapted.
- Add in src/common/relpath.c the API from d55241af called
isRelFileName(), make use of it in the base backup code, and basically
remove is_checksummed_file() and the checksum skip list.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: vacuum fails with "could not open statistics file" "Device orresource busy"
Next
From: Andres Freund
Date:
Subject: Re: pgsql: Add TAP tests for pg_verify_checksums