Re: pgsql: Add TAP tests for pg_verify_checksums - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: pgsql: Add TAP tests for pg_verify_checksums
Date
Msg-id 20181019213257.GS4184@tamriel.snowman.net
Whole thread Raw
In response to Re: pgsql: Add TAP tests for pg_verify_checksums  (Andres Freund <andres@anarazel.de>)
Responses Re: pgsql: Add TAP tests for pg_verify_checksums  (Andres Freund <andres@anarazel.de>)
Re: pgsql: Add TAP tests for pg_verify_checksums  (Andrew Dunstan <andrew.dunstan@2ndquadrant.com>)
List pgsql-hackers
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2018-10-19 16:35:46 -0400, Stephen Frost wrote:
> > The only justification for *not* doing this is that some extension
> > author might have dumped files into our tablespace directory, something
> > we've never claimed to support nor generally worried about in all the
> > time that I can recall before this.
>
> No, it's really not the only reason. As I said, testing is much less
> likely to find cases where we're checksumming a short-lived file even
> though we shouldn't, than making sure that checksummed files are
> actually checksummed.

While I agree that we might possibly end up trying to checksum a
short-lived and poorly-named file, I'd rather that than risk missing the
checking of a file which does have checksums that should have been
checked and claiming that we checked all of the checksums.

> > > > What about two different extensions wanting to create files with the
> > > > same names in the tablespace directories..?
> > > >
> > > > Experimentation is fine, of course, this is open source code and people
> > > > should feel free to play with it, but we are not obligated to avoid
> > > > breaking things when an extension author, through their experimentation,
> > > > does something which is clearly not supported, like dropping files into
> > > > PG's tablespace directories.  Further, when it comes to our user's data,
> > > > we should be taking a strict approach and accounting for everything,
> > > > something that this whitelist-of-patterns-based approach to finding
> > > > files to verify the checksums on doesn't do.
> > >
> > > It's not economically feasible to work on extensions that will only be
> > > usable a year down the road.
> >
> > I listed out multiple other solutions to this problem which you
> > summarily ignored.
>
> Except that none of them really achieves the goals you can achieve by
> having the files in the database (like DROP DATABASE working, for
> starters).

The only reason things like DROP DATABASE "work" in this example case is
exactly that it assumes that all of the files which exist are ones which
we put there.  Or, to put it another way, if we're only going to
checksum files based on some whitelist of files we expect to be there,
shouldn't we go back and make things like DROP DATABASE only remove
those files that we expect to be there and not random other files that
we have no knowledge of..?

> > It's unfortunate that those other solutions weren't used and that,
> > instead, this extension decided to drop files into PG's tablespace
> > directories, but that doesn't mean we should condone or implicitly
> > support that action.
>
> This just seems pointlessly rigid. Our extension related APIs aren't
> generally very good or well documented. Most non-trivial extensions that
> add interesting things to the postgres ecosystem are going to need a few
> not so pretty things to get going.

PostGIS is a fantastic example of an extension that is far from trivial,
extends PG in ways which are clearly supported, and has worked with PG
to improve things in core to be able to then use in the extension.

> > I stand by my position that this patch should be reverted (with no
> > offense or ill-will towards Michael, of course, I certainly appreciate
> > his efforts to address the issues with pg_verify_checksums) and that we
> > should move more of this code to identify files to verify the checksums
> > on into libpgcommon and then use it from both pg_basebackup and
> > pg_verify_checksums, to the extent possible, but that we make sure to
> > account for all of the files in our tablespace and database directories.
>
> What does that do, except break things that currently work? Sure, work
> on a patch that fixes the architectural concerns, but what's the point
> in reverting it until that's done?

As you pointed out previously, the current code *doesn't* work, before
or after this change, and we clearly need to rework this to move things
into libpgcommon and also fix pg_basebackup.  Reverting this would at
least get us back to having similar code between this and pg_basebackup,
and then it'll be cleaner and clearer to have one patch which moves that
similar logic into libpgcommon and fixes the missing exceptions for the
EXEC_BACKEND case.

Keeping the patch doesn't do anything for the pg_basebackup case, and
confuses the issue by having these filename-pattern-whitelists which
weren't there before and that should be removed, imv.

Thanks!

Stephen

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: pgsql: Add TAP tests for pg_verify_checksums
Next
From: Andres Freund
Date:
Subject: Re: pgsql: Add TAP tests for pg_verify_checksums