Re: btreecheck extension - Mailing list pgsql-hackers

From Robert Haas
Subject Re: btreecheck extension
Date
Msg-id CA+TgmoYsGfLCTTWx7+LA825fad_M69t1eigvkq-vgkEzA1piCw@mail.gmail.com
In response to Re: btreecheck extension  (Peter Geoghegan <pg@heroku.com>)
Responses Re: btreecheck extension  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On Tue, Jun 17, 2014 at 5:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Tue, Jun 17, 2014 at 1:16 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I don't feel qualified to comment on any of the substantive issues you
>> raise, so instead I'd like to bikeshed the name.  I suggest that we
>> create one extension to be a repository for index-checking machinery
>> (and perhaps also heap-checking machinery) and view this as the first
>> of possibly several checkers to live there.  Otherwise, we may
>> eventually end up with separate extensions for btreecheck, hashcheck,
>> gistcheck, gincheck, spgistcheck, minmaxcheck, vodkacheck, heapcheck,
>> toastcheck, etc. which seems like excessive namespace pollution.
>
> I agree.
>
> I hope that we'll eventually be able to move code like this into each
> AM, with something like an amverifyintegrity pg_am entry optionally
> provided. There'd also be a utility statement that would perform this
> kind of verification. It seems natural to do this, as the patch I've
> posted arguably adds a big modularity violation. Besides, it seems
> worthwhile to pepper the regular regression tests with calls like
> these, at least in some places, and putting something in core is the
> most convenient way to do that.

I think there's something to be said for that, but I think at the
moment I like the idea of a functional interface better.  The reason
is that I'm not sure we can predict all of the checks we're going to
want to add.  For example, maybe somebody will come up with another
btree checker that's different from your btree checker, and maybe
there will be good reasons not to merge the two - e.g. different
locking requirements, or different runtimes, or whatever.  Even more
likely, I think there will be things we want to do that fall under the
broad umbrella of integrity checking that we just can't predict now:
scan the table and check whether all the keys are present in every
index, scan the TOAST table and make sure that all the chunks are the
right size, etc.  If we have a bunch of functions for this sort of
thing, it's easy and relatively uncontroversial to add more.  If we
put it in core and give it real grammar support, then we've got to
fight with keywords and contemplate grammar bloat and, basically, I
think every change will be likely to get a higher level of scrutiny.
I'd rather not go there.
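
To make the "easy to add more" point concrete, here is a minimal sketch of a functional interface. It is written in Python rather than SQL or C purely for illustration. Every name in it (CHECKS, register_check, btree_order, toast_chunk_size, run_checks) is hypothetical and none of it is taken from the posted patch. The checker bodies are placeholders. The only thing it is meant to show is that a new check is one more registered function, with no shared grammar to touch.

```python
from typing import Callable, Dict, List

# Hypothetical registry: check name -> function that takes a relation name
# and returns a list of problem descriptions (empty list = nothing found).
CheckFn = Callable[[str], List[str]]
CHECKS: Dict[str, CheckFn] = {}


def register_check(name: str) -> Callable[[CheckFn], CheckFn]:
    """Register a checker under a name; nothing central has to change."""
    def decorator(fn: CheckFn) -> CheckFn:
        CHECKS[name] = fn
        return fn
    return decorator


@register_check("btree_order")
def btree_order(relation: str) -> List[str]:
    # Placeholder: a real checker would inspect index pages here.
    return []


@register_check("toast_chunk_size")
def toast_chunk_size(relation: str) -> List[str]:
    # Placeholder that always reports one made-up problem, for demonstration.
    return [f"{relation}: chunk 3 has unexpected size"]


def run_checks(relation: str, names: List[str]) -> Dict[str, List[str]]:
    """Run the named checks against one relation and collect their reports."""
    return {name: CHECKS[name](relation) for name in names}
```

Under these assumptions, a second btree checker with different locking requirements would simply be registered under another name alongside the first, and callers would pick whichever one they want by name.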

Now, we could.  We could come up with an extensible syntax, like this:

CHECK relation [ USING { checktype [ '(' arg [, ...] ')' ] } [, ...] ];

But frankly I'm kind of uncompelled by that.  This isn't a feature
that seems to me to really need to be in core.  It doesn't
particularly need grammar support, and it doesn't need WAL logging,
and not everyone needs it at all, and especially if we eventually end
up with a robust suite of tools in this area, not everyone may even
want it, if it means a bigger installed footprint or more security
concerns to worry about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company