Re: ALTER TYPE 1: recheck index-based constraints - Mailing list pgsql-hackers

From Noah Misch
Subject Re: ALTER TYPE 1: recheck index-based constraints
Date
Msg-id 20110120055753.GA13329@tornado.leadboat.com
Whole thread Raw
In response to Re: ALTER TYPE 1: recheck index-based constraints  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: ALTER TYPE 1: recheck index-based constraints  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Jan 19, 2011 at 11:50:12PM -0500, Robert Haas wrote:
> On Wed, Jan 12, 2011 at 8:56 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Wed, Jan 12, 2011 at 8:14 AM, Noah Misch <noah@leadboat.com> wrote:
> >> Something like the attached?
> >
> > I haven't really analyzed in this detail, but yes, approximately
> > something sorta like that.
> 
> [assessment of current uses] So I think the logic is correct and not overly
> complex.

Sounds correct to me.

> I think what might make sense is - instead of adding another Boolean
> argument, change the heap_rebuilt argument to int flags, and define
> REINDEX_CHECK_CONSTRAINTS and REINDEX_SUPPRESS_INDEX_USE as OR-able
> flags.  I think that's more clear about what's actually going on than
> heap_rebuit/tuples_changed.

There are two distinct questions here:

(1) Should reindex_relation receive boolean facts from its callers by way of two
boolean arguments, or by way of one flags vector?

The former seems best when you want every caller to definitely think about which
answer is right, and the latter when that's not so necessary.  (For a very long
list of options, the flags might be chosen on other grounds.)  As framed, I'd
lean toward keeping distinct arguments, as these are important questions.

However, suppose we inverted both flags, say REINDEX_SKIP_CONSTRAINT_CHECKS and
REINDEX_ALLOW_OLD_INDEX_USE.  Then, flags = 0 can hurt performance but not
correctness.  That's looking like a win.


(2) Should reindex_relation frame its boolean arguments in terms of what the
caller did (heap_rebuilt, tuples_changed), or in terms of what reindex_relation
will be doing (check_constraints, suppress_index_use)?

The former should be the default approach, but it requires that we be able to
frame good names that effectively convey an abstraction.  Prospective callers
must know what to send just by looking at the name and reading the header
comment.  When no prospective name is that expressive and you'll end up reading
the reindex_relation code to see how they play out, then it's better to go with
the latter instead.  A bad abstraction is worse than none at all.

I agree that both heap_rebuilt and tuples_changed are bad abstractions.
TRUNCATE is lying about the former, and the latter, as you say, is never really
correct.  column_values_changed, perhaps.  tuples_could_now_violate_constraints
would be correct, but that's just a bad spelling for REINDEX_CHECK_CONSTRAINTS.
I guess the equivalent long-winded improvement for heap_rebuilt would be
indexes_still_valid_for_snapshotnow.  Eh.

So yes, let's adopt callee-action-focused names like you propose.


Overall, I'd vote for a flags parameter with negative senses like I noted above.
My second preference would be for two boolean parameters, check_constraints and
suppress_index_use.  Not really a big deal to me, though.  (I feel a bit silly
writing this email.)  What do you think?

Thanks,
nm


pgsql-hackers by date:

Previous
From: KaiGai Kohei
Date:
Subject: Re: sepgsql contrib module
Next
From: "Simone Aiken"
Date:
Subject: Re: ToDo List Item - System Table Index Clustering