Re: Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE
Date
Msg-id 20110628194039.GB10430@tornado.leadboat.com
Whole thread Raw
In response to Re: Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE
List pgsql-hackers
On Tue, Jun 28, 2011 at 02:11:11PM -0400, Robert Haas wrote:
> On Mon, Jun 27, 2011 at 10:43 PM, Noah Misch <noah@leadboat.com> wrote:
> > On Mon, Jun 27, 2011 at 03:45:43PM -0400, Robert Haas wrote:
> >> On Wed, Jun 15, 2011 at 1:03 AM, Noah Misch <noah@leadboat.com> wrote:
> >> > [patch to avoid index rebuilds]
> >>
> >> With respect to the documentation hunks, it seems to me that the first
> >> hunk might be made clearer by leaving the paragraph of which it is a
> >> part as-is, and adding another paragraph afterwards beginning with the
> >> words "In addition".
> >
> > The added restriction elaborates on the transitivity requirement, so I wanted to
> > keep the new language adjacent to that.
> 
> That makes sense, but it reads a bit awkwardly to me.  Maybe it's just
> that the sentence itself is so complicated that I have difficulty
> understanding it.  I guess it's the same problem as with the text you
> added about hash indexes: without thinking about it, it's hard to
> understand what territory is covered by the new sentence that is not
> covered by what's already there.  In the case of the hash indexes, I
> think I have it figured out: we already know that we must get
> compatible hash values out of logically equal values of different
> datatypes.  But it's possible that the inter-type cast changes the
> value in some arbitrary way and then compensates for it by defining
> the hash function in such a way as to compensate.  Similarly, for
> btrees, we need the relative ordering of A and B to remain the same
> after casting within the opfamily, not to be rearranged somehow.
> Maybe the text you've got is OK to explain this, but I wonder if
> there's a way to do it more simply.

That's basically right, I think.  Presently, there is no connection between
casts and operator family notions of equality.  For example, a cast can change
the hash value.  In general, that's not wrong.  However, I wish to forbid it
when some hash operator family covers both the source and destination types of
the cast.  Note that, I don't especially care whether the stored bits changed or
not.  I just want casts to preserve equality when an operator family defines
equality across the types involved in the cast.  The specific way of
articulating that is probably vulnerable to improvement.

> > It would be valuable to avoid introducing a second chunk of code that knows
> > everything about the catalog entries behind an index. ?That's what led me to the
> > put forward the most recent version as best. ?What do you find vile about that
> > approach? ?I wasn't comfortable with it at first, because I suspected the checks
> > in RelationPreserveStorage() might be important for correctness. ?Having studied
> > it some more, though, I think they just reflect the narrower needs of its
> > current sole user.
> 
> Maybe vile is a strong word, but it seems like a modularity violation:
> we're basically letting DefineIndex() do some stuff we don't really
> want done, and then backing it out parts of it that we don't really
> want done after all.  It seems like it would be better to provide
> DefineIndex with enough information not to do the wrong thing in the
> first place.  Could we maybe pass stmt->oldNode to DefineIndex(), and
> let it work out the details?

True.  I initially shied away from that, because we assume somewhat deep into
the stack that the new relation will have pg_class.oid = pg_class.relfilenode.
Here's the call stack in question:
RelationBuildLocalRelationheap_createindex_createDefineIndexATExecAddIndex

Looking at it again, it wouldn't bring the end of the world to add a relfilenode
argument to each.  None of those have more than four callers.  ATExecAddIndex()
would then call RelationPreserveStorage() before calling DefineIndex(), which
would in turn put things in a correct state from the start.  Does that seem
appropriate?  Offhand, I do like it better than what I had.

Thanks,
nm



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: marking old branches as no longer maintained
Next
From: Dave Page
Date:
Subject: Re: marking old branches as no longer maintained