Home > mailing lists

Re: HOT patch - version 14 - Mailing list pgsql-patches

From	Tom Lane
Subject	Re: HOT patch - version 14
Date	August 30, 2007 15:16:51
Msg-id	1314.1188486999@sss.pgh.pa.us Whole thread Raw
In response to	Re: HOT patch - version 14 ("Pavan Deolasee" <pavan.deolasee@gmail.com>)
Responses	Re: HOT patch - version 14
List	pgsql-patches

Tree view

"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> You are right - a new index might mean that an existing HOT chain
> is broken as far as the new index is concerned. The way we address
> that is by indexing the root tuple of the chain, but the index key is
> extracted from the last tuple in the chain. The index is marked "unusable"
> for all those existing transactions which can potentially see any
> intermediate tuples in the chain.

I don't think that works --- what if the last tuple in the chain isn't
committed good yet?  If its inserter ultimately rolls back, you've
indexed the wrong value.

> Please see this document written by Greg Stark. He has nicely summarized
> how CREATE INDEX and CREATE INDEX CONCURRENTLY works with HOT.
> http://archives.postgresql.org/pgsql-patches/2007-07/msg00360.php

Isn't the extra machination for C.I.C. just useless complication?
What's the point of avoiding creation of new broken HOT chains when
you still have to deal with existing ones?

>> I also don't think I
>> believe the reasoning for not indexing DELETE_IN_PROGRESS hot-updated
>> tuples: what if the index completion commits, but the concurrent delete
>> rolls back?  Then you've got a valid tuple that's not in the index.

> Since CREATE INDEX works with ShareLock on the relation, we can
> safely assume that we can't find any DELETE_IN_PROGRESS tuples except
> those deleted by our own transaction. The only exception is system relation,
> but we don't do HOT updates on system relation.

That chain of reasoning is way too shaky for me.  Essentially what
you're saying is you'll promise not to corrupt non-system indexes.
Nor am I really thrilled about having to disable HOT for system
catalogs.

> The only reason to redefine MaxHeapTuplesPerPage to higher side is
> because HOT allows us to accommodate more tuples per page because
> of redirect-dead line pointers.

No, it doesn't allow more tuples per page.  It might mean there can be
more line pointers than that on a page, but not more actual tuples.
The question is whether there is any real use in allowing more line
pointers than that --- the limit is already unrealistically high,
since it assumes no data content in any of the tuples.  If there is a
rationale for it then you should invent a different constant
MaxLinePointersPerPage or some such, but I really think there's no
point.

> Doubling the value is chosen as a balance between heap page
> utilization, line pointer bloating and overhead for bitmap scans.

I'd say it allows a ridiculous amount of line pointer bloat.

>> Even if it's safe, ISTM what you're mostly accomplishing there is to
>> expend a lot of cycles while holding exclusive lock on the page, when
>> there is good reason to think that you're blocking other people who are
>> interested in using the page.  Eliminating the separation between that
>> and cleanup would also allow eliminating the separate "PD_FRAGMENTED"
>> page state.

> The reason we did it that way because repairing fragmentation seems
> much more costly that pruning. Please note that we prune a single
> chain during index fetch. Its only for heap-scans (and VACUUM) that
> we try to prune all chains in the page. So unless we are doing heap-scan,
> I am not sure if we are spending too much time holding the exclusive
> lock. I agree we don't have any specific numbers to prove that though.

If you don't have numbers proving that this extra complication has a
benefit, I'd vote for simplifying it.  The SnapshotAny case is going to
bite other people besides you in future.

> Another reasoning behind  separating these two steps is:  pruning
> requires exclusive lock whereas repairing fragmentation requires
> cleanup lock.

This is nonsense.  Those are the same lock.  If you have the former and
not the latter, it just means that you *know* there is contention for
the page.  It seems to me that performing optional maintenance work in
such a situation is completely wrong.

            regards, tom lane

pgsql-patches by date:

From: Gregory Stark
Date: 30 August 2007, 10:20:25
Subject: Re: HOT patch - version 14

From: "Gavin M. Roy"
Date: 30 August 2007, 15:39:40
Subject: Re: pg_dump --no-tablespaces patch

Re: HOT patch - version 14 - Mailing list pgsql-patches

Previous

Next