Re: HOT patch - version 14 - Mailing list pgsql-patches
From | Tom Lane |
---|---|
Subject | Re: HOT patch - version 14 |
Date | |
Msg-id | 1314.1188486999@sss.pgh.pa.us Whole thread Raw |
In response to | Re: HOT patch - version 14 ("Pavan Deolasee" <pavan.deolasee@gmail.com>) |
Responses |
Re: HOT patch - version 14
|
List | pgsql-patches |
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes: > You are right - a new index might mean that an existing HOT chain > is broken as far as the new index is concerned. The way we address > that is by indexing the root tuple of the chain, but the index key is > extracted from the last tuple in the chain. The index is marked "unusable" > for all those existing transactions which can potentially see any > intermediate tuples in the chain. I don't think that works --- what if the last tuple in the chain isn't committed good yet? If its inserter ultimately rolls back, you've indexed the wrong value. > Please see this document written by Greg Stark. He has nicely summarized > how CREATE INDEX and CREATE INDEX CONCURRENTLY works with HOT. > http://archives.postgresql.org/pgsql-patches/2007-07/msg00360.php Isn't the extra machination for C.I.C. just useless complication? What's the point of avoiding creation of new broken HOT chains when you still have to deal with existing ones? >> I also don't think I >> believe the reasoning for not indexing DELETE_IN_PROGRESS hot-updated >> tuples: what if the index completion commits, but the concurrent delete >> rolls back? Then you've got a valid tuple that's not in the index. > Since CREATE INDEX works with ShareLock on the relation, we can > safely assume that we can't find any DELETE_IN_PROGRESS tuples except > those deleted by our own transaction. The only exception is system relation, > but we don't do HOT updates on system relation. That chain of reasoning is way too shaky for me. Essentially what you're saying is you'll promise not to corrupt non-system indexes. Nor am I really thrilled about having to disable HOT for system catalogs. > The only reason to redefine MaxHeapTuplesPerPage to higher side is > because HOT allows us to accommodate more tuples per page because > of redirect-dead line pointers. No, it doesn't allow more tuples per page. It might mean there can be more line pointers than that on a page, but not more actual tuples. The question is whether there is any real use in allowing more line pointers than that --- the limit is already unrealistically high, since it assumes no data content in any of the tuples. If there is a rationale for it then you should invent a different constant MaxLinePointersPerPage or some such, but I really think there's no point. > Doubling the value is chosen as a balance between heap page > utilization, line pointer bloating and overhead for bitmap scans. I'd say it allows a ridiculous amount of line pointer bloat. >> Even if it's safe, ISTM what you're mostly accomplishing there is to >> expend a lot of cycles while holding exclusive lock on the page, when >> there is good reason to think that you're blocking other people who are >> interested in using the page. Eliminating the separation between that >> and cleanup would also allow eliminating the separate "PD_FRAGMENTED" >> page state. > The reason we did it that way because repairing fragmentation seems > much more costly that pruning. Please note that we prune a single > chain during index fetch. Its only for heap-scans (and VACUUM) that > we try to prune all chains in the page. So unless we are doing heap-scan, > I am not sure if we are spending too much time holding the exclusive > lock. I agree we don't have any specific numbers to prove that though. If you don't have numbers proving that this extra complication has a benefit, I'd vote for simplifying it. The SnapshotAny case is going to bite other people besides you in future. > Another reasoning behind separating these two steps is: pruning > requires exclusive lock whereas repairing fragmentation requires > cleanup lock. This is nonsense. Those are the same lock. If you have the former and not the latter, it just means that you *know* there is contention for the page. It seems to me that performing optional maintenance work in such a situation is completely wrong. regards, tom lane
pgsql-patches by date: