Re: Heap WARM Tuples - Design Draft - Mailing list pgsql-hackers
From | Claudio Freire |
---|---|
Subject | Re: Heap WARM Tuples - Design Draft |
Date | |
Msg-id | CAGTBQpYpu463mQByEp5MLqcdqzOHN1qOBCDN6BRb+ehMF+b=UA@mail.gmail.com Whole thread Raw |
In response to | Heap WARM Tuples - Design Draft (Pavan Deolasee <pavan.deolasee@gmail.com>) |
Responses |
Re: Heap WARM Tuples - Design Draft
|
List | pgsql-hackers |
On Thu, Aug 4, 2016 at 7:59 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote: > We don’t want to force a recheck for every index fetch because that will > slow everything down. So we would like a simple and efficient way of knowing > about the existence of a WARM tuple in a chain and do a recheck in only > those cases, ideally O(1). Having a HOT chain contain a WARM tuple is > discussed below as being a “WARM chain”, implying it needs re-check. > > We could solve this by either 1) marking both old and new index tuples as > "recheck-required" or 2) marking the HOT chain on the heap as > "recheck-required" such that any tuple returned from the chain requires a > recheck. > > The advantage of doing 1) is that the recheck is required only for the > compromised index. > Note that in presence of multiple index keys pointing to the same root line > pointer, the chain needs re-check for both the old as well as the new index > key. The old index key must not fetch the new tuple(s) and the new index key > must not fetch the old tuple(s). So irrespective of whether an old or a new > tuple is being returned, the caller must know about the WARM tuples in the > chains. So marking the index would require us to mark both old and new index > tuples as requiring re-check. That requires an additional index scan to > locate the old row and then an additional write to force it to re-check, > which is algorithmically O(NlogN) on table size. > > Marking the heap, (2) is O(1) per updated index, so seems better. But since > we don't know with respect to which index or indexes the chain is broken, > all index scans must do a recheck for a tuple returned from a WARM chain. > > How do we mark a chain as WARM? How do we clear the flag? There are a few > possible approaches: > > 1. Mark the first WARM tuple which causes HOT chain to become WARM with a > special HEAP_RECHECK_REQUIRED flag (this flag must then be carried over to > any further additions to the chain. Then whenever a tuple if returned from a > chain, we must look forward into the chain for WARM tuple and let the caller > know about potential breakage. This seems simple but would require us to > follow HOT chain every time to check if it’s HOT or WARM. > > 2. Mark the root line pointer (or the root tuple) with a special > HEAP_RECHECK_REQUIRED flag to tell us about the presence of a WARM tuple in > the chain. Since all indexes point to the root line pointer, it should be > enough to just mark the root line pointer (or tuple) with this flag. > > 3. We could also adopt a hybrid approach and mark the new index tuple and > new heap tuple with HEAP_RECHECK_REQUIRED flag and presence of either of the > flag in either tuple forces recheck. I've actually been thinking of another possibility in the context of LITE, and what I was pondering looks eerily similar to WARM, so let me summarize (I was going to write a lengthier POC but I thought I better comment now since you're introducing the same concept): Marks are troublesome becuase, as you mention, you would need one "WARM" bit per index to be efficient. We also don't want a recheck on every index access. HOT is very close to WARM already, even the code. The only problem with HOT is that it stops scanning the update chain when a tuple is not HOT-updated. If heap_hot_search_buffer knew to keep on following the update chain when there is a WARM update on the index in question, this could work. But it won't work if it's implemented naively, because if there was another item pointer in another page of the index pointing to a heap tuple that belongs to an already-visited chain, which is possible in the below case, then the scan would produce duplicate rows: N: Row versions: ->: update * : causes a new index tuple on index I 1 -> 2 -> 3 -> 4 -> 5 * * * a b a <- key If the scan qual is "key in ('a','b')", then the index scan will visit both versions of the row, 1, 3 and 4. Suppose only 5 is visible, and suppose the whole chain lies in the same page. There is one update chain, 1..5, but three WARM chains: 1..2, 3, 4..5 When visiting item pointers, 1 will be found, the chain will be followed and eventually version 5 will pass the visibility check and the tuple will pass the scan quals, it will even match the key in the index item, so no key check will fix this. 5 will be yielded when visiting version 1. Next, version 3 will be visited. There's no easy way to know that we already visited 3, so we'll follow the chain all the way to 5, and, again, it will be yielded. Here, the key won't match, so maybe this case could be filtered out with a key check. Finally, version 4 will be visited, the chain will be followed to 5, and as in the first case, it will be yielded. So the scan yielded the same heap tuple thrice, which isn't an ideal situation. It would be totally acceptable for a bitmap index scan, so the expensive logic I will be outlining below may be skipped when building a bitmap (lossy or not). To fix this, what I was pondering, and I believe it would be viable, is to teach heap_hot_search_buffer about the WARM invariant. It will make correctness heavily depend on the invariant being true, which means either flagging item pointers as WARM chains, or requiring a reindex during upgrade to make sure all indexes verify the invariant - as old indices won't. When an item pointer is WARM (either implicit or explicit), heap_hot_search_buffer will only stop following the update chain if it reaches not the end of a HOT chain, but rather a key change. For this, it will need to recheck the key of all but the first TID in the chain, so singleton chains (ie: not-updated or frozen tuples) won't incurr this extra cost. When walking the chain, it will need to form an index tuple for both the old and new versions, and compare them, and follow the chain *only* if the keys are equal. As an optimization, if the tuple is marked HOT-updated, the key check may be skipped. Vacuum will need to be taught not to break the invariant, but I believe this is viable and efficient.
pgsql-hackers by date: