Re: Heap WARM Tuples - Design Draft - Mailing list pgsql-hackers
From | Claudio Freire |
---|---|
Subject | Re: Heap WARM Tuples - Design Draft |
Date | |
Msg-id | CAGTBQpZ25ky5Yr-BgY3vb3mbc_mYTqryh+XagDgoW-wHfYFesg@mail.gmail.com |
In response to | Re: Heap WARM Tuples - Design Draft (Jim Nasby <Jim.Nasby@BlueTreble.com>) |
Responses |
Re: Heap WARM Tuples - Design Draft
Re: Heap WARM Tuples - Design Draft |
List | pgsql-hackers |
On Tue, Aug 9, 2016 at 8:19 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 8/8/16 3:19 PM, Bruce Momjian wrote:
>>> What will help, and something I haven't yet applied any thoughts, is when
>>> we can turn WARM chains back to HOT by removing stale index entries.
>>
>> I can't see how we can ever do that because we have multiple indexes
>> pointing to the chain, and keys that might be duplicated if we switched
>> to HOT. Seems only VACUUM can fix that.
>
> Are these changes still predicated on being able to re-find all index
> entries by key value? If so, that makes incremental vacuums practical,
> perhaps eliminating a lot of these issues.
>
>>>> We can't use the bits LP_REDIRECT lp_len because we need to create
>>>> WARM chains before pruning, and I don't think walking the pre-pruned
>>>> chain is worth it. (As I understand HOT, LP_REDIRECT is only created
>>>> during pruning.)
>>>
>>> That's correct. But lp_len provides us some place to stash information
>>> from heap tuples when they are pruned.
>>
>> Right. However, I see storing information at prune time as only useful
>> if you are willing to scan the chain, and frankly, I have given up on
>> chain scanning (with column comparisons) as being too expensive for
>> its limited value.
>
> What if some of this work happened asynchronously? I'm thinking something
> that runs through shared_buffers in front of bgwriter.

If one can find key-ctid pairs efficiently in the index, this can be
done during WARM pruning (i.e., during updates, when we're already doing
the index lookups anyway).

Suppose you have the following chain:

    index  0    1    2    3    4
    k1     a    a    a    a    a
    k2     a    a    b    a    a
    i1     ^
    i2     ^         ^
    hot         *              *

If versions 0-2 die, pruning can free 1 (it's HOT), and leave
redirects in 0 and 2:

    index  0    1    2    3    4
    k1     a    .    a    a    a
    k2     a    .    b    a    a
    i1     ^
    i2     ^         ^
    lp     r2   u    r3
    hot                        *

Since we can look up all occurrences of k1=a index=0 and k2=a index=0,
and in fact we probably did so already as part of the update logic, we
can remove the first redirect by pointing indexes to 2:

    index  0    1    2    3    4
    k1     .    .    a    a    a
    k2     .    .    b    a    a
    i1               ^
    i2              ^ ^
    lp     u    u    r3
    hot                        *

So WARM pruning would have to happen just prior to adding a WARM tuple
to be able to reuse all the index lookup work to do the index pruning
with the writing.

The indexam interface will get considerably more complex in order to
do this, however. So perhaps it would be ok to do 2 independent
lookups instead? I'm still undecided.

But one way or the other, pruning can happen early and incrementally.
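To make the sequence in the diagrams concrete, here is a toy Python model of it. This is only a sketch, not PostgreSQL code: the names `build_example`, `prune`, `repoint` and `lp_row` are made up for illustration. It assumes that chain order equals line pointer order, that an index entry records the offset of the version that created it (as the carets in the diagrams suggest), and that at least one live version follows the dead ones.

```python
# Toy model of the WARM pruning example; all names here are hypothetical.
NORMAL, UNUSED = ("normal", None), ("unused", None)


def build_example():
    """Five-version chain: i1 has one entry (k1=a), i2 has two (k2=a, k2=b)."""
    page = [NORMAL] * 5  # line pointers 0..4, all holding tuples
    indexes = {"i1": [("a", 0)], "i2": [("a", 0), ("b", 2)]}
    return page, indexes


def prune(page, indexes, dead):
    """Free dead versions, leaving a redirect where an index entry points."""
    referenced = {off for entries in indexes.values() for _, off in entries}
    for off in sorted(dead):
        if off in referenced:
            # Redirect to the next slot that survives pruning: either a
            # live version or another slot some index entry still points at.
            nxt = next(o for o in range(off + 1, len(page))
                       if o in referenced or o not in dead)
            page[off] = ("redirect", nxt)
        else:
            page[off] = UNUSED  # HOT version nobody points at


def repoint(page, indexes, root, keys):
    """Move the (key, root) entries to root's redirect target, then free root.

    `keys` maps index name -> key value, standing in for the key-ctid
    lookups the update would already have done.
    """
    state, target = page[root]
    assert state == "redirect"
    for name, key in keys.items():
        indexes[name] = [(k, target if (k, off) == (key, root) else off)
                         for k, off in indexes[name]]
    # The redirect can only go away once no index entry references it.
    if all(off != root for entries in indexes.values() for _, off in entries):
        page[root] = UNUSED


def lp_row(page):
    """Render the 'lp' row of the diagrams ('-' marks a stored tuple)."""
    return ["r%d" % t if s == "redirect" else "u" if s == "unused" else "-"
            for s, t in page]


page, indexes = build_example()
prune(page, indexes, dead={0, 1, 2})
print(lp_row(page))   # ['r2', 'u', 'r3', '-', '-']
repoint(page, indexes, root=0, keys={"i1": "a", "i2": "a"})
print(lp_row(page))   # ['u', 'u', 'r3', '-', '-']
```

In this model `repoint` finds the entries by scanning a list. The point of the proposal is that a real implementation would get them from the key-ctid lookups already performed during the update.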