Re: Heap WARM Tuples - Design Draft - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: Heap WARM Tuples - Design Draft
Date
Msg-id CABOikdNUop6sA5M2R3TmOwr7_uH7E1hbEwqeF4tvxUEvYKPtTA@mail.gmail.com
In response to Re: Heap WARM Tuples - Design Draft  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Heap WARM Tuples - Design Draft  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers


On Sat, Aug 6, 2016 at 8:34 AM, Bruce Momjian <bruce@momjian.us> wrote:
On Fri, Aug  5, 2016 at 09:40:35PM -0400, Bruce Momjian wrote:
> So to summarize again:
>
> o  chains start out as HOT
> o  they become WARM when some indexes change and others don't
> o  for multiple index changes, we have to check all indexes
>    for key/ctid matches
> o  for single index changes, we can fail HOT and create a new
>    non-HOT/WARM tuple if there are too many index matches
> o  99% of index checks will not find a key/ctid match

I think a WARM chain where the head ctid isn't LP_REDIRECT wasn't
pruned, so here is an updated list:

o  chains start out as HOT
o  they become WARM when some indexes change and others don't
o  if WARM chain head is not LP_REDIRECT, check existing chain for key
   matches
o  if WARM chain head is LP_REDIRECT:
        o  for single index changes, we can fail HOT and create a new
           non-HOT/WARM tuple if there are too many index matches
        o  for multiple index changes, we have to check all indexes
           for key/ctid matches
        o  99% of index checks will not find a key/ctid match

So, we are only checking the index if the WARM chain was pruned, and we
can bail out if there is only one index changed.  This is looking more
doable.
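
As a rough illustration, the updated list can be read as a small decision procedure. The sketch below is in Python only for brevity. The names `Step` and `duplicate_check_step` are hypothetical, and treating "every index changed" as an ordinary non-HOT update is an assumption made for the example; neither comes from a patch.

```python
from enum import Enum


class Step(Enum):
    """Hypothetical labels for the outcomes in the list above."""
    HOT_UPDATE = "no index changed: the chain stays HOT"
    COLD_UPDATE = "every index changed: ordinary non-HOT update (assumption)"
    SCAN_CHAIN = "head is not LP_REDIRECT: check the existing chain for key matches"
    PROBE_ONE_INDEX = ("head is LP_REDIRECT, one index changed: check that index, "
                       "and create a non-HOT/WARM tuple if there are too many matches")
    PROBE_INDEXES = ("head is LP_REDIRECT, several indexes changed: "
                     "check the indexes for key/ctid matches")


def duplicate_check_step(n_changed, n_indexes, head_is_redirect):
    """Pick the duplicate-check strategy for one update.

    n_changed        -- number of indexes whose key columns were modified
    n_indexes        -- total number of indexes on the table
    head_is_redirect -- True if the chain head is LP_REDIRECT (chain was pruned)
    """
    if n_changed == 0:
        return Step.HOT_UPDATE
    if n_changed == n_indexes:
        # Assumption: WARM only applies when some indexes change and others don't.
        return Step.COLD_UPDATE
    if not head_is_redirect:
        # Unpruned chain: the old tuple versions are still there to compare against.
        return Step.SCAN_CHAIN
    if n_changed == 1:
        return Step.PROBE_ONE_INDEX
    return Step.PROBE_INDEXES
```

For example, `duplicate_check_step(1, 3, True)` selects the single-index case, which is the one where the update can bail out cheaply.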

The duplicate-tuple problem that we are focusing on happens when an index already has two or more index tuples pointing to the same root tuple/lp. When it's time to insert a third index tuple, we must not insert a duplicate (key, CTID) tuple. I have a design where we can track which columns (only the columns that indexes use are of interest) were ever changed in the WARM chain. We allow one change for every index column, but the second change will require a duplicate lookup. This is still quite an improvement, since cold updates may already drop by more than 50%, but someone can argue that it does not handle the case where the same index column is repeatedly updated.
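
To make the column-tracking idea concrete, here is a minimal sketch in Python. It is not taken from any patch: the class `WarmChainColumns` and its methods are hypothetical, and a real implementation would presumably keep a compact per-chain bitmap rather than a Python set. It only shows the rule stated above: the first change to an indexed column is free, and a second change to the same column requires a duplicate lookup.

```python
class WarmChainColumns:
    """Hypothetical per-chain record of which indexed columns ever changed."""

    def __init__(self, indexed_columns):
        # Only columns used by some index matter for this check.
        self.indexed_columns = frozenset(indexed_columns)
        self.changed = set()

    def needs_duplicate_lookup(self, updated_columns):
        """True if this update touches an indexed column that already
        changed once earlier in the chain."""
        touched = self.indexed_columns & set(updated_columns)
        return bool(touched & self.changed)

    def record_update(self, updated_columns):
        """Remember the indexed columns modified by an applied update."""
        self.changed |= self.indexed_columns & set(updated_columns)


chain = WarmChainColumns({"a", "b"})

first = chain.needs_duplicate_lookup({"a"})   # first change to "a"
chain.record_update({"a"})

second = chain.needs_duplicate_lookup({"b"})  # first change to "b"
chain.record_update({"b"})

third = chain.needs_duplicate_lookup({"a"})   # "a" changes a second time
```

In this run `first` and `second` are `False` and `third` is `True`, which is exactly the repeated-update case the objection is about.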

If we can find an efficient way to convert WARM chains back to HOT, which will happen soon after the old index tuple retires, the system can attain a stable state, not for all but for many use cases.

Thanks,
Pavan

-- 
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
