Re: Heap WARM Tuples - Design Draft - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: Heap WARM Tuples - Design Draft
Date
Msg-id CABOikdOyDsWrOa6rrtETYNrhOZg_EW6sUe6NK-3Ww-PkReDy0g@mail.gmail.com
In response to Re: Heap WARM Tuples - Design Draft  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Heap WARM Tuples - Design Draft  (Claudio Freire <klaussfreire@gmail.com>)
Re: Heap WARM Tuples - Design Draft  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers


On Mon, Aug 8, 2016 at 11:08 PM, Bruce Momjian <bruce@momjian.us> wrote:
On Sun, Aug  7, 2016 at 12:55:01PM -0400, Bruce Momjian wrote:
> On Sun, Aug  7, 2016 at 10:49:45AM -0400, Bruce Momjian wrote:
> > OK, crazy idea time --- what if we only do WARM chain additions when all
> > indexed values are increasing (with NULLs higher than all values)?  (If
> > a key is always-increasing, it can't match a previous value in the
> > chain.)  That avoids the problem of having to check the WARM chain,
> > except for the previous tuple, and the problem of pruning removing
> > changed rows.  It avoids having to check the index for matching key/ctid
> > values, and it prevents CREATE INDEX from having to index WARM chain
> > values.
> >
> > Any decreasing value would cause a normal tuple to be created.
>
> Actually, when we add the first WARM tuple, we can mark the HOT/WARM
> chain as either all-incrementing or all-decrementing.  We would need a
> bit to indicate that.

FYI, I see at least two available tuple header bits here, 0x0800 and
0x1000:

        /*
         * information stored in t_infomask2:
         */
        #define HEAP_NATTS_MASK         0x07FF  /* 11 bits for number of attributes */
        /* bits 0x1800 are available */
        #define HEAP_KEYS_UPDATED       0x2000  /* tuple was updated and key cols
                                                 * modified, or tuple deleted */
        #define HEAP_HOT_UPDATED        0x4000  /* tuple was HOT-updated */
        #define HEAP_ONLY_TUPLE         0x8000  /* this is heap-only tuple */

        #define HEAP2_XACT_MASK         0xE000  /* visibility-related bits */

What I am currently trying to do is reuse at least the BlockNumber field in t_ctid. For HOT/WARM chains, that field is effectively unused (except in the last tuple, when a regular update needs to store the block number of the new block). My idea is to use one free bit in t_infomask2 to tell us that t_ctid is not really a CTID but contains new information (for pg_upgrade's sake). For example, one bit in bi_hi can tell us that this is the last tuple in the chain (information conveyed today by t_ctid pointing to itself). Another bit can tell us that this tuple was WARM updated. We will still have plenty of bits left to store additional information about WARM chains.
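
To make the t_ctid idea concrete, here is a minimal sketch in C. It is not taken from any patch: the struct types are simplified stand-ins for the real headers, and HEAP_CTID_IS_FLAGS, CHAIN_LAST_TUPLE, CHAIN_WARM_UPDATED and the sketch_* functions are hypothetical names chosen only for illustration. The choice of 0x0800 for the marker bit is likewise an assumption (it is one of the two free bits mentioned above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for BlockIdData / ItemPointerData. */
typedef struct { uint16_t bi_hi; uint16_t bi_lo; } BlockIdData;
typedef struct { BlockIdData ip_blkid; uint16_t ip_posid; } ItemPointerData;

/* Simplified stand-in for the parts of the tuple header used here. */
typedef struct {
    uint16_t        t_infomask2;
    ItemPointerData t_ctid;
} SketchTupleHeader;

/* Hypothetical: a free t_infomask2 bit saying "t_ctid carries flags". */
#define HEAP_CTID_IS_FLAGS  0x0800

/* Hypothetical flag bits stored in bi_hi once the bit above is set. */
#define CHAIN_LAST_TUPLE    0x8000  /* replaces "t_ctid points to self" */
#define CHAIN_WARM_UPDATED  0x4000  /* this tuple was WARM updated */

/* Store chain flags in the block-number half of t_ctid. The offset of
 * the next chain member still lives in ip_posid, since HOT/WARM chain
 * members share one page. */
static void
sketch_set_chain_flags(SketchTupleHeader *tup, uint16_t next_offset,
                       uint16_t flags)
{
    assert((flags & ~(CHAIN_LAST_TUPLE | CHAIN_WARM_UPDATED)) == 0);
    tup->t_infomask2 |= HEAP_CTID_IS_FLAGS;
    tup->t_ctid.ip_blkid.bi_hi = flags;
    tup->t_ctid.ip_blkid.bi_lo = 0;
    tup->t_ctid.ip_posid = next_offset;
}

/* Tuples without the marker bit (e.g. after pg_upgrade) are not
 * interpreted here; they would need the old self-pointer test. */
static bool
sketch_is_chain_end(const SketchTupleHeader *tup)
{
    return (tup->t_infomask2 & HEAP_CTID_IS_FLAGS) != 0 &&
           (tup->t_ctid.ip_blkid.bi_hi & CHAIN_LAST_TUPLE) != 0;
}

static bool
sketch_is_warm_updated(const SketchTupleHeader *tup)
{
    return (tup->t_infomask2 & HEAP_CTID_IS_FLAGS) != 0 &&
           (tup->t_ctid.ip_blkid.bi_hi & CHAIN_WARM_UPDATED) != 0;
}
```

The marker bit is what keeps old-format tuples readable: without it, t_ctid is read as an ordinary CTID, so nothing on disk has to be rewritten.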


My guess is we would need one bit to mark a WARM chain, and perhaps
reuse obsolete pre-9.0 HEAP_MOVED_OFF to indicate increment-only or
decrement-only. 
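
As a rough illustration of the increment-only/decrement-only proposal, the following C sketch tracks a per-chain direction. It assumes a single integer index key and ignores NULL ordering and multi-index tables; ChainDirection and warm_direction_allows are hypothetical names, not existing PostgreSQL symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-chain state; the proposal would encode this in
 * tuple header bits rather than in an enum. */
typedef enum {
    CHAIN_DIR_UNSET,        /* no WARM update on this chain yet */
    CHAIN_DIR_INCREASING,
    CHAIN_DIR_DECREASING
} ChainDirection;

/* Return true if changing the key from old_key to new_key may be done
 * as a WARM update. The first key change fixes the chain's direction;
 * later changes must move the same way, so a new key can never equal
 * an earlier key in the chain. */
static bool
warm_direction_allows(ChainDirection *dir, int64_t old_key, int64_t new_key)
{
    assert(dir != NULL);

    if (new_key == old_key)
        return true;        /* key unchanged: direction is unaffected */

    ChainDirection step = (new_key > old_key) ? CHAIN_DIR_INCREASING
                                              : CHAIN_DIR_DECREASING;
    if (*dir == CHAIN_DIR_UNSET)
    {
        *dir = step;        /* first WARM update marks the chain */
        return true;
    }
    return *dir == step;    /* wrong direction: fall back to a normal update */
}
```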

I am not convinced that checking for increment/decrement adds a lot of value. Sure, we might be able to address some typical workloads, but is that really a common use case? Instead, what I am looking at is storing a bitmap that shows which table columns have changed so far in the WARM chain. We only have a limited number of bits, so we can track only a limited number of columns. This will help in cases where different columns are updated, but not so much if the same column is updated repeatedly.

What will also help, and something I haven't yet given any thought to, is being able to turn WARM chains back into HOT chains by removing stale index entries.

Some heuristics and limits on the amount of work done to detect duplicate index entries will help too.
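
One way to picture the changed-columns bitmap is the C sketch below. It assumes 16 trackable columns and reads "helps when different columns are updated" as: an update that only touches columns not yet changed in the chain cannot reproduce an earlier value from the chain. The bitmap width, that interpretation, and all warm_* names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: number of columns we have bits for. */
#define WARM_TRACKED_COLS 16

typedef uint16_t WarmColBitmap;

/* Only the first WARM_TRACKED_COLS attributes (1-based) have a bit. */
static bool
warm_col_trackable(int attnum)
{
    return attnum >= 1 && attnum <= WARM_TRACKED_COLS;
}

static WarmColBitmap
warm_col_bit(int attnum)
{
    assert(warm_col_trackable(attnum));
    return (WarmColBitmap) (1u << (attnum - 1));
}

/* True if none of the columns in 'updated' has changed earlier in the
 * chain. In that case every older chain member holds the same value
 * for those columns, so comparing with the previous tuple is enough. */
static bool
warm_update_is_fresh(WarmColBitmap chain_changed, WarmColBitmap updated)
{
    return (chain_changed & updated) == 0;
}

/* Fold the columns touched by this update into the chain's bitmap. */
static WarmColBitmap
warm_record_update(WarmColBitmap chain_changed, WarmColBitmap updated)
{
    return (WarmColBitmap) (chain_changed | updated);
}
```

Repeated updates of the same column fail the freshness test on the second update, which is the limitation noted above.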
 

We can't use the LP_REDIRECT lp_len bits because we need to create WARM
chains before pruning, and I don't think walking the pre-pruned chain is
worth it.  (As I understand HOT, LP_REDIRECT is only created during
pruning.)


That's correct. But lp_len gives us a place to stash information from heap tuples when they are pruned.
 
Thanks,
Pavan

 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
