Re: Heap WARM Tuples - Design Draft - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Heap WARM Tuples - Design Draft
Date
Msg-id 20160804170553.GL1702@momjian.us
In response to Heap WARM Tuples - Design Draft  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Responses Re: Heap WARM Tuples - Design Draft  (Simon Riggs <simon@2ndquadrant.com>)
Re: Heap WARM Tuples - Design Draft  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Thu, Aug  4, 2016 at 04:29:09PM +0530, Pavan Deolasee wrote:
> Write Amplification Reduction Method (WARM)
> ====================================
> 
> A few years back, we developed HOT to address the problem associated with MVCC
> and frequent updates and it has served us very well. But in the light of Uber's
> recent technical blog highlighting some of the problems that are still
> remaining, especially for workloads where HOT does not help to the full extent,
> Simon, myself and others at 2ndQuadrant discussed several ideas and what I
> present below is an outcome of that. This is not to take away credit from
> anybody else. Others may have thought about similar ideas, but I haven’t seen
> any concrete proposal so far.

HOT was a huge win for Postgres and I am glad you are reviewing
improvements.

> This method succeeds in reducing the write amplification, but causes other
> issues which also need to be solved. WARM breaks the invariant that all tuples
> in a HOT chain have the same index values and so an IndexScan would need to
> re-check the index scan conditions against the visible tuple returned from
> heap_hot_search(). We must still check visibility, so we only need to re-check
> scan conditions on that one tuple version.
>   
> We don’t want to force a recheck for every index fetch because that will slow
> everything down. So we would like a simple and efficient way of knowing about
> the existence of a WARM tuple in a chain and do a recheck in only those cases,
> ideally O(1). Having a HOT chain contain a WARM tuple is discussed below as
> being a “WARM chain”, implying it needs re-check.

In summary, we are already doing visibility checks on the HOT chain, so
the recheck of whether the heap tuple matches the index value is done on
at most the one visible tuple in the chain, not on every tuple in the
chain.
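
A minimal sketch of that point, in C. It is only an illustration: the
ChainTuple struct and chain_fetch() are hypothetical names, not anything
in the PostgreSQL source, and the chain is modelled as a plain array
rather than being followed through t_ctid links as heap_hot_search()
does. It shows that the key comparison runs zero times when the chain is
not marked, and at most once when it is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one tuple version in a HOT chain. */
typedef struct ChainTuple
{
    int  key;       /* value of the indexed column in this version */
    bool visible;   /* outcome of the MVCC visibility test for our snapshot */
} ChainTuple;

/*
 * Walk the chain from the root and return the position of the version to
 * hand back for an index entry carrying index_key, or -1 if there is none.
 * *rechecks reports how many key comparisons were performed.
 */
int
chain_fetch(const ChainTuple *chain, size_t ntuples,
            bool recheck_required, int index_key, int *rechecks)
{
    assert(rechecks != NULL);
    *rechecks = 0;

    for (size_t i = 0; i < ntuples; i++)
    {
        /* The visibility test happens for every version regardless. */
        if (!chain[i].visible)
            continue;

        /* Only the visible version is compared against the index key. */
        if (recheck_required)
        {
            (*rechecks)++;
            if (chain[i].key != index_key)
                return -1;      /* entry belongs to an older key value */
        }
        return (int) i;
    }
    return -1;                  /* nothing visible to this snapshot */
}
```

For a three-version chain whose last version is the visible one, a scan
arriving through the index entry for the old key is rejected after a
single comparison, and a chain without the flag costs no comparisons at
all.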

> 2. Mark the root line pointer (or the root tuple) with a special
> HEAP_RECHECK_REQUIRED flag to tell us about the presence of a WARM tuple in the
> chain. Since all indexes point to the root line pointer, it should be enough to
> just mark the root line pointer (or tuple) with this flag. 

Yes, I think #2 is the easiest.  Also, if we modified the index page
instead, we would have to WAL-log the change and possibly a full-page
image of the index page.  :-(

> Approach 2 seems more reasonable and simple. 
> 
> There are only 2 bits for lp_flags and all combinations are already used. But
> for LP_REDIRECT root line pointer, we could use the lp_len field to store this
> special flag, which is not used for LP_REDIRECT line pointers. So we are able
> to mark the root line pointer.

Uh, as I understand it, we only use LP_REDIRECT when we have _removed_
the tuple that the ctid was pointing to, but it seems you would need to
set HEAP_RECHECK_REQUIRED earlier than that.

Also, what is currently in the lp_len field for LP_REDIRECT?  Zeros, or
random data?  I am asking for pg_upgrade purposes.
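
For reference, a sketch of the line pointer layout in question. The
three bit-fields and the LP_* values follow src/include/storage/itemid.h
as I read it, and the redirect setter there appears to zero lp_len, but
that should be confirmed against the actual header and against older
releases before relying on it for pg_upgrade. The struct name, the
LP_LEN_RECHECK_REQUIRED bit, and the three functions are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Line pointer states; all four 2-bit combinations are taken. */
#define LP_UNUSED   0
#define LP_NORMAL   1
#define LP_REDIRECT 2
#define LP_DEAD     3

/* Sketch of the 32-bit line pointer: 15 + 2 + 15 bits. */
typedef struct ItemIdSketch
{
    unsigned lp_off:15,     /* tuple offset, or redirect target */
             lp_flags:2,    /* one of the LP_* states above */
             lp_len:15;     /* tuple length; unused for redirects */
} ItemIdSketch;

/* Hypothetical bit stored in lp_len of a redirect line pointer. */
#define LP_LEN_RECHECK_REQUIRED 0x0001

/* Turn a line pointer into a redirect, zeroing the unused length. */
void
set_redirect(ItemIdSketch *lp, unsigned target)
{
    lp->lp_flags = LP_REDIRECT;
    lp->lp_off = target;
    lp->lp_len = 0;
}

/* Mark a redirect as the root of a WARM chain. */
void
redirect_set_recheck(ItemIdSketch *lp)
{
    assert(lp->lp_flags == LP_REDIRECT);
    lp->lp_len |= LP_LEN_RECHECK_REQUIRED;
}

/* True only for a redirect that carries the recheck bit. */
bool
redirect_needs_recheck(const ItemIdSketch *lp)
{
    return lp->lp_flags == LP_REDIRECT &&
           (lp->lp_len & LP_LEN_RECHECK_REQUIRED) != 0;
}
```

If existing redirect pointers really do carry a zero lp_len, they would
read as "no recheck needed" after an upgrade, which is the safe default.
If some code path leaves other data there, the bit could not be trusted
on old pages.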

> One idea is to somehow do this as part of the vacuum where we collect root line
> pointers of  WARM chains during the first phase of heap scan, check the indexes
> for all such tuples (may be combined with index vacuum) and then clear the heap
> flags during the second pass, unless new tuples are added to the WARM chain. We
> can detect that by checking that all tuples in the WARM chain still have XID
> less than the OldestXmin that VACUUM is using.

Yes, it seems natural to clear the ctid HEAP_RECHECK_REQUIRED flag where
you are adjusting the HOT chain anyway.
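
A rough sketch of the test quoted above, under stated assumptions: the
Xid typedef and both function names are hypothetical, each chain member
is reduced to a single XID, and the comparison handles only normal XIDs
(the special frozen and bootstrap values are ignored). The
indexes_verified argument stands in for whatever the first vacuum phase
concludes about the index entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;

/*
 * Wraparound-aware "a is older than b", for normal XIDs only.
 * The modulo-2^32 difference is interpreted as a signed value.
 */
bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * The flag may be cleared in the second heap pass only if the indexes
 * were verified in the first phase and no chain member is as new as
 * OldestXmin, that is, nothing was added to the chain in the meantime.
 */
bool
warm_flag_clearable(const Xid *chain_xids, size_t ntuples,
                    Xid oldest_xmin, bool indexes_verified)
{
    if (!indexes_verified)
        return false;

    for (size_t i = 0; i < ntuples; i++)
    {
        if (!xid_precedes(chain_xids[i], oldest_xmin))
            return false;   /* a newer version joined the chain */
    }
    return true;
}
```

Leaving the flag set when in doubt is harmless here, since a false
positive only costs an extra recheck.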

> It’s important to ensure that the flag is set when it is absolutely necessary,
> while having false positives is not a problem. We might do a little wasteful
> work if the flag is incorrectly set. Since flag will be set only during
> heap_update() and the operation is already WAL logged, this can be piggybacked
> with the heap_update WAL record. Similarly, when a root tuple is pruned to a
> redirect line pointer, the operation is already WAL logged and we can piggyback
> setting of line pointer flag with that WAL record.
> 
> Flag clearing need not be WAL logged, unless we can piggyback that to some
> existing WAL logging.

Agreed, good point.

Very nice!

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +


