Re: Patch: Write Amplification Reduction Method (WARM) - Mailing list pgsql-hackers

From Claudio Freire
Subject Re: Patch: Write Amplification Reduction Method (WARM)
Date
Msg-id CAGTBQpYXwvJ9TFE6npH=En9OziCqBFZ23K-trspQC4i25Gs3LQ@mail.gmail.com
Whole thread Raw
In response to Patch: Write Amplification Reduction Method (WARM)  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Responses Re: Patch: Write Amplification Reduction Method (WARM)  (Pavan Deolasee <pavan.deolasee@gmail.com>)
List pgsql-hackers

On Wed, Aug 31, 2016 at 1:45 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
We discussed a few ideas to address the "Duplicate Scan" problem. For example, we can teach Index AMs to discard any duplicate (key, CTID) insert requests. Or we could guarantee uniqueness by either only allowing updates in one lexical order. While the former is a more complete solution to avoid duplicate entries, searching through large number of keys for non-unique indexes could be a drag on performance. The latter approach may not be sufficient for many workloads. Also tracking increment/decrement for many indexes will be non-trivial.

There is another problem with allowing many index entries pointing to the same WARM chain. It will be non-trivial to know how many index entries are currently pointing to the WARM chain and index/heap vacuum will throw up more challenges.

Instead, what I would like to propose and the patch currently implements is to restrict WARM update to once per chain. So the first non-HOT update to a tuple or a HOT chain can be a WARM update. The chain can further be HOT updated any number of times. But it can no further be WARM updated. This might look too restrictive, but it can still bring down the number of regular updates by almost 50%. Further, if we devise a strategy to convert a WARM chain back to HOT chain, it can again be WARM updated. (This part is currently not implemented). A good side effect of this simple strategy is that we know there can maximum two index entries pointing to any given WARM chain.

We should probably think about coordinating with my btree patch.

From the description above, the strategy is quite readily "upgradable" to one in which the indexam discards duplicate (key,ctid) pairs and that would remove the limitation of only one WARM update... right?

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg_sequence catalog
Next
From: Andres Freund
Date:
Subject: Re: pg_sequence catalog