Re: Patch: Write Amplification Reduction Method (WARM) - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: Patch: Write Amplification Reduction Method (WARM)
Date
Msg-id CABOikdPuDh9w-LvNLZe4ECB87Ce=QbUEOeHw9YvunfaQu_CftQ@mail.gmail.com
In response to Re: Patch: Write Amplification Reduction Method (WARM)  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers


On Thu, Mar 30, 2017 at 5:27 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:


How have you verified that?  Have you checked that in
heap_prepare_insert it has called toast_insert_or_update() and then
returned a tuple different from what the input tup is?  Basically, I
am easily able to see it and even the reason why the heap and index
tuples will be different.  Let me try to explain,
toast_insert_or_update returns a new tuple which contains compressed
data and this tuple is inserted in heap where as slot still refers to
original tuple (uncompressed one) which is passed to heap_insert.
Now, ExecInsertIndexTuples and the calls under it like FormIndexDatum
will refer to the tuple in slot which is uncompressed and form the
values[] using uncompressed value.

Ah, yes. You're right. Not sure why I saw things differently. That doesn't change anything though, because during recheck we'll get the compressed value and not do anything with it. In the index we already have the compressed value, so we can compare the two. Even if we decide to decompress everything and do the comparison, that should be possible. So I don't see a problem as far as correctness goes.
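To make the recheck argument concrete, here is a minimal sketch of the comparison described above. It is not the patch's code: `Datum`, `make_datum`, `detoast` and `recheck_equal` are hypothetical names, and zlib at a fixed level merely stands in for PostgreSQL's own compression. The assumption is that both sides were compressed by the same deterministic routine, so identical compressed bytes imply identical values; otherwise we fall back to decompressing both sides.

```python
import zlib
from typing import NamedTuple


class Datum(NamedTuple):
    """Hypothetical stand-in for a stored attribute value."""
    compressed: bool
    payload: bytes


def make_datum(value: bytes, compress: bool) -> Datum:
    # zlib at a fixed level is only a stand-in for the real compressor (assumption).
    if compress:
        return Datum(True, zlib.compress(value, 6))
    return Datum(False, value)


def detoast(d: Datum) -> bytes:
    # Return the plain value, decompressing only when needed.
    return zlib.decompress(d.payload) if d.compressed else d.payload


def recheck_equal(heap: Datum, index: Datum) -> bool:
    # Fast path: both sides compressed the same way and byte-identical.
    if heap.compressed and index.compressed and heap.payload == index.payload:
        return True
    # Fallback: decompress everything and compare the plain values.
    return detoast(heap) == detoast(index)
```

The fallback is what keeps the comparison correct when one side is compressed and the other is not; the fast path only saves the decompression cost.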



So IIUC, in above test during initialization you have one WARM update
and then during actual test all are HOT updates, won't in such a case
the WARM chain will be converted to HOT by vacuum and then all updates
from thereon will be HOT and probably no rechecks?

There is no autovacuum (AV) activity: just 1 tuple out of 100 is being HOT updated. I confirmed this by looking at pg_stat_user_tables. I also made sure that the tuple doesn't get non-HOT updated in between, which would break the WARM chain.
 


>
> I then also repeated the tests, but this time using compressible values. The
> regression in this case is much higher, may be 15% or more.
>

Sounds on higher side.


Yes, definitely. If we can't reduce that, we might want to provide a table-level option to explicitly turn WARM off on such tables.
 
IIUC, by the time you are comparing tuple attrs to check for modified
columns, you don't have the compressed values for new tuple.


I think it depends. If the value is not being modified, then we will get both values as compressed. At least I confirmed that with your example, by running an update which only changes c1. I don't know if that holds for all cases.
 
>  I know you had
> raised concerns, but Robert confirmed that (IIUC) it's not a problem today.
>

Yeah, but I am not sure if we can take Robert's statement as some sort
of endorsement for what the patch does.


Sure. 
 
> We will figure out how to deal with it if we ever add support for different
> compression algorithms or compression levels. And I also think this is kinda
> synthetic use case and the fact that there is not much regression with
> indexes as large as 2K bytes seems quite comforting to me.
>

I am not sure if we can consider it as completely synthetic because we
might see some similar cases for json datatypes.  Can we once try to
see the impact when the same test runs from multiple clients?

OK. It might become hard to control HOT behaviour though, or we will need to do a mix of WARM/HOT updates. I will see if this is easily doable by setting a high fillfactor (FF) etc.
 
  For
your information, I am also trying to setup some tests along with one
of my colleague and we will report the results once the tests are
complete.


That'll be extremely helpful, especially if it's something close to a real-world scenario. Thanks for doing that.

Thanks,
Pavan 

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
