Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM) - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
Date
Msg-id CABOikdOdvqJXPATNqmyJMMpYHesQuZm-Y4Q9O033pOwoAyrOjA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers


On Tue, Mar 21, 2017 at 5:34 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Mar 21, 2017 at 6:56 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Hmm, that test case isn't all that synthetic.  It's just a single
>> column bulk update, which isn't anything all that crazy, and 5-10%
>> isn't nothing.
>>
>> I'm kinda surprised it made that much difference, though.
>>
>
> I think it is because heap_getattr() is not that cheap.  We have
> noticed the similar problem during development of scan key push down
> work [1].

Yeah.  So what's the deal with this?  Is somebody working on figuring
out a different approach that would reduce this overhead?  Are we
going to defer WARM to v11?  Or is the intent to just ignore the 5-10%
slowdown on a single column update and commit everything anyway? 

I think I should clarify something. The test case does a single column update, but it also has columns which are very wide, has an index on many columns (and it updates a column early in the list). In addition, in the test Mithun updated all 10million rows of the table in a single transaction, used UNLOGGED table and fsync was turned off. 

TBH I see many artificial scenarios here. It will be very useful if he can rerun the query with some of these restrictions lifted. I'm all for addressing whatever we can, but I am not sure if this test demonstrates a real world usage.

Having said that, may be if we can do a few things to reduce the overhead.

- Check if the page has enough free space to perform a HOT/WARM update. If not, don't look for all index keys.
- Pass bitmaps separately for each index and bail out early if we conclude neither HOT nor WARM is possible. In this case since there is just one index and as soon as we check the second column we know neither HOT nor WARM is possible, we will return early. It might complicate the API a lot, but I can give it a shot if that's what is needed to make progress.

Any other ideas?

Thanks,
Pavan
--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

Previous
From: Emre Hasegeli
Date:
Subject: Re: [HACKERS] BRIN cost estimate
Next
From: Amit Kapila
Date:
Subject: Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers