
From: Heikki Linnakangas
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date:
Msg-id: 52F223E0.6030306@vmware.com
In response to: Re: Performance Improvement by reducing WAL for Update Operation (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: Performance Improvement by reducing WAL for Update Operation (Amit Kapila <amit.kapila16@gmail.com>)
           Re: Performance Improvement by reducing WAL for Update Operation (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On 02/05/2014 07:54 AM, Amit Kapila wrote:
> On Tue, Feb 4, 2014 at 11:58 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Feb 4, 2014 at 12:39 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> Now there is approximately 1.4~5% CPU gain for
>>> "hundred tiny fields, half nulled" case
>
>> Assuming that the logic isn't buggy, a point in need of further study,
>> I'm starting to feel like we want to have this.  And I might even be
>> tempted to remove the table-level off switch.
>
> I have tried to stress the worst case more, since you are thinking of
> removing the table-level switch, and found that even if we increase the
> data by approx. 8 times ("ten long fields, all changed", each field
> containing 80 bytes of data), the CPU overhead is still < 5%, which
> shows that the overhead doesn't increase much even when the length of
> unmatched data grows by a much larger factor.
> So the data for the worst case adds more weight to your statement
> ("remove table-level switch"); however, there is no harm in keeping the
> table-level option with the default 'true', and if some users are really
> sure the updates in their system will have nothing in common, they can
> set this new option to 'false'.
>
> Below is the data for the new case "ten long fields, all changed" added
> in the attached script file:

That's not the worst case, by far.

First, note that the skipping while scanning the new tuple is only performed
in the first loop. That means that as soon as you have a single match,
you fall back to hashing every byte. So for the worst case, put one
4-byte field as the first column, and don't update it.
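
To put that in concrete terms, the kind of schema and update I mean looks
roughly like this (table and column names are made up for illustration,
this is not the attached test itself):

  -- The leading int4 is never updated, so the old and new tuples match
  -- right at the start. That single match ends the skipping phase, and
  -- the encoder then hashes every remaining byte of the long, completely
  -- changed fields.
  CREATE TABLE delta_worst (id int4, f1 text, f2 text, f3 text, f4 text);
  INSERT INTO delta_worst
      SELECT g, repeat('a', 80), repeat('b', 80), repeat('c', 80), repeat('d', 80)
      FROM generate_series(1, 10000) g;
  UPDATE delta_worst
      SET f1 = repeat('w', 80), f2 = repeat('x', 80),
          f3 = repeat('y', 80), f4 = repeat('z', 80);   -- id left unchanged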

Also, I suspect the runtimes in your test were dominated by I/O. When I
scale down the number of rows involved so that the whole test fits in
RAM, I get much bigger differences with and without the patch. You might
also want to turn off full_page_writes, to make the effect clear with
less data.
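
For completeness, turning it off only needs a config change plus a reload,
for example:

  -- Assumes a server new enough to have ALTER SYSTEM; on older builds,
  -- edit postgresql.conf by hand instead (full_page_writes = off) and reload.
  ALTER SYSTEM SET full_page_writes = off;
  SELECT pg_reload_conf();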

So, I came up with the attached worst case test, modified from your
latest test suite.

unpatched:

                testname               | wal_generated |     duration
--------------------------------------+---------------+------------------
  ten long fields, all but one changed |     343385312 | 2.20806908607483
  ten long fields, all but one changed |     336263592 | 2.18997097015381
  ten long fields, all but one changed |     336264504 | 2.17843413352966
(3 rows)


pgrb_delta_encoding_v8.patch:

                testname               | wal_generated |     duration
--------------------------------------+---------------+------------------
  ten long fields, all but one changed |     338356944 | 3.33501315116882
  ten long fields, all but one changed |     344059272 | 3.37364101409912
  ten long fields, all but one changed |     336257840 | 3.36244201660156
(3 rows)

So with this test, the overhead is very significant.

With the skipping logic, another kind of "worst case" is one where there is
a lot of similarity between the old and new tuple, but you miss it because
you skip. For example, if you change the first few columns, but leave a
large text column at the end of the tuple unchanged.
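
For instance, something of this shape (again, names made up just to
illustrate the pattern, not part of the attached test):

  -- The small leading columns change, so the scan over the new tuple keeps
  -- skipping ahead and can miss that the big text column at the end is
  -- byte-for-byte identical, which is exactly the part where a delta
  -- against the old tuple would have saved the most WAL.
  CREATE TABLE delta_missed (a int4, b int4, blob text);
  INSERT INTO delta_missed
      SELECT g, g, repeat('x', 1500)
      FROM generate_series(1, 10000) g;
  UPDATE delta_missed SET a = a + 1, b = b + 1;   -- blob is untouched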

- Heikki

