Re: Performance Improvement by reducing WAL for Update Operation - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Performance Improvement by reducing WAL for Update Operation
Date
Msg-id CAA4eK1LGoDL_P6fq3j+RdjT829Fg-Y5or5MYquLXCYXLeF7_eA@mail.gmail.com
In response to Re: Performance Improvement by reducing WAL for Update Operation  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Performance Improvement by reducing WAL for Update Operation  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, Jan 31, 2014 at 1:35 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Jan 31, 2014 at 12:33 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Thu, Jan 30, 2014 at 12:23 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Wed, Jan 29, 2014 at 8:13 PM, Heikki Linnakangas
>>> <hlinnakangas@vmware.com> wrote:
>>>
>>> After basic verification of  back-to-pglz-like-delta-encoding-1, I will
>>> take the data with both the patches and report the same.
>>
>> I have corrected the problems reported in back-to-pglz-like-delta-encoding-1
>> and removed hindex from pgrb_delta_encoding_v6 and attached are
>> new versions of both patches.
>>
>> I/O Reduction Data
>> -----------------------------
>> Non-Default settings
>> autovacuum = off
>> checkpoint_segments = 256
>> checkpoint_timeout = 15min
>>
>> Observations
>> --------------------
>> 1. With both the patches WAL reduction is similar i.e ~37% for
>>     "one short and one long field, no change" and 12% for
>>     "hundred tiny fields, half nulled"
>> 2. With pgrb_delta_encoding_v7, there is ~19% CPU reduction for best
>>     case "one short and one long field, no change".
>> 3. With pgrb_delta_encoding_v7, there is approximately 8~9% overhead
>>     for cases where there is no match
>> 4. With pgrb_delta_encoding_v7, there is approximately 15~18% overhead
>>     for "hundred tiny fields, half nulled" case
>> 5. With back-to-pglz-like-delta-encoding-2, the data is mostly similar except
>>     for "hundred tiny fields, half nulled" where CPU overhead is much more.
>>
>> I think the main reason for the overhead is that we store the last offset
>> of matching data at the front of the history, so during a match it has to
>> traverse back many times to find the longest possible match. In the real
>> world it won't be the case that most history entries share the same hash
>> index, so this should not have much effect.
>
> If we want to improve CPU usage for cases like "hundred tiny fields,
> half nulled" (which I think is not important), forming the history table
> by traversing from the end rather than from the beginning can serve the
> purpose. I have not tried it, but I think it can certainly help.

I have implemented the above idea of forming the history table by traversing
the old tuple from the end instead of from the beginning, and have also
optimized find match to break out of the loop once a good match is found,
similar to the good-match concept in pglz. The advantage is that longer
matches are found quickly, so even for the "hundred tiny fields, half nulled"
case there is now no CPU overhead, and no other case is significantly
affected.
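
As a rough illustration of the idea above (this is not code from the patch),
the following Python sketch builds a hash-chained history table over the old
tuple in either direction and walks a chain until a "good enough" match is
found. The names build_history, find_match, MIN_MATCH and GOOD_MATCH, and the
values chosen for the two constants, are assumptions made only for this
sketch.

```python
from collections import defaultdict

MIN_MATCH = 4    # assumed size of the window hashed into the history table
GOOD_MATCH = 32  # assumed length at which a match is considered good enough


def build_history(old: bytes, from_end: bool) -> dict:
    """Map every MIN_MATCH-byte window of `old` to a chain of offsets.

    New entries are placed at the front of their chain, so the direction in
    which `old` is scanned decides which offset the matcher probes first.
    """
    history = defaultdict(list)
    positions = range(len(old) - MIN_MATCH + 1)
    if from_end:
        positions = reversed(positions)
    for off in positions:
        history[old[off:off + MIN_MATCH]].insert(0, off)
    return history


def find_match(history, old: bytes, new: bytes, pos: int,
               good_match: int = GOOD_MATCH):
    """Return (offset, length, probes) for the best match of new[pos:] in old."""
    best_off, best_len, probes = -1, 0, 0
    for off in history.get(new[pos:pos + MIN_MATCH], ()):
        probes += 1
        length = 0
        while (off + length < len(old) and pos + length < len(new)
               and old[off + length] == new[pos + length]):
            length += 1
        if length > best_len:
            best_off, best_len = off, length
        if best_len >= good_match:
            break  # good enough: stop walking the chain
    return best_off, best_len, probes
```

On a repetitive 100-byte input such as b"abcd" * 25 matched against itself,
the table built from the end reaches a full-length match on the first probe,
while the table built from the beginning needs several probes before it
reaches the good-match cutoff.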

Please find the updated patch attached; the new data is below:

Non-Default settings
---------------------------------
autovacuum = off
checkpoint_segments = 256
checkpoint_timeout = 15min

Unpatched
---------
                    testname                    | wal_generated |     duration
------------------------------------------------+---------------+------------------
 one short and one long field, no change        |    1055025424 | 14.3506939411163
 one short and one long field, no change        |    1056580160 | 18.1261160373688
 one short and one long field, no change        |    1054914792 |  15.104973077774
 hundred tiny fields, all changed               |     636948992 | 16.3172590732574
 hundred tiny fields, all changed               |     633943680 |  16.308168888092
 hundred tiny fields, all changed               |     636516776 | 16.4316298961639
 hundred tiny fields, half changed              |     633948288 | 16.5795118808746
 hundred tiny fields, half changed              |     636068648 | 16.2913551330566
 hundred tiny fields, half changed              |     635848432 | 15.9602961540222
 hundred tiny fields, half nulled               |     569758744 | 15.9501180648804
 hundred tiny fields, half nulled               |     569760112 | 15.9422838687897
 hundred tiny fields, half nulled               |     570609712 | 16.5659689903259
 nine short and one long field, thirty % change |     698908824 | 12.7938749790192
 nine short and one long field, thirty % change |     698905400 | 12.0160901546478
 nine short and one long field, thirty % change |     698909720 | 12.2999179363251


After pgrb_delta_encoding_v8.patch
----------------------------------------------------------
                    testname                    | wal_generated |     duration
------------------------------------------------+---------------+------------------
 one short and one long field, no change        |     680203392 | 12.4820687770844
 one short and one long field, no change        |     677340120 | 11.8634090423584
 one short and one long field, no change        |     677333288 | 11.9269840717316
 hundred tiny fields, all changed               |     633950264 | 16.7694170475006
 hundred tiny fields, all changed               |     635496520 | 16.9294109344482
 hundred tiny fields, all changed               |     633942832 | 18.0690770149231
 hundred tiny fields, half changed              |     633948024 | 17.0814690589905
 hundred tiny fields, half changed              |     633947488 | 17.0073189735413
 hundred tiny fields, half changed              |     633949224 | 17.0454230308533
 hundred tiny fields, half nulled               |     499950184 | 16.3303508758545
 hundred tiny fields, half nulled               |     499952888 | 15.7197980880737
 hundred tiny fields, half nulled               |     499958120 | 15.7198679447174
 nine short and one long field, thirty % change |     559831384 | 12.0672481060028
 nine short and one long field, thirty % change |     559829472 | 11.8555760383606
 nine short and one long field, thirty % change |     559832760 | 11.9470820426941

The observations are almost the same as before, except for the
"hundred tiny fields, half nulled" case, which I have updated below:

>> Observations
>> --------------------
>> 1. With both the patches WAL reduction is similar i.e ~37% for
>>     "one short and one long field, no change" and 12% for
>>     "hundred tiny fields, half nulled"
>> 2. With pgrb_delta_encoding_v7, there is ~19% CPU reduction for best
>>     case "one short and one long field, no change".
>> 3. With pgrb_delta_encoding_v7, there is approximately 8~9% overhead
>>     for cases where there is no match
>> 4. With pgrb_delta_encoding_v7, there is approximately 15~18% overhead
>>     for "hundred tiny fields, half nulled" case

Now there is approximately 1.4~5% CPU gain for the
"hundred tiny fields, half nulled" case.

>> 5. With back-to-pglz-like-delta-encoding-2, the data is mostly similar except
>>     for "hundred tiny fields, half nulled" where CPU overhead is much more.
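
For anyone who wants to recompute percentages from the two tables in this
mail, here is a small Python sketch. It averages the three runs of each test
and reports the relative drop in wal_generated and in duration. Averaging the
runs is an assumption of this sketch and not necessarily how the figures
quoted above were derived; the helper name pct_reduction is likewise made up
for illustration.

```python
from statistics import mean

# (wal_generated in bytes, duration in seconds), three runs per test,
# copied from the "Unpatched" and "After pgrb_delta_encoding_v8.patch" tables.
UNPATCHED = {
    "one short and one long field, no change": [
        (1055025424, 14.3506939411163), (1056580160, 18.1261160373688),
        (1054914792, 15.104973077774)],
    "hundred tiny fields, all changed": [
        (636948992, 16.3172590732574), (633943680, 16.308168888092),
        (636516776, 16.4316298961639)],
    "hundred tiny fields, half changed": [
        (633948288, 16.5795118808746), (636068648, 16.2913551330566),
        (635848432, 15.9602961540222)],
    "hundred tiny fields, half nulled": [
        (569758744, 15.9501180648804), (569760112, 15.9422838687897),
        (570609712, 16.5659689903259)],
    "nine short and one long field, thirty % change": [
        (698908824, 12.7938749790192), (698905400, 12.0160901546478),
        (698909720, 12.2999179363251)],
}
PATCHED = {
    "one short and one long field, no change": [
        (680203392, 12.4820687770844), (677340120, 11.8634090423584),
        (677333288, 11.9269840717316)],
    "hundred tiny fields, all changed": [
        (633950264, 16.7694170475006), (635496520, 16.9294109344482),
        (633942832, 18.0690770149231)],
    "hundred tiny fields, half changed": [
        (633948024, 17.0814690589905), (633947488, 17.0073189735413),
        (633949224, 17.0454230308533)],
    "hundred tiny fields, half nulled": [
        (499950184, 16.3303508758545), (499952888, 15.7197980880737),
        (499958120, 15.7198679447174)],
    "nine short and one long field, thirty % change": [
        (559831384, 12.0672481060028), (559829472, 11.8555760383606),
        (559832760, 11.9470820426941)],
}


def pct_reduction(test: str, column: int) -> float:
    """Percent drop in the mean of `column` (0 = WAL bytes, 1 = duration).

    A negative result means the patched runs were larger or slower.
    """
    before = mean(run[column] for run in UNPATCHED[test])
    after = mean(run[column] for run in PATCHED[test])
    return 100.0 * (before - after) / before


for test in UNPATCHED:
    print(f"{test:<48} WAL {pct_reduction(test, 0):6.2f}%"
          f"  time {pct_reduction(test, 1):6.2f}%")
```

With run-averaged values, the WAL reduction for "hundred tiny fields, half
nulled" comes out close to the 12% quoted in observation 1.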


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
