Re: Performance Improvement by reducing WAL for Update Operation - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Performance Improvement by reducing WAL for Update Operation |
Date | |
Msg-id | CAA4eK1LGoDL_P6fq3j+RdjT829Fg-Y5or5MYquLXCYXLeF7_eA@mail.gmail.com |
In response to | Re: Performance Improvement by reducing WAL for Update Operation (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: Performance Improvement by reducing WAL for Update Operation |
List | pgsql-hackers |
On Fri, Jan 31, 2014 at 1:35 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Jan 31, 2014 at 12:33 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Thu, Jan 30, 2014 at 12:23 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Wed, Jan 29, 2014 at 8:13 PM, Heikki Linnakangas
>>> <hlinnakangas@vmware.com> wrote:
>>>
>>> After basic verification of back-to-pglz-like-delta-encoding-1, I will
>>> take the data with both the patches and report the same.
>>
>> I have corrected the problems reported in back-to-pglz-like-delta-encoding-1
>> and removed hindex from pgrb_delta_encoding_v6 and attached are
>> new versions of both patches.
>>
>> I/O Reduction Data
>> -----------------------------
>> Non-Default settings
>> autovacuum = off
>> checkpoint_segments = 256
>> checkpoint_timeout = 15min
>>
>> Observations
>> --------------------
>> 1. With both the patches WAL reduction is similar, i.e. ~37% for
>> "one short and one long field, no change" and 12% for
>> "hundred tiny fields, half nulled"
>> 2. With pgrb_delta_encoding_v7, there is ~19% CPU reduction for best
>> case "one short and one long field, no change".
>> 3. With pgrb_delta_encoding_v7, there is approximately 8~9% overhead
>> for cases where there is no match
>> 4. With pgrb_delta_encoding_v7, there is approximately 15~18% overhead
>> for "hundred tiny fields, half nulled" case
>> 5. With back-to-pglz-like-delta-encoding-2, the data is mostly similar except
>> for "hundred tiny fields, half nulled" where CPU overhead is much more.
>>
>> I think the main reason for overhead is that we store last offset
>> of matching data in history at front, so during match, it has to traverse back
>> many times to find longest possible match and in real world it won't be the
>> case that most of history entries contain same hash index, so it should not
>> affect.
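The chain-walking cost described in the quoted paragraph can be illustrated with a small sketch. This is not code from either patch: the 4-byte window, the use of the raw substring as the "hash" key, and all function names are assumptions made for illustration only. It builds the history by scanning the old tuple from the beginning, pushes each new offset onto the front of its chain, and counts how many candidates the matcher has to examine before it reaches the longest match.

```python
# Illustrative sketch only; not the code from either patch.
# Assumptions: a 4-byte window whose raw bytes stand in for the hash index,
# and a history built by scanning the old tuple from the beginning, with
# each new offset pushed onto the FRONT of its chain.

def build_history_forward(old: bytes, window: int = 4) -> dict:
    """Map each `window`-byte substring of `old` to its offsets, newest first."""
    history = {}
    for off in range(len(old) - window + 1):
        history.setdefault(old[off:off + window], []).insert(0, off)
    return history


def match_length(old: bytes, old_off: int, new: bytes, new_off: int) -> int:
    """Number of equal bytes starting at old[old_off] and new[new_off]."""
    n = 0
    while (old_off + n < len(old) and new_off + n < len(new)
           and old[old_off + n] == new[new_off + n]):
        n += 1
    return n


def find_longest_match(old: bytes, new: bytes, new_off: int,
                       history: dict, window: int = 4):
    """Walk the whole chain; return (best_len, best_off, candidates_examined)."""
    best_len, best_off, examined = 0, -1, 0
    for off in history.get(new[new_off:new_off + window], []):
        examined += 1
        length = match_length(old, off, new, new_off)
        if length > best_len:
            best_len, best_off = length, off
    return best_len, best_off, examined


if __name__ == "__main__":
    old = b"abcd" * 50          # highly repetitive old tuple
    new = old                   # unchanged new tuple
    history = build_history_forward(old)
    print(find_longest_match(old, new, 0, history))
```

With a repetitive old tuple, every entry in the chain shares the same key and the longest match sits at the far end of the chain, so the matcher visits every candidate. When few history entries share a key, the chains stay short and the walk is cheap.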
>
> If we want to improve CPU usage for cases like "hundred tiny fields,
> half nulled" (which I think is not important), forming history table by
> traversing from end rather than beginning, can serve the purpose, I have
> not tried it but I think it can certainly help.

I have implemented the above idea of forming the history table by traversing
the old tuple from the end instead of from the beginning, and have done some
optimizations in find match to break out of the loop based on a "good match"
concept similar to pglz. The advantage is that we can find longer matches
quickly, so even for the "hundred tiny fields, half nulled" case there is
now no CPU overhead, and there is no significant effect on any other case.

Please find the updated patch attached with this mail; the new data is below:

Non-Default settings
---------------------------------
autovacuum = off
checkpoint_segments = 256
checkpoint_timeout = 15min

Unpatched

 testname                                       | wal_generated | duration
------------------------------------------------+---------------+------------------
 one short and one long field, no change        |    1055025424 | 14.3506939411163
 one short and one long field, no change        |    1056580160 | 18.1261160373688
 one short and one long field, no change        |    1054914792 | 15.104973077774
 hundred tiny fields, all changed               |     636948992 | 16.3172590732574
 hundred tiny fields, all changed               |     633943680 | 16.308168888092
 hundred tiny fields, all changed               |     636516776 | 16.4316298961639
 hundred tiny fields, half changed              |     633948288 | 16.5795118808746
 hundred tiny fields, half changed              |     636068648 | 16.2913551330566
 hundred tiny fields, half changed              |     635848432 | 15.9602961540222
 hundred tiny fields, half nulled               |     569758744 | 15.9501180648804
 hundred tiny fields, half nulled               |     569760112 | 15.9422838687897
 hundred tiny fields, half nulled               |     570609712 | 16.5659689903259
 nine short and one long field, thirty % change |     698908824 | 12.7938749790192
 nine short and one long field, thirty % change |     698905400 | 12.0160901546478
 nine short and one long field, thirty % change |     698909720 | 12.2999179363251

After pgrb_delta_encoding_v8.patch
----------------------------------------------------------

 testname                                       | wal_generated | duration
------------------------------------------------+---------------+------------------
 one short and one long field, no change        |     680203392 | 12.4820687770844
 one short and one long field, no change        |     677340120 | 11.8634090423584
 one short and one long field, no change        |     677333288 | 11.9269840717316
 hundred tiny fields, all changed               |     633950264 | 16.7694170475006
 hundred tiny fields, all changed               |     635496520 | 16.9294109344482
 hundred tiny fields, all changed               |     633942832 | 18.0690770149231
 hundred tiny fields, half changed              |     633948024 | 17.0814690589905
 hundred tiny fields, half changed              |     633947488 | 17.0073189735413
 hundred tiny fields, half changed              |     633949224 | 17.0454230308533
 hundred tiny fields, half nulled               |     499950184 | 16.3303508758545
 hundred tiny fields, half nulled               |     499952888 | 15.7197980880737
 hundred tiny fields, half nulled               |     499958120 | 15.7198679447174
 nine short and one long field, thirty % change |     559831384 | 12.0672481060028
 nine short and one long field, thirty % change |     559829472 | 11.8555760383606
 nine short and one long field, thirty % change |     559832760 | 11.9470820426941

The observations are almost the same as before, except for the
"hundred tiny fields, half nulled" case, which I have updated below:

>> Observations
>> --------------------
>> 1. With both the patches WAL reduction is similar, i.e. ~37% for
>> "one short and one long field, no change" and 12% for
>> "hundred tiny fields, half nulled"
>> 2. With pgrb_delta_encoding_v7, there is ~19% CPU reduction for best
>> case "one short and one long field, no change".
>> 3. With pgrb_delta_encoding_v7, there is approximately 8~9% overhead
>> for cases where there is no match
>> 4. With pgrb_delta_encoding_v7, there is approximately 15~18% overhead
>> for "hundred tiny fields, half nulled" case

Now there is approximately 1.4~5% CPU gain for the
"hundred tiny fields, half nulled" case.

>> 5. With back-to-pglz-like-delta-encoding-2, the data is mostly similar except
>> for "hundred tiny fields, half nulled" where CPU overhead is much more.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
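
The percentages in the observations can be rechecked against the two result tables in this mail with a few lines of arithmetic. The following is a minimal sketch, assuming the three runs per test are simply averaged and compared; the mail does not say how its percentages were derived, so the method and the names `UNPATCHED`, `PATCHED` and `summarize` are illustrative assumptions. The figures are copied from the tables.

```python
from statistics import mean

# (wal_generated, duration) per run, copied from the tables in this mail.
UNPATCHED = {
    "one short and one long field, no change": [
        (1055025424, 14.3506939411163), (1056580160, 18.1261160373688),
        (1054914792, 15.104973077774)],
    "hundred tiny fields, all changed": [
        (636948992, 16.3172590732574), (633943680, 16.308168888092),
        (636516776, 16.4316298961639)],
    "hundred tiny fields, half changed": [
        (633948288, 16.5795118808746), (636068648, 16.2913551330566),
        (635848432, 15.9602961540222)],
    "hundred tiny fields, half nulled": [
        (569758744, 15.9501180648804), (569760112, 15.9422838687897),
        (570609712, 16.5659689903259)],
    "nine short and one long field, thirty % change": [
        (698908824, 12.7938749790192), (698905400, 12.0160901546478),
        (698909720, 12.2999179363251)],
}

PATCHED = {  # after pgrb_delta_encoding_v8.patch
    "one short and one long field, no change": [
        (680203392, 12.4820687770844), (677340120, 11.8634090423584),
        (677333288, 11.9269840717316)],
    "hundred tiny fields, all changed": [
        (633950264, 16.7694170475006), (635496520, 16.9294109344482),
        (633942832, 18.0690770149231)],
    "hundred tiny fields, half changed": [
        (633948024, 17.0814690589905), (633947488, 17.0073189735413),
        (633949224, 17.0454230308533)],
    "hundred tiny fields, half nulled": [
        (499950184, 16.3303508758545), (499952888, 15.7197980880737),
        (499958120, 15.7198679447174)],
    "nine short and one long field, thirty % change": [
        (559831384, 12.0672481060028), (559829472, 11.8555760383606),
        (559832760, 11.9470820426941)],
}


def summarize(unpatched: dict, patched: dict) -> dict:
    """Return {testname: (wal_change_pct, duration_change_pct)}.

    Both values are the change of the patched mean relative to the
    unpatched mean, so a negative number is a reduction.
    """
    out = {}
    for name, runs in unpatched.items():
        wal_u = mean(w for w, _ in runs)
        dur_u = mean(d for _, d in runs)
        wal_p = mean(w for w, _ in patched[name])
        dur_p = mean(d for _, d in patched[name])
        out[name] = (100.0 * (wal_p - wal_u) / wal_u,
                     100.0 * (dur_p - dur_u) / dur_u)
    return out


if __name__ == "__main__":
    for name, (wal_pct, dur_pct) in summarize(UNPATCHED, PATCHED).items():
        print(f"{name:48s} WAL {wal_pct:+6.1f}%   duration {dur_pct:+6.1f}%")
```

Averaging three runs hides run-to-run noise (the unpatched "no change" durations alone range from about 14.4 to 18.1), so small duration differences should be read with care.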