Re: Performance Improvement by reducing WAL for Update Operation - Mailing list pgsql-hackers

From: Amit Kapila
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Msg-id: CAA4eK1JeUbY16uwrDA2TaBkk+rLRL3Giyyqy1mVh_6CThmDR8w@mail.gmail.com
In response to: Re: Performance Improvement by reducing WAL for Update Operation (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On Fri, Nov 29, 2013 at 3:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Nov 27, 2013 at 9:31 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Sure, but to explore (a), the scope is bit bigger. We have below
>> options to explore (a):
>> 1. try to optimize existing algorithm as used in patch, which we have
>> tried but ofcourse we can spend some more time to see if anything more
>> can be tried out.
>> 2. try fingerprint technique as suggested by you above.
>> 3. try some other standard methods like vcdiff, lz4 etc.
>
> Well, obviously, I'm hot on idea #2 and think that would be worth
> spending some time on.  If we can optimize the algorithm used in the
> patch some more (option #1), that would be fine, too, but the code
> looks pretty tight to me, so I'm not sure how successful that's likely
> to be.  But if you have an idea, sure.

I have been experimenting with chunkwise delta encoding (using a technique
similar to the Rabin fingerprint method) for the last few days, and here are
the results of my investigation.

Performance Data
----------------------------
Non-default settings:
autovacuum = off
checkpoint_segments = 128
checkpoint_timeout = 10min

(wal_generated is in bytes; duration is in seconds)

unpatched

                testname                 | wal_generated |     duration
-----------------------------------------+---------------+------------------
 one short and one long field, no change |    1054921328 | 25.5855557918549
 hundred tiny fields, all changed        |     634483328 | 20.8992719650269
 hundred tiny fields, half changed       |     635948640 | 19.8670389652252
 hundred tiny fields, half nulled        |     571388552 | 18.9413228034973

lz-delta-encoding

                testname                 | wal_generated |     duration
-----------------------------------------+---------------+------------------
 one short and one long field, no change |     662984384 | 21.7335519790649
 hundred tiny fields, all changed        |     633944320 | 24.1207830905914
 hundred tiny fields, half changed       |     633944344 | 24.4657719135284
 hundred tiny fields, half nulled        |     492200208 | 22.0337791442871

rabin-delta-encoding

                testname                 | wal_generated |     duration
-----------------------------------------+---------------+------------------
 one short and one long field, no change |     662235752 | 20.1823079586029
 hundred tiny fields, all changed        |     633950080 | 22.0473308563232
 hundred tiny fields, half changed       |     633950880 | 21.8351459503174
 hundred tiny fields, half nulled        |     508943072 | 20.9554698467255

Results Summary
-------------------------------------
1. With the chunkwise approach, WAL reduction is almost the same as with LZ,
barring the half-nulled case, which can be improved.
2. With the chunkwise approach, the CPU overhead is reduced to about 50% of
the LZ approach's in most cases where there is little or no compression;
there is still a 5~10% overhead (relative to unpatched) for cases where the
data is not compressible. I think there will certainly be a small overhead
from forming the hash table and scanning it only to conclude that the data
is non-compressible.
3. I have not run the other tests, which will anyway return from the top of
the encoding function because the tuple length is less than 32.

Main reasons for the improvement
---------------------------------------------
1. Fewer hash entries for the old tuple, and fewer calculations while
compressing the new tuple.
2. The memset of the hash-table data structure covers a smaller size.
3. Nothing is copied into the output buffer until a match is found.
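To make the chunkwise idea concrete for anybody following along, below is a
minimal standalone C sketch of content-defined chunking plus delta encoding.
This is an illustration of the concept only, not code from the attached
patch: all names are hypothetical, the multiplicative hash is a stand-in for
a real sliding-window Rabin fingerprint, the copy/literal instruction sizes
are assumed, and the patch looks chunks up in a hash table where this sketch
just does a linear scan.

/*
 * Minimal sketch of chunkwise delta encoding with content-defined chunk
 * boundaries.  Illustration only -- not code from the attached patch; the
 * hash, mask, and instruction sizes are all hypothetical, and a real Rabin
 * fingerprint would use a sliding window instead of this cumulative hash.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CHUNK_MASK 0x0F         /* boundary when low 4 bits are zero: ~16-byte chunks */
#define MAX_CHUNKS 128          /* plenty for this sketch's small inputs */

typedef struct
{
    uint32_t    hash;           /* fingerprint of the chunk's bytes */
    int         off;            /* chunk start offset within its tuple */
    int         len;
} Chunk;

/*
 * Split data into content-defined chunks: the hash accumulates over the
 * current chunk and a boundary is declared when its low bits are zero, so
 * identical regions tend to chunk identically even if earlier data shifts.
 * Production code would also enforce a minimum chunk size.
 */
static int
split_chunks(const unsigned char *data, int len, Chunk *out)
{
    int         nchunks = 0;
    int         start = 0;
    uint32_t    h = 0;

    for (int i = 0; i < len && nchunks < MAX_CHUNKS; i++)
    {
        h = h * 131 + data[i];
        if ((h & CHUNK_MASK) == 0 || i == len - 1)
        {
            out[nchunks].hash = h;
            out[nchunks].off = start;
            out[nchunks].len = i - start + 1;
            nchunks++;
            start = i + 1;
            h = 0;
        }
    }
    return nchunks;
}

/*
 * Delta-encode the new tuple against the old one: chunks of the new tuple
 * that also occur in the old tuple become copy instructions; everything
 * else is emitted as literal bytes.
 */
static int
encode_delta(const unsigned char *oldd, int oldlen,
             const unsigned char *newd, int newlen)
{
    Chunk       oldc[MAX_CHUNKS];
    Chunk       newc[MAX_CHUNKS];
    int         nold = split_chunks(oldd, oldlen, oldc);
    int         nnew = split_chunks(newd, newlen, newc);
    int         outbytes = 0;

    for (int i = 0; i < nnew; i++)
    {
        int         matched = 0;

        for (int j = 0; j < nold && !matched; j++)
        {
            /* confirm with memcmp so a hash collision cannot corrupt data */
            if (oldc[j].hash == newc[i].hash && oldc[j].len == newc[i].len &&
                memcmp(oldd + oldc[j].off, newd + newc[i].off, newc[i].len) == 0)
            {
                printf("COPY    off=%-3d len=%d\n", oldc[j].off, oldc[j].len);
                outbytes += 4;  /* assume a 4-byte copy instruction */
                matched = 1;
            }
        }
        if (!matched)
        {
            printf("LITERAL len=%-3d \"%.*s\"\n", newc[i].len,
                   newc[i].len, (const char *) newd + newc[i].off);
            outbytes += 1 + newc[i].len;    /* assume a 1-byte literal tag */
        }
    }
    return outbytes;
}

int
main(void)
{
    const char *o = "one short field and one long field that stays the same in both tuples";
    const char *n = "one SHORT field and one long field that stays the same in both tuples";
    int         sz = encode_delta((const unsigned char *) o, (int) strlen(o),
                                  (const unsigned char *) n, (int) strlen(n));

    printf("raw new tuple: %d bytes, delta: %d bytes\n", (int) strlen(n), sz);
    return 0;
}

It compiles with any C99 compiler and prints the copy/literal instructions
chosen for the new tuple, followed by the raw vs. encoded sizes.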
Further Actions
------------------------
1. Need to decide if this reduced CPU overhead is acceptable, or whether we
need an enable/disable flag at the table level.
2. We can do further micro-optimisations in the chunkwise approach, such as
improving the hash function.
3. Some code improvements are pending, e.g. for cases where the data to be
compressed is non-contiguous.

Attached files
---------------------
1. pgrb_delta_encoding_v1 - In heaptuple.c, there is a parameter
rabin_fingerprint_comp; set it to true for chunkwise delta encoding and to
false for LZ encoding. By default it is true. I wanted to provide a better
way to enable both modes and tried to, but ended up with this approach.
2. wal-update-testsuite.sh - test script developed by Heikki to test this
patch.

Note -
a. The performance data was taken on my laptop; it needs to be tested on a
better machine.
b. The attached patch is just a prototype of the chunkwise concept; the code
needs to be improved, and decode handling/testing is pending.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com