Re: [REVIEW] Re: Compression of full-page-writes - Mailing list pgsql-hackers

From Arthur Silva
Subject Re: [REVIEW] Re: Compression of full-page-writes
Date
Msg-id CAO_YK0VJYH-bc616y=O5S9FnbZ8vCDpkzAUeL+tRzopvxWVMGQ@mail.gmail.com
In response to Re: [REVIEW] Re: Compression of full-page-writes  (Rahila Syed <rahilasyed90@gmail.com>)
List pgsql-hackers
On Wed, Dec 10, 2014 at 12:10 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:
>What I would suggest is instrument the backend with getrusage() at
>startup and shutdown and have it print the difference in user time and
>system time.  Then, run tests for a fixed number of transactions and
>see how the total CPU usage for the run differs.

Following are the numbers obtained in tests with absolute CPU usage, a fixed number of transactions, and a longer duration, using the latest FPW compression patch.

pgbench command : pgbench  -r -t 250000 -M prepared 

To ensure that the data is not highly compressible, the empty filler columns were altered using:

alter table pgbench_accounts alter column filler type text using gen_random_uuid()::text;

checkpoint_segments = 1024              
checkpoint_timeout =  5min 
fsync = on

The tests ran for around 30 minutes. A manual checkpoint was run before each test.

Compression  WAL generated  %compression  Latency avg  CPU usage (user / system)  TPS     Latency stddev
on           1531.4 MB      ~35%          7.351 ms     562.67 s / 41.40 s         135.96  13.759 ms
off          2373.1 MB      --            6.781 ms     354.20 s / 39.67 s         147.40  14.152 ms

The compression obtained is quite high, close to 35%.
User-level CPU usage with compression on is noticeably higher than with compression off, but the reduction in WAL volume is also substantial.

Server specifications:
Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) x 2
RAM: 32 GB
Disk: 8 x 450 GB SAS HDD (10,000 rpm, 2.5-inch, 6 Gb/s, hot-plug)



Thank you,

Rahila Syed





On Fri, Dec 5, 2014 at 10:38 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Dec 5, 2014 at 1:49 AM, Rahila Syed <rahilasyed.90@gmail.com> wrote:
>>If that's really true, we could consider having no configuration any
>>time, and just compressing always.  But I'm skeptical that it's
>>actually true.
>
> I was referring to this for CPU utilization:
> http://www.postgresql.org/message-id/1410414381339-5818552.post@n5.nabble.com
>
> The above tests were performed on machine with configuration as follows
> Server specifications:
> Processors:Intel® Xeon ® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos
> RAM: 32GB
> Disk : HDD      450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
> 1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm

I think that measurement methodology is not very good for assessing
the CPU overhead, because you are only measuring the percentage CPU
utilization, not the absolute amount of CPU utilization.  It's not
clear whether the duration of the tests was the same for all the
configurations you tried - in which case the number of transactions
might have been different - or whether the number of operations was
exactly the same - in which case the runtime might have been
different.  Either way, it could obscure an actual difference in
absolute CPU usage per transaction.  It's unlikely that both the
runtime and the number of transactions were identical for all of your
tests, because that would imply that the patch makes no difference to
performance; if that were true, you wouldn't have bothered writing
it....

What I would suggest is instrument the backend with getrusage() at
startup and shutdown and have it print the difference in user time and
system time.  Then, run tests for a fixed number of transactions and
see how the total CPU usage for the run differs.

Last cycle, Amit Kapila did a bunch of work trying to compress the WAL
footprint for updates, and we found that compression was pretty darn
expensive there in terms of CPU time.  So I am suspicious of the
finding that it is free here.  It's not impossible that there's some
effect which causes us to recoup more CPU time than we spend
compressing in this case that did not apply in that case, but the
projects are awfully similar, so I tend to doubt it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


This can be improved in the future by using other algorithms.
