Re: checkpointer continuous flushing - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: checkpointer continuous flushing
Date
Msg-id CAA4eK1Jmk34XPQmXxTrUDQ46CByiG8se=dtMEabK7E7k1rxPFA@mail.gmail.com
In response to Re: checkpointer continuous flushing  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses Re: checkpointer continuous flushing  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
On Mon, Aug 24, 2015 at 12:45 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>
>
> Also check the file:
>
>   sh> file ./avg.py
>   ./avg.py: Python script, UTF-8 Unicode text executable
>

There were some CRLF line terminators; after removing those, it worked
fine. Here are the results of some of the tests done for the sorting patch
(checkpoint-continuous-flush-10-a):

Config Used
----------------------
M/c details
--------------------
IBM POWER-8 24 cores, 192 hardware threads
RAM = 492GB


Test details
------------------
warmup=60
scale=300
max_connections=150
shared_buffers=8GB
checkpoint_timeout=2min
time=7200
synchronous_commit=on
max_wal_size=5GB

parallelism - 128 clients, 128 threads

Sort - off
avg over 7200: 8256.382528 ± 6218.769282 [0.000000, 76.050000, 10975.500000, 13105.950000, 21729.000000]
percent of values below 10.0: 19.5%

Sort - on
avg over 7200: 8375.930639 ± 6148.747366 [0.000000, 84.000000, 10946.000000, 13084.000000, 20289.900000]
percent of values below 10.0: 18.6%

Before drawing conclusions, let me try to explain the above data (I am
explaining it again even though Fabien already has, in case someone
has not read his mail).

Let's go through the data for the sort - off run:

avg over 7200: 8256.382528 ± 6218.769282

8256.382528 - average tps for 7200s pgbench run 
6218.769282 - standard deviation on per second figures

[0.000000, 76.050000, 10975.500000, 13105.950000, 21729.000000]

These 5 values can be read as minimum TPS, q1, median TPS, q3,
maximum TPS over the 7200s pgbench run.  As far as I understand, q1
and q3 are the first and third quartiles (the medians of the lower and
upper halves of the values), which I didn't focus on much.

percent of values below 10.0: 19.5%

The above is the percentage of time during which the result is below 10 tps.
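
For readers who want to reproduce these figures from their own per-second
TPS values, here is a minimal sketch of how such a summary could be computed.
It is not Fabien's avg.py; the function name `summarize`, the use of the
population standard deviation, and the "inclusive" quartile method are all
assumptions made for illustration, so the exact quartile values may differ
slightly from what his script reports.

```python
import statistics


def summarize(tps_values, threshold=10.0):
    """Summarize per-second TPS samples (hypothetical helper, not avg.py).

    Returns the mean, the standard deviation, the five-number summary
    [min, q1, median, q3, max], and the percentage of samples below
    `threshold`.
    """
    data = sorted(tps_values)
    # Assumption: quartiles via linear interpolation over the sample itself.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    below = sum(1 for v in data if v < threshold)
    return {
        "mean": statistics.fmean(data),
        # Assumption: population standard deviation of the per-second figures.
        "stddev": statistics.pstdev(data),
        "five_number": [data[0], q1, median, q3, data[-1]],
        "percent_below": 100.0 * below / len(data),
    }


# Tiny made-up sample, only to show the shape of the result.
print(summarize([0, 5, 10, 15, 20]))
```

With a 7200s run, `tps_values` would hold 7200 entries, one per second of
the pgbench progress output.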

Now about the test results: these tests are pgbench full-speed runs,
and the above results indicate approximately 1.5% improvement in avg. TPS
and about 1 percentage point less time spent below 10 tps (19.5% vs 18.6%)
with sorting on. There is almost no improvement in the median or maximum
TPS values; instead they are slightly lower when sorting is on, which
could be due to run-to-run variation.
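
The comparison can be checked with a few lines of arithmetic. This is only
a sketch using the figures reported in this mail; the helper name
`relative_change` is made up for illustration.

```python
def relative_change(before, after):
    """Percentage change from `before` to `after` (hypothetical helper)."""
    return (after - before) / before * 100.0


# Figures from the sort - off and sort - on runs in this mail.
avg_gain = relative_change(8256.382528, 8375.930639)
median_change = relative_change(10975.5, 10946.0)
max_change = relative_change(21729.0, 20289.9)
# Difference in percentage points of time spent below 10 tps.
below_10_points = 19.5 - 18.6

print(f"avg TPS change:    {avg_gain:+.2f}%")
print(f"median TPS change: {median_change:+.2f}%")
print(f"max TPS change:    {max_change:+.2f}%")
print(f"below-10 change:   {below_10_points:.1f} points less")
```

Given the standard deviation of roughly 6200 tps on the per-second figures,
differences of this size are small.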

I have done more tests as well by varying time and number of clients
keeping other configuration same as above, but the results are quite
similar.

The results of the sorting patch for the tests done indicate that the win
is not big enough with sorting alone during checkpoints, so we should
consider the flush patch along with sorting.  I would like to perform some
tests with both patches together (sort + flush), unless somebody else
thinks that the sorting patch alone is beneficial and we should test some
other kinds of scenarios to see its benefit.

>
> The reason for the tablespace balancing is that in the current postgres buffers are written more or less randomly, so it is (probably) implicitly and statistically balanced over tablespaces because of this randomness, and indeed, AFAIK, people with multi tablespace setup have not complained that postgres was using the disks sequentially.
>
> However, once the buffers are sorted per file, the order becomes deterministic and there is no more implicit balancing, which means that if someone has a pg setup with several disks it will write sequentially on these instead of in parallel.
>

What if the tablespaces are not on separate disks, or there is not enough
hardware support to make the writes parallel?  I think for such cases it
might be better to do it sequentially.
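
To make the two write orders under discussion concrete, here is a small
sketch. It is not the patch's implementation: the (tablespace, file, block)
tuples, the function names, and the plain round-robin interleaving are all
assumptions chosen only to show the difference between the orders.

```python
from collections import defaultdict
from itertools import zip_longest


def sequential_order(buffers):
    """Fully sorted order: one tablespace is written after the other."""
    return sorted(buffers)


def interleaved_order(buffers):
    """Sorted within each tablespace, round-robin across tablespaces."""
    per_tablespace = defaultdict(list)
    for buf in sorted(buffers):
        per_tablespace[buf[0]].append(buf)
    ordered = []
    # Take one buffer from each tablespace in turn until all are consumed.
    for group in zip_longest(*per_tablespace.values()):
        ordered.extend(buf for buf in group if buf is not None)
    return ordered


# Hypothetical dirty buffers: (tablespace, file, block number).
dirty = [("ts2", "f1", 0), ("ts1", "f1", 1), ("ts1", "f1", 0), ("ts2", "f1", 1)]
print(sequential_order(dirty))
print(interleaved_order(dirty))
```

If both tablespaces sit on the same disk, the interleaved order makes the
head jump between files, which is the case I am asking about.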

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
