Re: checkpointer continuous flushing - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: checkpointer continuous flushing
Date
Msg-id alpine.DEB.2.10.1506220932480.23011@sto
In response to Re: checkpointer continuous flushing  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
> It'd be interesting to see numbers for tiny, without the overly small
> checkpoint timeout value. 30s is below the OS's writeback time.

Here are some tests with a longer timeout:

tiny2: scale=10 shared_buffers=1GB checkpoint_timeout=5min        max_wal_size=1GB warmup=600 time=4000
 flsh |      full speed tps      | percent of late tx, 4 clients, for tps:
 /srt |  1 client  |  4 clients  |  100 |  200 |  400 |  800 | 1200 | 1600
 N/N  | 930 +- 124 | 2560 +- 394 | 0.70 | 1.03 | 1.27 | 1.56 | 2.02 | 2.38
 N/Y  | 924 +- 122 | 2612 +- 326 | 0.63 | 0.79 | 0.94 | 1.15 | 1.45 | 1.67
 Y/N  | 907 +- 112 | 2590 +- 315 | 0.58 | 0.83 | 0.68 | 0.71 | 0.81 | 1.26
 Y/Y  | 915 +- 114 | 2590 +- 317 | 0.60 | 0.68 | 0.70 | 0.78 | 0.88 | 1.13
 

There seems to be a small 1-2% performance benefit with 4 clients, which is 
reversed for 1 client. There are significantly and consistently fewer late 
transactions when the options are activated, and performance is more stable
(standard deviation reduced by 10-18%).

The db is about 200 MB (~25000 pages); at 2500+ tps it is written over 40 
times in 5 minutes, so the checkpoint basically writes everything over 220 
seconds, at about 0.9 MB/s. Given the preload phase, the buffers may be more 
or less in order in memory, so they would be written out in order.
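
For reference, the "sort" option makes this ordering systematic: the checkpointer 
collects the to-be-written buffers and orders them by file and block before 
writing, so each relation file is written sequentially rather than in 
shared-buffer order. A minimal sketch of the idea, with simplified stand-in 
structures and function names rather than the actual patch code:

/*
 * Illustrative sketch only: order to-be-written buffers by
 * (tablespace, relation, fork, block) so checkpoint writes hit each
 * file sequentially.  Simplified stand-in types, not PostgreSQL code.
 */
#include <stdlib.h>
#include <stdint.h>

typedef struct SortItem
{
    uint32_t    tablespace;     /* tablespace oid */
    uint32_t    relfile;        /* relation file node */
    int         fork;           /* main/fsm/vm fork number */
    uint32_t    block;          /* block within the fork */
    int         buf_id;         /* index of the buffer to write */
} SortItem;

static int
buforder_cmp(const void *pa, const void *pb)
{
    const SortItem *a = pa;
    const SortItem *b = pb;

    if (a->tablespace != b->tablespace)
        return a->tablespace < b->tablespace ? -1 : 1;
    if (a->relfile != b->relfile)
        return a->relfile < b->relfile ? -1 : 1;
    if (a->fork != b->fork)
        return a->fork < b->fork ? -1 : 1;
    if (a->block != b->block)
        return a->block < b->block ? -1 : 1;
    return 0;
}

/* Sort once at the start of the checkpoint, then write in this order. */
static void
sort_checkpoint_buffers(SortItem *items, size_t n)
{
    qsort(items, n, sizeof(SortItem), buforder_cmp);
}

The sort itself is cheap compared to the writes, and it turns mostly random 
I/O into mostly sequential I/O, which is presumably where the large tps gain 
on the medium run below comes from.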


medium2: scale=300 shared_buffers=5GB checkpoint_timeout=30min         max_wal_size=4GB warmup=1200 time=7500
 flsh |      full speed tps       | percent of late tx, 4 clients
 /srt |  1 client   |  4 clients  |   100 |   200 |   400 |
  N/N | 173 +- 289* | 198 +- 531* | 27.61 | 43.92 | 61.16 |
  N/Y | 458 +- 327* | 743 +- 920* |  7.05 | 14.24 | 24.07 |
  Y/N | 169 +- 166* | 187 +- 302* |  4.01 | 39.84 | 65.70 |
  Y/Y | 546 +- 143  | 681 +- 459  |  1.55 |  3.51 |  2.84 |
 

The effect of sorting is very positive (+150% to +270% tps). On this run, 
flushing has a positive (+20% with 1 client) or negative (-8% with 4 
clients) effect on throughput, and late transactions are reduced by 92-95% 
when both options are activated.
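
The "flush" option is the other half: after writing out a range of blocks of 
a file, the checkpointer asks the kernel to start writing them back 
immediately, instead of letting dirty pages accumulate in the page cache 
until the final fsync. A minimal Linux-only sketch of that idea with 
sync_file_range (illustrative only, not the patch itself; function name and 
error handling are simplified):

/*
 * Illustrative sketch only: hint the kernel to begin asynchronous
 * writeback of a just-written block range, so dirty data trickles out
 * continuously instead of being dumped at fsync time.  Linux-specific.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>

#define BLCKSZ 8192

static void
flush_written_range(int fd, unsigned int first_block, unsigned int nblocks)
{
    off_t   offset = (off_t) first_block * BLCKSZ;
    off_t   nbytes = (off_t) nblocks * BLCKSZ;

    /* SYNC_FILE_RANGE_WRITE starts writeback without waiting for it. */
    (void) sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
}

Spreading the writeback like this is meant to reduce the long I/O stalls that 
show up as late transactions in the tables above.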

At 550 tps, checkpoints are xlog-triggered and write about 1/3 of the 
database (170000 buffers to write every 220-260 seconds, 4 MB/s).

-- 
Fabien.


