Re: postgresql latency & bgwriter not doing its job - Mailing list pgsql-hackers

From Andres Freund
Subject Re: postgresql latency & bgwriter not doing its job
Date
Msg-id 20140827102843.GE21544@awork2.anarazel.de
Whole thread Raw
In response to Re: postgresql latency & bgwriter not doing its job  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: postgresql latency & bgwriter not doing its job
List pgsql-hackers
On 2014-08-27 11:19:22 +0200, Andres Freund wrote:
> On 2014-08-27 11:14:46 +0200, Andres Freund wrote:
> > On 2014-08-27 11:05:52 +0200, Fabien COELHO wrote:
> > > I can test a couple of patches. I already did one on someone advice (make
> > > bgwriter round all stuff in 1s instead of 120s, without positive effect.
> > 
> > I've quickly cobbled together the attached patch (which at least doesn't
> > seem to crash & burn). It tries to trigger pages being flushed out
> > during the paced phase of checkpoints instead of the fsync phase. The
> > sync_on_checkpoint_flush can be used to enable/disable that behaviour.
> > 
> > I'd be interested to hear whether that improves your latency numbers. I
> > unfortunately don't have more time to spend on this right now :(.
> 
> And actually attached. Note that it's linux only...

I got curious and ran a quick test:

config:
log_checkpoints=on
checkpoint_timeout=1min
checkpoint_completion_target=0.95
checkpoint_segments=100
synchronous_commit=on
fsync=on
huge_pages=on
max_connections=200
shared_buffers=6GB
wal_level=hot_standby

off:

$ pgbench -p 5440 -h /tmp postgres -M prepared -c 16 -j16 -T 120 -R 180 -L 200
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 150
query mode: prepared
number of clients: 16
number of threads: 16
duration: 120 s
number of transactions actually processed: 20189
latency average: 23.136 ms
latency stddev: 59.044 ms
rate limit schedule lag: avg 4.599 (max 199.975) ms
number of skipped transactions: 1345 (6.246 %)
tps = 167.664290 (including connections establishing)
tps = 167.675679 (excluding connections establishing)

LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 12754 buffers (1.6%); 0 transaction log file(s) added, 0 removed, 2 recycled;
write=56.928s, sync=3.639 s, total=60.749 s; sync files=20, longest=2.741 s, average=0.181 s
 
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 12269 buffers (1.6%); 0 transaction log file(s) added, 0 removed, 6 recycled;
write=20.701s, sync=8.568 s, total=29.444 s; sync files=10, longest=3.568 s, average=0.856 s
 


on:

$ pgbench -p 5440 -h /tmp postgres -M prepared -c 16 -j16 -T 120 -R 180 -L 200
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 150
query mode: prepared
number of clients: 16
number of threads: 16
duration: 120 s
number of transactions actually processed: 21327
latency average: 20.735 ms
latency stddev: 14.643 ms
rate limit schedule lag: avg 4.965 (max 185.003) ms
number of skipped transactions: 1 (0.005 %)
tps = 177.214391 (including connections establishing)
tps = 177.225476 (excluding connections establishing)

LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 12217 buffers (1.6%); 0 transaction log file(s) added, 0 removed, 1 recycled;
write=57.022s, sync=0.203 s, total=57.377 s; sync files=19, longest=0.033 s, average=0.010 s
 
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 13185 buffers (1.7%); 0 transaction log file(s) added, 0 removed, 6 recycled;
write=56.628s, sync=0.019 s, total=56.803 s; sync files=11, longest=0.017 s, average=0.001 s
 


That machine is far from idle right now, so the noise is pretty
high. But rather nice initial results.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: pgbench throttling latency limit
Next
From: Magnus Hagander
Date:
Subject: Re: re-reading SSL certificates during server reload