Re: postgresql latency & bgwriter not doing its job - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: postgresql latency & bgwriter not doing its job
Date
Msg-id alpine.DEB.2.10.1408260733240.4394@sto
In response to Re: postgresql latency & bgwriter not doing its job  (Josh Berkus <josh@agliodbs.com>)
Responses Re: postgresql latency & bgwriter not doing its job
List pgsql-hackers
Hello Josh,

> So I think that you're confusing the roles of bgwriter vs. spread
> checkpoint.  What you're experiencing above is pretty common for
> nonspread checkpoints on slow storage (and RAID5 is slow for DB updates,
> no matter how fast the disks are), or for attempts to do spread
> checkpoint on filesystems which don't support it (e.g. Ext3, HFS+).  In
> either case, what's happening is that the *OS* is freezing all logical
> and physical IO while it works to write out all of RAM, which makes me
> suspect you're using Ext3 or HFS+.

I'm using ext4 on Debian wheezy with PostgreSQL 9.4b2.

I agree that the OS may be able to help, but this aspect does not
currently work for me at all out of the box. The "all of RAM" is really a
few thousand 8 kB pages written randomly, i.e. a few dozen MB.
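
To make the order of magnitude concrete, here is a back-of-the-envelope
sketch. The page counts are illustrative assumptions, not figures measured
on my run; only the 8 kB page size comes from the setup above.

```python
PAGE_KB = 8  # page size in kB, as stated above


def pages_to_mib(pages: int, page_kb: int = PAGE_KB) -> float:
    """Return the volume of `pages` buffer pages in MiB."""
    return pages * page_kb / 1024


# Illustrative page counts only (assumed, not measured).
for pages in (2000, 3000, 5000):
    print(f"{pages} pages -> {pages_to_mib(pages):.1f} MiB")
```

Even several thousand pages stay in the tens of MB, far from "all of RAM".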

Also, if pg needs advanced OS tweaking to handle a small load, ISTM that 
it fails at simplicity:-(

As for checkpoint spreading, raising checkpoint_completion_target to 0.9
degrades the situation (20% of transactions are more than 200 ms late
instead of 10%, and bgwriter wrote less than 1 page per second over a
500 s run). So maybe there is a bug here somewhere.
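
For reference, a rough sketch of the write rate that a spread checkpoint
would need to sustain. It assumes checkpoints are triggered by
checkpoint_timeout (300 s, the documented default) rather than by WAL
volume, and it ignores the final fsync phase; the function name and the
dirty page count are hypothetical.

```python
def spread_write_rate(dirty_pages: int,
                      checkpoint_timeout_s: float = 300.0,
                      completion_target: float = 0.9) -> float:
    """Pages per second needed to write `dirty_pages` within the
    fraction of the checkpoint interval given by `completion_target`."""
    if not 0.0 < completion_target <= 1.0:
        raise ValueError("completion_target must be in (0, 1]")
    return dirty_pages / (checkpoint_timeout_s * completion_target)


# Hypothetical load: 3000 dirty pages per checkpoint (assumed value).
for target in (0.5, 0.9):
    rate = spread_write_rate(3000, completion_target=target)
    print(f"target {target}: {rate:.1f} pages/s")
```

Under these assumptions the required rate is only a handful of pages per
second, which a single disk should absorb without long stalls.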

> Making the bgwriter more aggressive adds a significant risk of writing
> the same pages multiple times between checkpoints, so it's not a simple fix.

Hmmm... This must be balanced with the risk of being offline. Not all 
people are interested in throughput at the price of latency, so there 
could be settings that help latency, even at the price of reducing 
throughput (average tps). After that, it is the administrator's choice to
set pg for higher throughput or lower latency.
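
As an illustration of the headroom the existing knobs already give, the
sketch below computes the ceiling on bgwriter output implied by
bgwriter_delay and bgwriter_lru_maxpages. It assumes the documented
defaults (200 ms and 100 pages per round) and that every round writes the
maximum, which is an upper bound and not what the bgwriter actually does;
the function name is hypothetical.

```python
def bgwriter_max_pages_per_s(delay_ms: float = 200.0,
                             lru_maxpages: int = 100) -> float:
    """Upper bound on pages written per second: at most `lru_maxpages`
    pages per round, one round every `delay_ms` milliseconds."""
    if delay_ms <= 0:
        raise ValueError("delay_ms must be positive")
    return lru_maxpages * 1000.0 / delay_ms


print(bgwriter_max_pages_per_s())          # ceiling with assumed defaults
print(bgwriter_max_pages_per_s(100, 200))  # a more aggressive setting
```

The ceiling is orders of magnitude above the "less than 1 page per second"
I observe, so the limit itself does not look like the constraint.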

Note that writing some "least recently used" page multiple times does not
seem to be an issue at all for me under small/medium load, especially as
the system has nothing else to do: if you have nothing else to do, there
is no cost in writing a page, even if you may have to write it again some
time later, and it helps prevent dirty pages from accumulating. So it
seems to me that pg can help; it is not only/merely an OS issue.

-- 
Fabien.


