Re: postgresql latency & bgwriter not doing its job - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: postgresql latency & bgwriter not doing its job
Msg-id alpine.DEB.2.10.1408271035050.8876@sto
In response to Re: postgresql latency & bgwriter not doing its job  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: postgresql latency & bgwriter not doing its job
Re: postgresql latency & bgwriter not doing its job
List pgsql-hackers
> [...] What's your evidence the pacing doesn't work? Afaik it's the fsync
> that causes the problem, not the writes themselves.

Hmmm. My (poor) understanding is that fsync would work fine if everything
was already written beforehand:-) That is, it would have nothing to do but
check that everything is already written. If there is write work
remaining, it starts doing it "now", with the disastrous effects I'm
complaining about.
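
To make the point about fsync concrete, here is a minimal Python sketch
(not PostgreSQL code). It writes the same pages either with a single fsync
at the end or with an fsync after every batch, and records how long each
fsync call takes. The function name write_pages, the 8 kB page size and
the batch size are assumptions chosen for illustration; the timings depend
entirely on the OS, the file system and the storage.

```python
import os
import tempfile
import time

PAGE_SIZE = 8192  # assumed page size, for illustration only


def write_pages(path, n_pages, fsync_every=None):
    """Write n_pages pages to path and return the duration of each fsync.

    fsync_every=None issues a single fsync at the end. fsync_every=k also
    issues one after every k pages, so the final fsync has little left to do.
    """
    durations = []
    page = b"x" * PAGE_SIZE
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for i in range(1, n_pages + 1):
            os.write(fd, page)
            if fsync_every and i % fsync_every == 0:
                start = time.perf_counter()
                os.fsync(fd)
                durations.append(time.perf_counter() - start)
        # Final fsync: the only one in the unpaced case.
        start = time.perf_counter()
        os.fsync(fd)
        durations.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return durations


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for every in (None, 64):
            t = write_pages(os.path.join(d, "data"), 256, fsync_every=every)
            print(f"fsync_every={every}: {len(t)} fsyncs, longest {max(t):.4f}s")
```

If the OS has already written the pages out by the time the last fsync is
issued, that call should return quickly; if not, all the pending work lands
on it at once.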

When I say "pacing does not work", I mean that things were not written
out to disk by the OS; it does not mean that pg did not ask for it.

However it does not make much sense for an OS scheduler to wait several 
minutes with tens of thousands of pages to write and do nothing about 
it... So I'm wondering.

> [...]
>> (1) the ability to put checkpoint_timeout to values smaller than 30s could
>> help, although obviously there would be other consequences. But the ability
>> to avoid periodic offline time looks like a desirable objective.
>
> I'd rather not do that. It's an utterly horrible hack to go this route.

Hmmm. It does solve the issue, though:-) It would be the administrator's
choice. It is better than nothing, which is the current status.

>> (2) I still think that a parameter to force bgwriter to write more stuff
>> could help, but this is not tested.
>
> It's going to be random writes. That's not going to be helpful.

The -N small OLTP load on a large (GB) table *is* random writes anyway,
whether they occur at checkpoint or at any other time. Random writes are
fine in this case: the load is small, so there should be no problem.
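
A rough way to see why such a load amounts to random writes, sketched in
Python under stated assumptions: if each update hits a uniformly random
page of a table with P pages, the expected number of distinct pages dirtied
by n updates is P * (1 - (1 - 1/P)^n). The figures below (one million
pages, 30000 updates between checkpoints) are hypothetical and are not
taken from the tests discussed in this thread.

```python
def expected_dirty_pages(table_pages, updates):
    """Expected number of distinct pages hit by uniformly random updates."""
    miss_probability = (1.0 - 1.0 / table_pages) ** updates
    return table_pages * (1.0 - miss_probability)


if __name__ == "__main__":
    pages = 1_000_000   # hypothetical table size, in pages
    updates = 30_000    # hypothetical number of updates between checkpoints
    dirty = expected_dirty_pages(pages, updates)
    print(f"{dirty:.0f} distinct pages for {updates} updates")
```

With these assumed numbers nearly every update dirties a different page, so
the pages to write are scattered over the table whenever they are flushed.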

>> (3) Any other effective idea to configure for responsiveness is 
>> welcome!
>
> I've a couple of ideas how to improve the situation, but so far I've not
> had the time to investigate them properly. Would you be willing to test
> a couple of simple patches?

I can test a couple of patches. I already did one on someone's advice (make
bgwriter round all stuff in 1s instead of 120s), without positive effect.
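
For a sense of scale, here is the plain arithmetic behind going round
everything in 1s instead of 120s, as a small Python sketch. The pool size
of 16384 buffers and the 200 ms wakeup interval are assumptions for
illustration, and the function name is made up; this is not the bgwriter's
actual logic, only the rate needed to cover a pool in a given time.

```python
def buffers_per_wakeup(n_buffers, round_seconds, wakeup_ms):
    """Buffers to visit per wakeup so the whole pool is covered in round_seconds."""
    wakeups_per_round = round_seconds * 1000 / wakeup_ms
    return n_buffers / wakeups_per_round


if __name__ == "__main__":
    n_buffers = 16384  # hypothetical pool size (128 MB of 8 kB buffers)
    wakeup_ms = 200    # assumed wakeup interval
    for round_seconds in (120, 1):
        rate = buffers_per_wakeup(n_buffers, round_seconds, wakeup_ms)
        print(f"{round_seconds}s round: about {rate:.0f} buffers per wakeup")
```

Under these assumptions the 120 s round needs about 27 buffers per wakeup,
while the 1 s round needs over 3000.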

> Did you test xfs already?

No. I cannot without reinstalling, which I cannot do on a remote host, and
I will probably not have time to do it when I have physical access.
Only one partition on the host. My mistake. Will not do it again. Shame on
me.

If someone out there has an XFS setup, it is very easy to test and only 
takes a couple of minutes, really. It takes less time to do it than to 
write a mail about it afterwards:-)

I have tested FreeBSD/UFS with similar results: a few periodic offlines.
A journaled UFS file system is probably not ideal for database work, but
again the load is small, so it should be able to cope without going offline.

-- 
Fabien.


