Re: Bgwriter and pg_stat_bgwriter.buffers_clean aspects - Mailing list pgsql-performance

From Greg Smith
Subject Re: Bgwriter and pg_stat_bgwriter.buffers_clean aspects
Date
Msg-id Pine.GSO.4.64.0901070102140.8477@westnet.com
Whole thread Raw
In response to Re: Bgwriter and pg_stat_bgwriter.buffers_clean aspects  ("Dmitry Koterov" <dmitry@koterov.ru>)
List pgsql-performance
On Tue, 6 Jan 2009, Dmitry Koterov wrote:

> But why buffers_backend is so high? As I understood from your article,
> buffers_backend shows the number of writes immediately caused by any write
> operations, e.g. when an INSERT has to flush something on disk, because it
> has no space left for a new data in shared buffers. I suppose these flushes
> slow down operating greatly

In normal operation, those writes are cached by the operating system, such
that most backend writes will return very quickly.

> I realy see this: in my environment INSERT is usually performed in 1-2
> ms, but sometimes it is executed in 5-6 seconds or even more (10
> seconds)

When activity blocks like this, the most likely cause is because
everything is blocked waiting for the fsync at the end of a checkpoint
that forces all writes out to disk.  The only good way to make that go
away is to spread the checkpoint over a long period of time.  Your
configuration is forcing such a syncronization every minute, which makes
that sort of blocking more likely to happen, merely because there so many
chances for it.

Standard good practice here for 8.3 is to set checkpoint_timeout and
checkpoint_segments to as high as you can stand, where the downsides to
increasing them is that more disk space is wasted and recovery time goes
up.  I think you're chasing after the wrong cause here and ignoring the
obvious one.  Backend writes should not cause a long stall, and tuning up
the background writer to the extreme you have is counterproductive (all
your bgwriter_* parameters would be far better off at the default than the
extremely aggressive ones you've set them to).  Meanwhile, reducing
checkpoint_timeout can absolutely cause what you're seeing.

One other thing:  if this is a Linux system running a kernel before 2.6.22
and you have a lot of RAM, there's a known problem with that combination
that can cause writes to hang for a long time.  I've got a long article
about it http://www.westnet.com/~gsmith/content/linux-pdflush.htm and a
quicker run through identifying if you're running into issue and resolving
it at
http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

pgsql-performance by date:

Previous
From: "M. Edward (Ed) Borasky"
Date:
Subject: Re: understanding postgres issues/bottlenecks
Next
From: Simon Waters
Date:
Subject: Re: understanding postgres issues/bottlenecks