Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Msg-id 4739D181.7010208@enterprisedb.com
In response to LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris  ("Jignesh K. Shah" <J.K.Shah@Sun.COM>)
List pgsql-hackers
Jignesh K. Shah wrote:
> I am running tests with PG8.3b2 on Solaris 10 8/07 and I still see IO 
> flood when checkpoint happens.
> I have tried increasing the bg_lru_multiplier from 2 to 5 from default 
> but I dont see any more writes by bgwriter happening than my previous 
> test which used the default.
> 
> Then I tried increasing checkpoint_completion_target=0.9 but still no 
> spread of IO (checkpoint_timeout is set to default 5m)
> 
> What am I missing?

Two things spring to mind:

Even though the write()s are distributed, the fsyncs are still going to 
cause a spike. But it should be much less severe than before. How bad 
are the spikes you're seeing? Compared to 8.2?

Are your checkpoints triggered by checkpoint_timeout or 
checkpoint_segments? The calculation of "how much time do I have to 
finish this checkpoint before the next one is triggered" takes both into 
account, but the calculation wrt. checkpoint_segments is not very 
accurate because of full page writes. Because of full page writes, we 
write a lot more WAL right after a checkpoint. That makes the load 
distribution algorithm think it's going to run out of 
checkpoint_segments much sooner than it actually does. Raising 
checkpoint_segments will help with that.
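
To illustrate the point about checkpoint_segments, here is a rough sketch in Python of a schedule check that takes both limits into account. It is not the actual server code: the function name, its arguments, and the numbers in the example are made up for illustration, and it assumes progress is simply compared against the fraction of checkpoint_timeout elapsed and the fraction of checkpoint_segments consumed.

```python
def is_on_schedule(progress, completion_target,
                   elapsed_secs, checkpoint_timeout,
                   wal_segments_used, checkpoint_segments):
    """Return True if the checkpoint is ahead of both deadlines.

    progress is the fraction of dirty buffers already written (0.0 to 1.0).
    All names here are hypothetical, chosen for this sketch only.
    """
    # Scale progress so the checkpoint aims to finish at completion_target
    # of the interval instead of at the very end of it.
    scaled = progress * completion_target

    # Fraction of each budget that has been used up since the checkpoint began.
    time_frac = elapsed_secs / checkpoint_timeout
    wal_frac = wal_segments_used / checkpoint_segments

    # Behind on either limit means the writes can no longer be throttled.
    return scaled >= time_frac and scaled >= wal_frac


# 30 s into a 300 s timeout, 20% of the buffers written, and a burst of
# full page writes has already filled 2 WAL segments.
print(is_on_schedule(0.2, 0.9, 30, 300, 2, 3))   # small checkpoint_segments
print(is_on_schedule(0.2, 0.9, 30, 300, 2, 30))  # raised checkpoint_segments
```

With the small setting, the early WAL burst alone makes the checkpoint look late, so it writes at full speed. With the larger setting, the same burst is a small fraction of the budget and the time-based schedule stays in control.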

BTW, please turn on log_checkpoints.

> How does PostgreSQL determine the Load distribution?

First, when the checkpoint starts, the shared buffer pool is scanned and 
the dirty buffers are counted. Then bgwriter starts to write the buffers, 
constantly estimating how much work it has left and how much time it 
has until the next checkpoint. The write rate is throttled so that the 
remaining writes are distributed evenly across the time remaining 
(checkpoint_completion_target is a fuzz-factor applied to the estimate 
of how much time is remaining).
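
The throttling described above can be sketched roughly like this in Python. This is a simplified model, not the bgwriter implementation: plan_writes and tick_secs are hypothetical names, only the time limit is modelled (not checkpoint_segments), and checkpoint_completion_target is assumed to simply shorten the deadline.

```python
def plan_writes(dirty_buffers, checkpoint_timeout, completion_target, tick_secs):
    """Return a list of (time, buffers_to_write) pairs.

    At every tick the remaining work is re-estimated and spread evenly
    over the time left before the (scaled) deadline.
    """
    # Assumption: the target fraction just pulls the deadline forward.
    deadline = checkpoint_timeout * completion_target
    remaining = dirty_buffers
    now = 0.0
    schedule = []
    while remaining > 0:
        time_left = deadline - now
        if time_left <= tick_secs:
            # Last tick before the deadline: flush whatever is left.
            batch = remaining
        else:
            ticks_left = time_left / tick_secs
            # Write at least one buffer per tick so progress never stalls.
            batch = min(remaining, max(1, round(remaining / ticks_left)))
        schedule.append((now, batch))
        remaining -= batch
        now += tick_secs
    return schedule


# 1000 dirty buffers, 300 s timeout, target 0.5, one write round every 15 s.
for when, batch in plan_writes(1000, 300, 0.5, 15):
    print(f"t={when:5.1f}s write {batch} buffers")
```

Because the estimate is recomputed on every tick, a model like this would also adapt if the write rounds fell behind: the remaining buffers would be spread over the shorter time left.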

Hope this helps.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

