On Tue, 26 Jun 2007, Gregory Stark wrote:
> What exactly happens if a checkpoint takes so long that the next checkpoint
> starts? Aside from it not actually helping, is there much reason to avoid this
> situation? Have we ever actually tested it?
More segments get created, and because of how they are cleared at the
beginning this causes its own mini-I/O storm through the same buffered
write channel the checkpoint writes are going into (which may or may not
be the same way normal WAL writes go, depending on whether you're using
O_[D]SYNC WAL writes). I've seen some weird and intermittent breakdowns
from the contention that occurs when this happens, and it's certainly
something to be avoided.
To test it you could just use a big buffer cache, write like mad to it,
and make checkpoint_segments smaller than it should be for that workload.
It's easy enough to kill yourself exactly this way right now though, and
the fact that LDC gives you a parameter to aim this particular foot-gun
more precisely isn't a big deal IMHO as long as the documentation is
clear.
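
As a rough way to judge whether a given test setup will actually reach
the overlapping case, here is a back-of-the-envelope sketch. It assumes
16 MB WAL segments, a steady WAL write rate, and that checkpoints are
triggered only by checkpoint_segments (checkpoint_timeout is ignored).
The function names and the numbers are made up for illustration, not
taken from the server code.

```python
# Assumption: 16 MB WAL segments and a steady WAL generation rate.
WAL_SEGMENT_MB = 16


def seconds_between_checkpoint_triggers(checkpoint_segments, wal_mb_per_sec):
    """Time to fill checkpoint_segments worth of WAL at a constant rate."""
    return checkpoint_segments * WAL_SEGMENT_MB / wal_mb_per_sec


def checkpoint_overruns(checkpoint_segments, wal_mb_per_sec, checkpoint_write_secs):
    """True if the next checkpoint is requested before this one finishes."""
    interval = seconds_between_checkpoint_triggers(checkpoint_segments, wal_mb_per_sec)
    return checkpoint_write_secs > interval


if __name__ == "__main__":
    # Hypothetical workload: 8 MB/s of WAL, checkpoint needs 20 s to write out.
    for segments in (3, 16, 64):
        interval = seconds_between_checkpoint_triggers(segments, 8.0)
        overlap = checkpoint_overruns(segments, 8.0, 20.0)
        print(f"checkpoint_segments={segments}: trigger every {interval:.0f}s, "
              f"overlap={overlap}")
```

With those hypothetical numbers, a small checkpoint_segments puts the
triggers closer together than the time a checkpoint needs to finish,
which is the situation the test above is trying to provoke.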
--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD