Re: Redesigning checkpoint_segments - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Redesigning checkpoint_segments
Msg-id 51AF84F4.4000504@vmware.com
In response to Re: Redesigning checkpoint_segments  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers

On 05.06.2013 21:16, Fujii Masao wrote:
> On Wed, Jun 5, 2013 at 9:16 PM, Heikki Linnakangas
> <hlinnakangas@vmware.com>  wrote:
>> I propose that we do something similar, but not exactly the same. Let's have
>> a setting, max_wal_size, to control the max. disk space reserved for WAL.
>> Once that's reached (or you get close enough, so that there are still some
>> segments left to consume while the checkpoint runs), a checkpoint is
>> triggered.
>
> What if max_wal_size is reached while the checkpoint is running? We should
> change the checkpoint from spread mode to fast mode?

The checkpoint spreading code already tracks whether the checkpoint is
"on schedule", and it takes into account both checkpoint_timeout and
checkpoint_segments. I.e. if you consume segments faster than expected,
the checkpoint will speed up as well. Once checkpoint_segments is
reached, the checkpoint will complete ASAP, with no delays to spread it out.
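
As a rough illustration of that "on schedule" test (a simplified sketch
with made-up names, not the actual checkpointer code), the checkpoint
counts as behind if its progress is less than either the fraction of
checkpoint_timeout that has elapsed or the fraction of
checkpoint_segments that has been consumed:

```python
def is_on_schedule(progress, elapsed_secs, checkpoint_timeout,
                   segs_consumed, checkpoint_segments):
    """Return True if the checkpoint may keep delaying between writes.

    progress is the fraction (0.0 to 1.0) of the checkpoint's work done.
    All names are illustrative assumptions, not PostgreSQL identifiers.
    """
    time_fraction = elapsed_secs / checkpoint_timeout
    wal_fraction = segs_consumed / checkpoint_segments
    # Being behind on either measure means: stop sleeping, write faster.
    return progress >= time_fraction and progress >= wal_fraction
```

With all of checkpoint_segments consumed, wal_fraction is 1.0, so an
unfinished checkpoint is never "on schedule" and runs without delays.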

This would still work the same with max_wal_size. A new checkpoint would 
be started well before reaching max_wal_size, so that it has enough time 
to complete. If the checkpoint "falls behind", it will hurry up until 
it's back on schedule. If max_wal_size is reached anyway, it will 
complete ASAP.

> Or, if max_wal_size
> is hard limit, we should keep the allocation of new WAL file waiting until
> the checkpoint has finished and removed some old WAL files?

I was not thinking of making it a hard limit. It would be just like 
checkpoint_segments from that point of view - if a checkpoint takes a 
long time, max_wal_size might still be exceeded.
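
To make the behaviour described above concrete, here is a minimal
sketch. The function, its return values, and the headroom_mb parameter
(how far below max_wal_size the checkpoint is started) are all
hypothetical; how much headroom to leave is exactly the part that is
not worked out yet.

```python
def checkpoint_action(wal_used_mb, max_wal_size_mb, headroom_mb, running):
    """Decide what the checkpointer should do, given current WAL usage.

    Hypothetical sketch: headroom_mb stands for the WAL expected to be
    consumed while a checkpoint runs.
    """
    if running:
        # Soft limit: WAL writes are never blocked. Once the limit is
        # reached, the checkpoint just stops spreading its writes.
        if wal_used_mb >= max_wal_size_mb:
            return "finish_asap"
        return "spread"
    if wal_used_mb >= max_wal_size_mb - headroom_mb:
        # Start early, so the checkpoint has time to complete.
        return "start"
    return "idle"
```

Note that nothing here makes WAL allocation wait, so wal_used_mb can
still end up above max_wal_size_mb if the checkpoint is slow.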

>> In this proposal, the number of segments preallocated is controlled
>> separately from max_wal_size, so that you can set max_wal_size high, without
>> actually consuming that much space in normal operation. It's just a
>> backstop, to avoid completely filling the disk, if there's a sudden burst of
>> activity. The number of segments preallocated is auto-tuned, based on the
>> number of segments used in previous checkpoint cycles.
>
> How is wal_keep_segments handled in your approach?

Hmm, haven't thought about that. I think a better unit to set
wal_keep_segments in would also be MB, not segments. Perhaps
max_wal_size should include WAL retained for wal_keep_segments, leaving
less room for checkpoints. I.e. when you set wal_keep_segments
higher, an xlog-based checkpoint would be triggered earlier, because the
old segments kept for replication would leave less room for new
segments. And setting wal_keep_segments higher than max_wal_size would
be an error.
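
In other words, something along these lines (just a sketch of the idea,
with both settings expressed in MB and a made-up function name):

```python
def wal_room_for_checkpoints(max_wal_size_mb, wal_keep_mb):
    """Return the WAL budget (MB) left for new segments between checkpoints.

    Illustrative only: assumes both settings are given in MB and that
    WAL retained for replication counts against max_wal_size.
    """
    if wal_keep_mb > max_wal_size_mb:
        raise ValueError("wal_keep_segments must not exceed max_wal_size")
    # More WAL kept for replication leaves less room for new segments,
    # so an xlog-based checkpoint gets triggered earlier.
    return max_wal_size_mb - wal_keep_mb
```

So with max_wal_size at 1024 MB and 256 MB kept for replication, only
768 MB would be left for checkpoints to work with.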

- Heikki