Re: Redesigning checkpoint_segments - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Redesigning checkpoint_segments
Date	June 5, 2013 21:35:49
Msg-id	51AF84F4.4000504@vmware.com
In response to	Re: Redesigning checkpoint_segments (Fujii Masao <masao.fujii@gmail.com>)
List	pgsql-hackers

On 05.06.2013 21:16, Fujii Masao wrote:
> On Wed, Jun 5, 2013 at 9:16 PM, Heikki Linnakangas
> <hlinnakangas@vmware.com>  wrote:
>> I propose that we do something similar, but not exactly the same. Let's have
>> a setting, max_wal_size, to control the max. disk space reserved for WAL.
>> Once that's reached (or you get close enough, so that there are still some
>> segments left to consume while the checkpoint runs), a checkpoint is
>> triggered.
>
> What if max_wal_size is reached while the checkpoint is running? We should
> change the checkpoint from spread mode to fast mode?

The checkpoint spreading code already tracks if the checkpoint is "on 
schedule", and it takes into account both checkpoint_timeout and 
checkpoint_segments. Ie. if you consume segments faster than expected, 
the checkpoint will speed up as well. Once checkpoint_segments is 
reached, the checkpoint will complete ASAP, with no delays to spread it out.
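To illustrate the "on schedule" check described above, here is a minimal sketch in Python. It is a simplified model, not the actual PostgreSQL code: the function name and its parameters are hypothetical, and it ignores details such as checkpoint_completion_target.

```python
def should_throttle(progress, elapsed_seconds, checkpoint_timeout,
                    segments_consumed, checkpoint_segments):
    """Return True if the checkpoint may sleep between writes.

    progress is the fraction (0.0 to 1.0) of the checkpoint's writes
    already done. All names here are illustrative assumptions.
    """
    # Fraction of the time budget and of the WAL budget used so far.
    time_frac = elapsed_seconds / checkpoint_timeout
    seg_frac = segments_consumed / checkpoint_segments

    # Limit reached: complete ASAP, with no delays to spread it out.
    if seg_frac >= 1.0:
        return False

    # On schedule only if progress is ahead of BOTH budgets. Consuming
    # segments faster than expected raises seg_frac, which speeds the
    # checkpoint up.
    return progress >= max(time_frac, seg_frac)
```

For example, a checkpoint that is 50% done after 60 of 300 seconds and 1 of 10 segments may throttle, while the same checkpoint with 8 of 10 segments consumed may not.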

This would still work the same with max_wal_size. A new checkpoint would 
be started well before reaching max_wal_size, so that it has enough time 
to complete. If the checkpoint "falls behind", it will hurry up until 
it's back on schedule. If max_wal_size is reached anyway, it will 
complete ASAP.
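A rough sketch of how the trigger could look with max_wal_size follows. The headroom parameter is a hypothetical stand-in for "well before reaching max_wal_size"; the proposal does not say how that margin would be computed, so treat it purely as an assumption.

```python
MB = 1024 * 1024

def checkpoint_action(wal_bytes, max_wal_size, headroom):
    """Classify current WAL usage under the proposed max_wal_size.

    headroom is an assumed margin (in bytes) left for WAL that is
    written while the checkpoint runs. Returns one of:
      "none"        - nothing to do yet
      "start"       - begin a (spread) checkpoint
      "finish_asap" - limit reached anyway, stop spreading
    """
    if wal_bytes >= max_wal_size:
        # Not a hard limit: WAL keeps being written, the checkpoint
        # just completes as fast as it can.
        return "finish_asap"
    if wal_bytes >= max_wal_size - headroom:
        return "start"
    return "none"
```

With max_wal_size = 1024 MB and an assumed headroom of 256 MB, a checkpoint would start once WAL usage reaches 768 MB.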

> Or, if max_wal_size
> is hard limit, we should keep the allocation of new WAL file waiting until
> the checkpoint has finished and removed some old WAL files?

I was not thinking of making it a hard limit. It would be just like 
checkpoint_segments from that point of view - if a checkpoint takes a 
long time, max_wal_size might still be exceeded.

>> In this proposal, the number of segments preallocated is controlled
>> separately from max_wal_size, so that you can set max_wal_size high, without
>> actually consuming that much space in normal operation. It's just a
>> backstop, to avoid completely filling the disk, if there's a sudden burst of
>> activity. The number of segments preallocated is auto-tuned, based on the
>> number of segments used in previous checkpoint cycles.
>
> How is wal_keep_segments handled in your approach?

Hmm, haven't thought about that. I think a better unit to set 
wal_keep_segments in would also be MB, not segments. Perhaps 
max_wal_size should include WAL retained for wal_keep_segments, leaving 
less room for checkpoints. Ie. when you set wal_keep_segments 
higher, an xlog-based checkpoint would be triggered earlier, because the 
old segments kept for replication would leave less room for new 
segments. And setting wal_keep_segments higher than max_wal_size would 
be an error.
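As a sketch of that accounting, assuming both settings are expressed in MB (the function and parameter names below are made up for illustration):

```python
def wal_room_for_checkpoints(max_wal_size_mb, wal_keep_mb):
    """Return the WAL space (MB) left for new segments between checkpoints.

    Assumes max_wal_size includes the WAL retained for replication,
    as floated above. Both parameter names are hypothetical.
    """
    if wal_keep_mb > max_wal_size_mb:
        # Keeping more WAL than the total budget allows would be an error.
        raise ValueError("wal_keep must not exceed max_wal_size")
    # The more WAL is retained for replication, the less room is left,
    # so an xlog-based checkpoint is triggered earlier.
    return max_wal_size_mb - wal_keep_mb
```

So with max_wal_size at 1024 MB, retaining 256 MB for replication leaves 768 MB for checkpoint cycles, and retaining 512 MB leaves only 512 MB.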

- Heikki
