Re: Redesigning checkpoint_segments - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Redesigning checkpoint_segments |
Date | |
Msg-id | 51AF84F4.4000504@vmware.com |
In response to | Re: Redesigning checkpoint_segments (Fujii Masao <masao.fujii@gmail.com>) |
Responses |
Re: Redesigning checkpoint_segments
Re: Redesigning checkpoint_segments |
List | pgsql-hackers |
On 05.06.2013 21:16, Fujii Masao wrote:
> On Wed, Jun 5, 2013 at 9:16 PM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>> I propose that we do something similar, but not exactly the same. Let's have
>> a setting, max_wal_size, to control the max. disk space reserved for WAL.
>> Once that's reached (or you get close enough, so that there are still some
>> segments left to consume while the checkpoint runs), a checkpoint is
>> triggered.
>
> What if max_wal_size is reached while the checkpoint is running? We should
> change the checkpoint from spread mode to fast mode?

The checkpoint spreading code already tracks whether the checkpoint is
"on schedule", and it takes into account both checkpoint_timeout and
checkpoint_segments. I.e., if you consume segments faster than expected,
the checkpoint will speed up as well. Once checkpoint_segments is reached,
the checkpoint will complete ASAP, with no delays to spread it out.

This would still work the same with max_wal_size. A new checkpoint would
be started well before reaching max_wal_size, so that it has enough time
to complete. If the checkpoint "falls behind", it will hurry up until it's
back on schedule. If max_wal_size is reached anyway, it will complete ASAP.

> Or, if max_wal_size
> is hard limit, we should keep the allocation of new WAL file waiting until
> the checkpoint has finished and removed some old WAL files?

I was not thinking of making it a hard limit. It would be just like
checkpoint_segments from that point of view: if a checkpoint takes a long
time, max_wal_size might still be exceeded.

>> In this proposal, the number of segments preallocated is controlled
>> separately from max_wal_size, so that you can set max_wal_size high, without
>> actually consuming that much space in normal operation. It's just a
>> backstop, to avoid completely filling the disk, if there's a sudden burst of
>> activity. The number of segments preallocated is auto-tuned, based on the
>> number of segments used in previous checkpoint cycles.
>
> How is wal_keep_segments handled in your approach?

Hmm, haven't thought about that. I think a better unit to set
wal_keep_segments in would also be MB, not segments. Perhaps max_wal_size
should include WAL retained for wal_keep_segments, leaving less room for
checkpoints. I.e., when you set wal_keep_segments higher, an xlog-based
checkpoint would be triggered earlier, because the old segments kept for
replication would leave less room for new segments. And setting
wal_keep_segments higher than max_wal_size would be an error.

- Heikki
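
The accounting discussed in this message can be sketched roughly as follows. This is an illustration only, not PostgreSQL code: the names `WalBudget`, `wal_keep_mb`, `trigger_fraction` and `checkpoint_mode` are hypothetical, and the choice to start a checkpoint after half of the available room is consumed is an assumption made for the example (the message only says "well before reaching max_wal_size").

```python
from dataclasses import dataclass


@dataclass
class WalBudget:
    """Hypothetical model of the proposed max_wal_size accounting (sizes in MB)."""

    max_wal_size_mb: int
    wal_keep_mb: int = 0            # assumed MB-based form of wal_keep_segments
    trigger_fraction: float = 0.5   # assumed: share of the room that triggers a checkpoint

    def __post_init__(self):
        # Per the proposal, keeping more WAL than max_wal_size would be an error.
        if self.wal_keep_mb > self.max_wal_size_mb:
            raise ValueError("wal_keep_mb must not exceed max_wal_size_mb")
        if not 0 < self.trigger_fraction <= 1:
            raise ValueError("trigger_fraction must be in (0, 1]")

    def room_mb(self):
        # WAL retained for replication leaves less room for new segments.
        return self.max_wal_size_mb - self.wal_keep_mb

    def checkpoint_trigger_mb(self):
        # Amount of new WAL after which an xlog-based checkpoint is started.
        return self.room_mb() * self.trigger_fraction

    def should_start_checkpoint(self, wal_since_last_checkpoint_mb):
        return wal_since_last_checkpoint_mb >= self.checkpoint_trigger_mb()


def checkpoint_mode(progress, elapsed_fraction, wal_fraction):
    """Pick a pacing mode for a running checkpoint.

    progress:         share of checkpoint work already done (0..1)
    elapsed_fraction: share of the time budget already elapsed (0..1)
    wal_fraction:     share of the WAL budget consumed since checkpoint start
    """
    if wal_fraction >= 1.0:
        return "asap"       # soft limit reached: finish with no spreading delays
    if progress < max(elapsed_fraction, wal_fraction):
        return "hurry"      # behind schedule on time or WAL: speed up
    return "spread"         # on schedule: keep spreading the writes
```

With `max_wal_size_mb=1024` and `wal_keep_mb=256`, the room for new WAL is 768 MB, so under the assumed `trigger_fraction` a checkpoint would start after 384 MB of new WAL. Raising `wal_keep_mb` lowers that threshold, which is the "triggered earlier" effect described above. Nothing in the sketch blocks WAL allocation, matching the statement that max_wal_size is not a hard limit.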