Re: Redesigning checkpoint_segments - Mailing list pgsql-hackers

From: Jeff Janes
Subject: Re: Redesigning checkpoint_segments
Msg-id: CAMkU=1xxnNJBh_w6SGU-nYszuLKkq3hPyMKk9fadZdtbU2=o9A@mail.gmail.com
In response to: Re: Redesigning checkpoint_segments (Fujii Masao <masao.fujii@gmail.com>)
Responses: Re: Redesigning checkpoint_segments
List: pgsql-hackers
On Thu, May 21, 2015 at 8:40 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Thu, May 21, 2015 at 3:53 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> On Mon, Mar 16, 2015 at 11:05 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>
>>> On Mon, Feb 23, 2015 at 8:56 AM, Heikki Linnakangas
>>> <hlinnakangas@vmware.com> wrote:
>>>>
>>>>
>>>> Everyone seems to be happy with the names and behaviour of the GUCs, so
>>>> committed.
>>>
>>>
>>>
>>> The docs suggest that max_wal_size will be respected during archive
>>> recovery (causing restartpoints and recycling), but I'm not seeing that
>>> happening.  Is this a doc bug or an implementation bug?
>>
>>
>> I think the old behavior, where restartpoints were driven only by time and
>> not by volume, was a misfeature.  But not a bug, because it was documented.
>>
>> One of the points of max_wal_size and its predecessor is to limit how big
>> pg_xlog can grow.  But running out of disk space on pg_xlog is no more fun
>> during archive recovery than it is during normal operations.  So why
>> shouldn't max_wal_size be active during recovery?
>
> The commit message of 7181530 explains why.
>
>     In standby mode, respect checkpoint_segments in addition to
>     checkpoint_timeout to trigger restartpoints. We used to deliberately only
>     do time-based restartpoints, because if checkpoint_segments is small we
>     would spend time doing restartpoints more often than really necessary.
>     But now that restartpoints are done in bgwriter, they're not as
>     disruptive as they used to be. Secondly, because streaming replication
>     stores the streamed WAL files in pg_xlog, we want to clean it up more
>     often to avoid running out of disk space when checkpoint_timeout is large
>     and checkpoint_segments small.
>
> Previously, users were more likely to run into this trouble (i.e., too
> frequent restartpoints) because the default value of checkpoint_segments
> was very small, I guess. But we increased the default with max_wal_size,
> so the risk of that trouble now seems smaller than before, and maybe we
> can allow max_wal_size to trigger restartpoints.

I see.  The old behavior was present for the same reason we decided to split
checkpoint_segments into max_wal_size and min_wal_size.  

That is, the default checkpoint_segments was small, and it had to be small,
because increasing it caused more disk space to be used all the time, even
when that extra space was doing no good.
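
To make that concrete, here is a rough before/after sketch. The GUC names
are real; the values are just the stock defaults from the docs, shown only
for illustration:

    # pre-9.5: old segments are recycled rather than removed, so pg_xlog
    # sits near its maximum size even on an idle system -- which is why
    # the default had to stay tiny
    checkpoint_segments = 3     # checkpoint every 3 x 16MB = 48MB of WAL
    checkpoint_timeout = 5min

    # 9.5: max_wal_size is only a (soft) ceiling that triggers checkpoints,
    # and min_wal_size is the floor below which segments are recycled, so
    # the ceiling can be generous without pinning disk space when idle
    max_wal_size = 1GB
    min_wal_size = 80MB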

So perhaps we can consider this change a completion of the max_wal_size work, rather than a new feature?
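
If it does land that way, the behavior should also be easy to verify. A
sketch, assuming a 9.5 standby with log_checkpoints = on (the exact log
wording can differ between versions):

    -- on the standby
    SELECT pg_is_in_recovery();   -- returns true on a standby
    SHOW max_wal_size;            -- 1GB by default in 9.5

    -- generate more than max_wal_size of WAL on the master, then watch
    -- the standby's server log: with volume-driven restartpoints you
    -- should see "restartpoint starting" / "restartpoint complete"
    -- entries arriving well before each checkpoint_timeout expires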

Cheers,

Jeff
