Re: Redesigning checkpoint_segments - Mailing list pgsql-hackers

From: Jeff Janes
Subject: Re: Redesigning checkpoint_segments
Msg-id: CAMkU=1xxnNJBh_w6SGU-nYszuLKkq3hPyMKk9fadZdtbU2=o9A@mail.gmail.com
In response to: Re: Redesigning checkpoint_segments (Fujii Masao <masao.fujii@gmail.com>)
Responses: Re: Redesigning checkpoint_segments
List: pgsql-hackers
On Thu, May 21, 2015 at 8:40 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Thu, May 21, 2015 at 3:53 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> On Mon, Mar 16, 2015 at 11:05 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>
>>> On Mon, Feb 23, 2015 at 8:56 AM, Heikki Linnakangas
>>> <hlinnakangas@vmware.com> wrote:
>>>>
>>>>
>>>> Everyone seems to be happy with the names and behaviour of the GUCs, so
>>>> committed.
>>>
>>>
>>>
>>> The docs suggest that max_wal_size will be respected during archive
>>> recovery (causing restartpoints and recycling), but I'm not seeing that
>>> happening.  Is this a doc bug or an implementation bug?
>>
>>
>> I think the old behavior, where restartpoints were driven only by time and
>> not by volume, was a misfeature.  But not a bug, because it was documented.
>>
>> One of the points of max_wal_size and its predecessor is to limit how big
>> pg_xlog can grow.  But running out of disk space on pg_xlog is no more fun
>> during archive recovery than it is during normal operations.  So why
>> shouldn't max_wal_size be active during recovery?
>
> The commit message of 7181530 explains why.
>
>     In standby mode, respect checkpoint_segments in addition to
>     checkpoint_timeout to trigger restartpoints. We used to deliberately only
>     do time-based restartpoints, because if checkpoint_segments is small we
>     would spend time doing restartpoints more often than really necessary.
>     But now that restartpoints are done in bgwriter, they're not as
>     disruptive as they used to be. Secondly, because streaming replication
>     stores the streamed WAL files in pg_xlog, we want to clean it up more
>     often to avoid running out of disk space when checkpoint_timeout is large
>     and checkpoint_segments small.
>
> Previously, users were more likely to run into this trouble (i.e., too
> frequent restartpoints) because the default value of checkpoint_segments
> was very small, I guess. But we increased the default with max_wal_size,
> so the risk of that trouble now seems smaller than before, and maybe we
> can allow max_wal_size to trigger restartpoints.

I see.  The old behavior was present for the same reason we decided to split
checkpoint_segments into max_wal_size and min_wal_size.  

That is, the default checkpoint_segments was small, and it had to be small,
because increasing it caused more disk space to be used all the time, even
when that extra space was doing no good.
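
To make that concrete, here is a rough before/after sketch. The GUC names
are real; the values are just the stock defaults from the docs, shown only
for illustration:

    # pre-9.5: old segments are recycled rather than removed, so pg_xlog
    # sits near its maximum size even on an idle system -- which is why
    # the default had to stay tiny
    checkpoint_segments = 3     # checkpoint every 3 x 16MB = 48MB of WAL
    checkpoint_timeout = 5min

    # 9.5: max_wal_size is only a (soft) ceiling that triggers checkpoints,
    # and min_wal_size is the floor below which segments are recycled, so
    # the ceiling can be generous without pinning disk space when idle
    max_wal_size = 1GB
    min_wal_size = 80MB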

So perhaps we can consider this change a completion of the max_wal_size work, rather than a new feature?
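
If it does land that way, the behavior should also be easy to verify. A
sketch, assuming a 9.5 standby with log_checkpoints = on (the exact log
wording can differ between versions):

    -- on the standby
    SELECT pg_is_in_recovery();   -- returns true on a standby
    SHOW max_wal_size;            -- 1GB by default in 9.5

    -- generate more than max_wal_size of WAL on the master, then watch
    -- the standby's server log: with volume-driven restartpoints you
    -- should see "restartpoint starting" / "restartpoint complete"
    -- entries arriving well before each checkpoint_timeout expires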

Cheers,

Jeff
