On 02/03/2015 07:50 AM, Robert Haas wrote:
> On Tue, Feb 3, 2015 at 10:44 AM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>> That's the whole point of this patch. "max_checkpoint_segments = 10", or
>> "max_checkpoint_segments = 160 MB", means that the system will begin a
>> checkpoint so that when the checkpoint completes, and it truncates away or
>> recycles old WAL, the total size of pg_xlog is 160 MB.
>>
>> That's different from our current checkpoint_segments setting. With
>> checkpoint_segments, the documented formula for calculating the disk usage
>> is (2 + checkpoint_completion_target) * checkpoint_segments * 16 MB. That's
>> a lot less intuitive to set.
>
> Hmm, that's different from what I was thinking. We probably shouldn't
> call that max_checkpoint_segments, then. I got confused and thought
> you were just trying to decouple the number of segments that it takes
> to trigger a checkpoint from the number we keep preallocated.
Wait, what? Because the new setting is an actual soft maximum, we
*shouldn't* call it a maximum? Or are you saying something else?
On 02/03/2015 04:25 AM, Heikki Linnakangas wrote:
> On 02/02/2015 04:21 PM, Andres Freund wrote:
>> I think we need to increase checkpoint_timeout too - that's actually
>> just as important for the default experience from my pov. 5 minutes
>> often just unnecessarily generates FPWs en masse.
I have yet to see any serious benchmarking on checkpoint_timeout. It
does seem that for some workloads on some machines a longer timeout is
better, but I've also seen workloads where a longer timeout decreases
throughput or raises IO. So absent some hard numbers, I'd be opposed to
changing the default.
>>
>>> I'll open the bidding at 1600MB (aka 100).
>>
>> Fine with me.
>
> I wouldn't object to raising it a little bit, but that's way too high.
> It's entirely possible to have a small database that generates a lot of
> WAL. A table that has only a few rows, but is updated very very
> frequently, for example. And checkpointing such a database is quick too,
> so frequent checkpoints are not a problem. You don't want to end up with
> 1.5 GB of WAL on a 100 MB database.
I suggest 192MB instead (12 segments). That almost doubles our current
real default, without requiring huge disk space which might surprise
some users.
In practice, checkpoint_segments is impossible to automatically tune
correctly. So let's be conservative.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com