Re: Redesigning checkpoint_segments - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Redesigning checkpoint_segments
Msg-id 54A3D5EE.2010808@vmware.com
In response to Re: Redesigning checkpoint_segments  (Josh Berkus <josh@agliodbs.com>)
Responses Re: Redesigning checkpoint_segments  (Venkata Balaji N <nag1010@gmail.com>)
List pgsql-hackers
(reviving an old thread)

On 08/24/2013 12:53 AM, Josh Berkus wrote:
> On 08/23/2013 02:08 PM, Heikki Linnakangas wrote:
>
>> Here's a bigger patch, which does more. It is based on the ideas in the
>> post I started this thread with, with feedback incorporated from the
>> long discussion. With this patch, WAL disk space usage is controlled by
>> two GUCs:
>>
>> min_recycle_wal_size
>> checkpoint_wal_size
>>
> <snip>
>
>> These settings are fairly intuitive for a DBA to tune. You begin by
>> figuring out how much disk space you can afford to spend on WAL, and set
>> checkpoint_wal_size to that (with some safety margin, of course). Then
>> you set checkpoint_timeout based on how long you're willing to wait for
>> recovery to finish. Finally, if you have infrequent batch jobs that need
>> a lot more WAL than the system otherwise needs, you can set
>> min_recycle_wal_size to keep enough WAL preallocated for the spikes.
>
> We'll want to rename them to make it even *more* intuitive.

Sure, I'm all ears.

> But ... do I understand things correctly that checkpoint wouldn't "kick
> in" until you hit checkpoint_wal_size?  If that's the case, isn't real
> disk space usage around 2X checkpoint_wal_size if spread checkpoint is
> set to 0.9?  Or does checkpoint kick in sometime earlier?

It kicks in earlier, so that the checkpoint *completes* just when
checkpoint_wal_size of WAL is used up. So the real disk usage is
checkpoint_wal_size.

There is still an internal variable called CheckPointSegments that
triggers the checkpoint, but it is now derived from checkpoint_wal_size
(see the CalculateCheckpointSegments function):

    CheckPointSegments = (checkpoint_wal_size / 16 MB) / (2 + checkpoint_completion_target)

This is the same formula we've always had in the manual for calculating
the amount of WAL space used, but in reverse. I.e. we calculate
CheckPointSegments based on the desired disk space usage, not the other
way round.
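
To make the arithmetic concrete, here is a rough sketch of that calculation in Python. It is not the patch's C code: the function names, the megabyte units, the truncation to a whole number of segments, and the floor of one segment are all assumptions made for illustration. Only the formula itself and the 16 MB segment size come from the text above.

```python
WAL_SEGMENT_MB = 16  # segment size used in the formula above


def calculate_checkpoint_segments(checkpoint_wal_size_mb, completion_target):
    """Illustrative stand-in for the patch's CalculateCheckpointSegments.

    Applies the formula from the message:
        (checkpoint_wal_size / 16 MB) / (2 + checkpoint_completion_target)
    Truncating to an integer and flooring at 1 are assumptions.
    """
    segments = (checkpoint_wal_size_mb / WAL_SEGMENT_MB) / (2.0 + completion_target)
    return max(1, int(segments))


def implied_wal_usage_mb(checkpoint_segments, completion_target):
    """Invert the formula: WAL space implied by a given trigger distance."""
    return (2.0 + completion_target) * checkpoint_segments * WAL_SEGMENT_MB


if __name__ == "__main__":
    # Hypothetical setting: 1 GB of WAL, two different completion targets.
    for target in (0.5, 0.9):
        segs = calculate_checkpoint_segments(1024, target)
        usage = implied_wal_usage_mb(segs, target)
        print(f"target={target}: CheckPointSegments={segs}, implied usage={usage:.1f} MB")
```

Because the segment count is truncated, the implied usage comes out at or slightly below the configured checkpoint_wal_size rather than at roughly twice that value.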

> As a note, pgBench would be a terrible test for this patch; we really
> need something which creates uneven traffic.  I'll see if I can devise
> something.

Attached is a rebased version of this patch. Everyone, please try this
out on whatever workloads you have, and let me know:

a) How does the auto-tuning of the number of recycled segments work?
Does pg_xlog reach a steady-state size, or does it fluctuate a lot?

b) Are the two GUCs, checkpoint_wal_size and min_recycle_wal_size,
intuitive to set?

- Heikki

