Re: Redesigning checkpoint_segments - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: Redesigning checkpoint_segments
Date
Msg-id 5217D9DA.5040004@agliodbs.com
Whole thread Raw
In response to Redesigning checkpoint_segments  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Redesigning checkpoint_segments  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On 08/23/2013 02:08 PM, Heikki Linnakangas wrote:

> Here's a bigger patch, which does more. It is based on the ideas in the
> post I started this thread with, with feedback incorporated from the
> long discussion. With this patch, WAL disk space usage is controlled by
> two GUCs:
> 
> min_recycle_wal_size
> checkpoint_wal_size
> 
<snip>

> These settings are fairly intuitive for a DBA to tune. You begin by
> figuring out how much disk space you can afford to spend on WAL, and set
> checkpoint_wal_size to that (with some safety margin, of course). Then
> you set checkpoint_timeout based on how long you're willing to wait for
> recovery to finish. Finally, if you have infrequent batch jobs that need
> a lot more WAL than the system otherwise needs, you can set
> min_recycle_wal_size to keep enough WAL preallocated for the spikes.

We'll want to rename them to make it even *more* intuitive.

But ... do I understand things correctly that checkpoint wouldn't "kick
in" until you hit checkpoint_wal_size?  If that's the case, isn't real
disk space usage around 2X checkpoint_wal_size if spread checkpoint is
set to 0.9?  Or does checkpoint kick in sometime earlier?

> except that it's more
> intuitive to set it in terms of "MB of WAL space required", instead of
> "# of segments between checkpoints".

Yes, it certainly is.  We'll need to caution people that fractions of
16MB will be ignored.

> Does that make sense? I'd love to hear feedback on how people setting up
> production databases would like to tune these things. The reason for the
> auto-tuning between the min and max is to be able to set reasonable
> defaults e.g for embedded systems that don't have a DBA to do tuning.
> Currently, it's very difficult to come up with a reasonable default
> value for checkpoint_segments which would work well for a wide range of
> systems. The PostgreSQL default of 3 is way way too low for most
> systems. On the other hand, if you set it to, say, 20, that's a lot of
> wasted space for a small database that's not updated much. With this
> patch, you can set "max_wal_size=1GB" and if the database ends up
> actually only needing 100 MB of WAL, it will only use that much and not
> waste 900 MB for useless preallocated WAL files.

This sounds good, aside from the potential 2X issue I mention above.

Mind you, what admins really want is a hard limit on WAL size, so that
they can create a partition and not worry about PG running out of WAL
space.  But ...

> Making it a hard limit is a much bigger task than I'm willing to tackle
> right now.

... agreed.  And this approach could be built on for a hard limit later on.

As a note, pgBench would be a terrible test for this patch; we really
need something which creates uneven traffic.  I'll see if I can devise
something.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Redesigning checkpoint_segments
Next
From: Tom Lane
Date:
Subject: Re: Performance problem in PLPgSQL