Re: We probably need autovacuum_max_wraparound_workers - Mailing list pgsql-hackers

From Robert Haas
Subject Re: We probably need autovacuum_max_wraparound_workers
Date
Msg-id CA+Tgmoa-W71Gz4wqp3DDOv_qEJd59BrtFs1ASN30bJc1ynPYPA@mail.gmail.com
In response to Re: We probably need autovacuum_max_wraparound_workers  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: We probably need autovacuum_max_wraparound_workers
List pgsql-hackers
On Wed, Jun 27, 2012 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>> * Josh Berkus (josh@agliodbs.com) wrote:
>>> Yeah, I can't believe I'm calling for *yet another* configuration
>>> variable either.  Suggested workaround fixes very welcome.
>
>> As I suggested on IRC, my thought would be to have a goal-based system
>> for autovacuum which is similar to our goal-based commit system.  We
>> don't need autovacuum sucking up all the I/O in the box, nor should we
>> ask the users to manage that.  Instead, let's decide when the autovacuum
>> on a given table needs to finish and then plan to keep on working at a
>> rate that'll allow us to get done well in advance of that deadline.
>
> If we allow individual vacuum operations to stretch out just because
> they don't need to be completed right away, we will need more concurrent
> vacuum workers (so that we can respond to vacuum requirements for other
> tables).  So I submit that this would only move the problem around:
> the number of active workers would increase to the point where things
> are just as painful, plus or minus a bit.
>
> The intent of the autovacuum cost delay features is to ensure that
> autovacuum doesn't suck an untenable fraction of the machine's I/O
> capacity, even when it's running flat out.  So I think Josh's complaint
> indicates that we have a problem with cost-delay tuning; hard to tell
> what exactly without more info.  It might only be that the defaults
> are bad for these particular users, or it could be more involved.

I've certainly come across many reports of the cost delay settings
being difficult to tune, both on pgsql-hackers/performance and in
various private EnterpriseDB correspondence.  I think Stephen's got it
exactly right: the system needs to figure out the rate at which vacuum
needs to happen, not rely on the user to provide that information.

For checkpoints, we estimate the percentage of the checkpoint that
ought to be completed and the percentage that actually is completed;
if the latter is less than the former, we speed things up until we're
back on track.  For autovacuum, the trick is to speed things up when
the rate at which tables are coming due for autovacuum exceeds the
rate at which we are vacuuming them; or, when we anticipate that a
whole bunch of wraparound vacuums are going to come due
simultaneously, to start doing them sooner so that they are more
spread out.
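
To make the checkpoint-style feedback rule concrete, here is a minimal
sketch of it, not actual PostgreSQL code. The function name
pacing_delay_ms, the 20 ms base delay, and the linear scaling are all
assumptions chosen for illustration: the delay between batches of work
stays at its normal value while completed work keeps up with the
target, and shrinks toward zero as the work falls behind.

```python
def pacing_delay_ms(done_fraction, target_fraction, base_delay_ms=20.0):
    """Return the sleep between work batches, in milliseconds.

    Hypothetical helper, not a PostgreSQL function.
    done_fraction:   share of the work actually completed (0..1)
    target_fraction: share that ought to be completed by now (0..1)
    """
    if not (0.0 <= done_fraction <= 1.0 and 0.0 <= target_fraction <= 1.0):
        raise ValueError("fractions must be within [0, 1]")

    # On track or ahead of schedule: keep the normal throttle.
    if done_fraction >= target_fraction:
        return base_delay_ms

    # Behind schedule: shrink the delay in proportion to the share of
    # the target that is still missing (assumed linear policy).
    shortfall = (target_fraction - done_fraction) / target_fraction
    return base_delay_ms * (1.0 - shortfall)


if __name__ == "__main__":
    print(pacing_delay_ms(0.50, 0.50))  # on track
    print(pacing_delay_ms(0.25, 0.50))  # halfway to the target
    print(pacing_delay_ms(0.00, 0.50))  # nothing done yet
```

For autovacuum, the two fractions would have to come from some estimate
of how fast tables are coming due versus how fast they are being
vacuumed; that estimate is the hard part and is not modeled here.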

For example, suppose that 26 tables each of which is 4GB in size are
going to simultaneously come due for an anti-wraparound vacuum in 26
hours.  For the sake of simplicity suppose that each will take 1 hour
to vacuum.  What we currently do is wait for 26 hours and then start
vacuuming them all at top speed, thrashing the I/O system.  What we
ought to do is start vacuuming them much sooner and do them
consecutively.  Of course, the trick is to design a mechanism that
does something intelligent if we think we're on track and then all of
a sudden the rate of XID consumption changes dramatically, and now
we've got to vacuum faster or with more workers.
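
The arithmetic in that example can be sketched as follows. This is an
illustration under simplifying assumptions, not a proposed design:
every table takes the same fixed time, a worker vacuums one table at a
time without interruption, and the helper names workers_needed and
consecutive_start_hours are made up for this sketch.

```python
import math


def workers_needed(num_tables, hours_per_table, hours_until_due):
    """Fewest workers that finish every table before the deadline.

    Hypothetical helper: assumes equal-sized tables and that each
    worker vacuums one table at a time, start to finish.
    """
    if num_tables < 0 or hours_per_table <= 0:
        raise ValueError("need num_tables >= 0 and hours_per_table > 0")
    # How many tables a single worker can finish in the time left.
    tables_per_worker = math.floor(hours_until_due / hours_per_table)
    if tables_per_worker < 1:
        raise ValueError("not even one table fits before the deadline")
    return math.ceil(num_tables / tables_per_worker)


def consecutive_start_hours(num_tables, hours_per_table):
    """Start offsets (hours from now) for one worker doing tables back to back."""
    return [i * hours_per_table for i in range(num_tables)]


if __name__ == "__main__":
    # 26 tables, 1 hour each, due in 26 hours: one worker starting now suffices.
    print(workers_needed(26, 1, 26))
    # XID consumption doubles, so the deadline is effectively 13 hours away.
    print(workers_needed(26, 1, 13))
    print(consecutive_start_hours(26, 1)[:3])
```

Starting immediately, a single worker running the tables consecutively
finishes just as the 26 hours run out. If the rate of XID consumption
jumps and the effective deadline moves closer, the same calculation
shows how many workers would then be required.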

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

