Re: autovacuum scheduling starvation and frenzy - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: autovacuum scheduling starvation and frenzy
Msg-id: CA+TgmoYz=skw1A+AwKFnsMMRVePE_Go+ceY6d+WPqtni7_iQmQ@mail.gmail.com
In response to: Re: autovacuum scheduling starvation and frenzy (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses: Re: autovacuum scheduling starvation and frenzy
List: pgsql-hackers
On Tue, Sep 30, 2014 at 5:59 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Jeff Janes wrote:
>> > I think that instead of
>> > trying to get a single target database in that foreach loop, we could
>> > try to build a prioritized list (in-wraparound-danger first, then
>> > in-multixid-wraparound danger, then the one with the oldest autovac time
>> > of all the ones that remain); then recheck the wrap-around condition by
>> > seeing whether there are other workers in that database that started
>> > after the wraparound condition appeared.
>>
>> I think we would want to check for one worker that is still running, and at
>> least one other worker that started and completed since the wraparound
>> threshold was exceeded.  If there are multiple tables in the database that
>> need full scanning, it would make sense to have multiple workers.  But if a
>> worker already started and finished without increasing the frozenxid, then
>> another attempt probably won't accomplish much either.  But I have no idea
>> how to do that bookkeeping, or how much of an improvement it would be over
>> something simpler.
>
> How about something like this:
>
> * if autovacuum is disabled, then don't check these conditions; the only
> reason we're in do_start_worker() in that case is that somebody
> signalled postmaster that some database needs a for-wraparound emergency
> vacuum.
>
> * if autovacuum is on, and the database was processed less than
> autovac_naptime/2 ago, and there are workers still running in that database
> now, then ignore the database.
>
> Otherwise, consider it for xid-wraparound vacuuming.  So if we launched
> a worker recently, but it already finished, we would start another one.
> (If the worker finished, the database should not be in need of a
> for-wraparound vacuum again, so this seems sensible).  Also, we give
> priority to a database in danger sooner than the full autovac_naptime
> period; not immediately after the previous worker started, which should
> give room for other databases to be processed.
>
> The attached patch implements that.  I only tested it on HEAD, but
> AFAICS it applies cleanly to 9.4 and 9.3; fairly sure it won't apply to
> 9.2.  Given the lack of complaints, I'm unsure about backpatching
> further back than 9.3 anyway.
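For concreteness, here is a minimal C sketch of the selection rule and
priority order being proposed above.  The struct and function names
(CandidateDb, consider_for_wraparound, db_priority) are invented for
illustration; they are not the identifiers used in autovacuum.c or in
the attached patch.

/*
 * Sketch only: restates the quoted rule.  With autovacuum disabled we
 * always consider the database, since we only reach do_start_worker()
 * in that case because of a wraparound emergency signal.  With
 * autovacuum enabled, we skip a database while a worker that started
 * less than autovac_naptime/2 ago is still running in it.
 */
typedef struct CandidateDb
{
    bool    xid_wraparound_danger;   /* xid age past the freeze limit */
    bool    mxid_wraparound_danger;  /* multixact age past its limit */
    double  secs_since_processed;    /* seconds since a worker last started here */
    int     workers_running;         /* workers currently active in this database */
} CandidateDb;

static bool
consider_for_wraparound(const CandidateDb *db, bool autovacuum_enabled,
                        int naptime_secs)
{
    if (!autovacuum_enabled)
        return true;    /* emergency-signal path: no recency filter */

    if (db->workers_running > 0 &&
        db->secs_since_processed < naptime_secs / 2.0)
        return false;   /* a recently started worker is still on it */

    return true;        /* recent worker finished, or nothing recent: consider it */
}

/*
 * Ordering for the prioritized list: xid-wraparound danger first, then
 * multixid-wraparound danger, then (as a tiebreak, not shown here) the
 * oldest last-autovacuum time.  Lower value means higher priority.
 */
static int
db_priority(const CandidateDb *db)
{
    if (db->xid_wraparound_danger)
        return 0;
    if (db->mxid_wraparound_danger)
        return 1;
    return 2;
}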

This kind of seems like throwing darts at the wall.  It could work out
better if we're right to skip the database already being vacuumed for
wraparound, or worse if we're not.

I'm not sure that we should do this at all, or at least not without
testing it extensively first.  We could easily shoot ourselves in the
foot.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


