Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower. - Mailing list pgsql-bugs

From Jeff Janes
Subject Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.
Date
Msg-id CAMkU=1x9MEMuTuQdGgUCV=XDFhMhvB1=MjUNEDO4J13ctmpD6w@mail.gmail.com
In response to Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.  (David Gould <daveg@sonic.net>)
List pgsql-bugs
On Fri, Oct 30, 2015 at 9:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Tom Lane wrote:
>>> Good point ... shouldn't we have already checked the stats before ever
>>> deciding to try to claim the table?
>
>> The second check is there to allow for some other worker (or manual
>> vacuum) having vacuumed it after we first checked, but which had
>> finished before we check the array of current jobs.
>
> I wonder whether that check costs more than it saves.

A single autovacuum worker can run for hours or days.  I don't think
we should start vacuuming a TB-sized table because it needed vacuuming
days ago, when the initial to-do list was built, but no longer does.
So some kind of recheck is needed.

I thought of making the recheck first use whichever stats snapshot we
currently have hanging around, and only if the table still appears to
need vacuuming force a fresh snapshot and re-re-check.  The problem
with that is that any previous snapshot is aggressively destroyed at
the end of each vacuum or analyze by the EOXact code.  So we don't
actually have a snapshot hanging around to use until we go to the work
of re-parsing the database stats file.  We would have to take special
steps to exempt AutoVacuumWorker from the EOXact code clearing out the
stats, and I don't know what the repercussions of that might be.
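
To make the two-stage idea concrete, here is a minimal sketch in Python
rather than backend C.  It is only an illustration, not autovacuum's
actual code: TableStats, needs_vacuum, two_stage_recheck and the
load_fresh_snapshot callback are hypothetical names, and the threshold
rule merely imitates the documented "threshold + scale factor *
reltuples" formula with its default values of 50 and 0.2.  It also
assumes a cached snapshot survives between tables, which is exactly
what the EOXact cleanup currently prevents.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class TableStats:
    """Hypothetical per-table counters taken from a stats snapshot."""
    dead_tuples: int
    live_tuples: int


def needs_vacuum(stats: TableStats,
                 base_threshold: int = 50,
                 scale_factor: float = 0.2) -> bool:
    # Imitates the documented rule: vacuum once dead tuples exceed
    # base threshold + scale factor * number of tuples.
    return stats.dead_tuples > base_threshold + scale_factor * stats.live_tuples


def two_stage_recheck(table: str,
                      cached_snapshot: Optional[Dict[str, TableStats]],
                      load_fresh_snapshot: Callable[[], Dict[str, TableStats]]) -> bool:
    """Return True if the table should still be vacuumed."""
    # Stage 1: consult whatever snapshot is lying around.  If even the
    # stale stats say the table is fine, skip the expensive re-parse.
    if cached_snapshot is not None:
        stale = cached_snapshot.get(table)
        if stale is not None and not needs_vacuum(stale):
            return False

    # Stage 2: the stale stats still call for a vacuum (or there are
    # none), so pay for a fresh snapshot and re-re-check.
    fresh = load_fresh_snapshot().get(table)
    return fresh is not None and needs_vacuum(fresh)
```

The expensive load_fresh_snapshot call happens only when the cheap
check cannot rule the table out, which is the whole point of the
ordering.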

We could also relax the freshness requirements of even the final
re-check, perhaps dynamically.  There is no point in re-parsing a 40MB
stats file to avoid unnecessarily vacuuming a 16KB table.  But parsing
an 8KB stats file to avoid unnecessarily vacuuming a 1TB table is well
worth it.  But that runs into the same problem as above: once you have
destroyed your previous stats snapshot, you no longer have the ability
to accept stale stats.
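
The size comparison above could be sketched roughly as follows, again
in Python for brevity.  The function name should_reparse_stats and the
cost_ratio knob are assumptions made up for illustration; nothing like
them exists in the backend.  The sketch covers only the decision
itself and does nothing about the missing stale snapshot.

```python
KB = 1024
MB = 1024 * KB
TB = 1024 ** 4


def should_reparse_stats(stats_file_bytes: int,
                         table_bytes: int,
                         cost_ratio: float = 1.0) -> bool:
    """Decide whether a fresh stats read is worth its cost.

    cost_ratio is a hypothetical tuning knob: how many bytes of table
    we are willing to vacuum needlessly to avoid parsing one byte of
    stats file.
    """
    # Re-parse only when reading the stats file is no more expensive
    # than the vacuum it might let us skip.
    return stats_file_bytes * cost_ratio <= table_bytes


# The two cases from the text above:
skip_small = should_reparse_stats(40 * MB, 16 * KB)   # big file, tiny table
check_big = should_reparse_stats(8 * KB, 1 * TB)      # tiny file, huge table
```

With the default ratio, the first case declines the re-parse and the
second accepts it.  A real version would presumably weigh I/O and CPU
cost rather than raw byte counts.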

Cheers,

Jeff
