Re: autovacuum next steps, take 2 - Mailing list pgsql-hackers

From Jim C. Nasby
Subject Re: autovacuum next steps, take 2
Msg-id 20070222192305.GM19527@nasby.net
In response to Re: autovacuum next steps, take 2  ("Matthew T. O'Connor" <matthew@zeut.net>)
Responses Re: autovacuum next steps, take 2
List pgsql-hackers
On Thu, Feb 22, 2007 at 09:32:57AM -0500, Matthew T. O'Connor wrote:
> Jim C. Nasby wrote:
> >On Wed, Feb 21, 2007 at 05:40:53PM -0500, Matthew T. O'Connor wrote:
> >  
> >>My Proposal:  If we require admins to identify hot tables, then: 
> >>1) Launcher fires-off a worker1 into database X.
> >>2) worker1 deals with "hot" tables first, then regular tables.
> >>3) Launcher continues to launch workers to DB X every autovac naptime. 
> >>4) worker2 (or 3 or 4 etc...) sees it is alone in DB X, if so it acts as 
> >>worker1 did above.  If worker1 is still working in DB X then worker2 
> >>looks for hot tables that are being starved because worker1 got busy. 
> >>If worker2 finds no hot tables that need work, then worker2 exits.
> >>    
> >
> >Rather than requiring people to manually identify hot tables, what if we
> >just prioritize based on table size? So if a second autovac process hits
> >a specific database, it would find the smallest table in need of
> >vacuuming that it should be able to complete before the next naptime and
> >vacuum that. It could even continue picking tables until it can't find
> >one that it could finish within the naptime. Granted, it would have to
> >make some assumptions about how many pages it would dirty.
> >
> >ISTM that's a lot easier than forcing admins to mark specific tables.
> 
> So the heuristic would be:
> * Launcher fires off workers into a database at a given interval 
> (perhaps configurable?)
> * Each worker works on tables in size order. 
> * If a worker ever catches up to an older worker, then the younger 
> worker exits.
> 
> This sounds simple and workable to me, perhaps we can later modify this 
> to include some max_workers variable so that a worker would only exit if 
> it catches an older worker and there are max_workers currently active.

That would likely result in a number of workers running in one database,
unless you limited the number of workers per database. And if you did
that, you wouldn't be addressing the frequently-updated-table problem.

A second vacuum in a database *must* exit after a fairly short time so
that we can go back in and vacuum the important tables again (or else
the second vacuum has to periodically re-evaluate which tables need to
be vacuumed).
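
To make the size-based idea concrete, here is a rough Python sketch of how
a second worker might behave: pick the smallest table in need of vacuuming
that it expects to finish within the naptime, re-evaluate after every
table, and exit once nothing else fits. Everything in it is an assumption
for illustration only: the Table class, the SECONDS_PER_PAGE cost model,
and the function names are made up and do not correspond to any actual
autovacuum code.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    pages: int          # table size in pages
    needs_vacuum: bool


# Hypothetical cost model: assumed seconds of vacuum work per page.
SECONDS_PER_PAGE = 0.01


def estimated_seconds(table):
    """Guess how long vacuuming this table would take."""
    return table.pages * SECONDS_PER_PAGE


def pick_next(tables, budget_seconds):
    """Return the smallest table needing vacuum that fits the budget, or None."""
    candidates = [
        t for t in tables
        if t.needs_vacuum and estimated_seconds(t) <= budget_seconds
    ]
    return min(candidates, key=lambda t: t.pages, default=None)


def second_worker(tables, naptime_seconds):
    """Vacuum small tables until nothing else fits in the naptime, then exit."""
    remaining = naptime_seconds
    done = []
    while True:
        table = pick_next(tables, remaining)  # re-evaluate on every pass
        if table is None:
            break                             # nothing fits: worker exits
        remaining -= estimated_seconds(table)
        table.needs_vacuum = False            # stand-in for the real VACUUM
        done.append(table.name)
    return done
```

With a 60-second naptime and tables of 100, 2000 and 100000 pages all
needing vacuum, this worker would handle the two small ones and leave the
large one to the first worker, which is the behavior argued for above.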
-- 
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

