Re: autovacuum next steps, take 2 - Mailing list pgsql-hackers

From Jim C. Nasby
Subject Re: autovacuum next steps, take 2
Date
Msg-id 20070227161831.GC29041@nasby.net
In response to Re: autovacuum next steps, take 2  ("Matthew T. O'Connor" <matthew@zeut.net>)
List pgsql-hackers
On Tue, Feb 27, 2007 at 01:26:00AM -0500, Matthew T. O'Connor wrote:
> Tom Lane wrote:
> >"Matthew T. O'Connor" <matthew@zeut.net> writes:
> >>I'm not sure what you are saying here, are you now saying that partial 
> >>vacuum won't work for autovac?  Or are you saying that saving state as 
> >>Jim is describing above won't work?
> >
> >I'm saying that I don't like the idea of trying to "stop on a dime" by
> >saving the current contents of vacuum's dead-TID array to disk with the
> >idea that we can trust those values 100% later.  Saving the array is
> >expensive both in runtime and code complexity, and I don't believe we
> >can trust it later --- at least not without even more expensive-and-
> >complex measures, such as WAL-logging every such save :-(
> >
> >I'm for stopping only after completing an index-cleaning pass, at the
> >point where we empty the dead-TID array anyway.  If you really have to
> >have "stop on a dime", just kill -INT the process, accepting that you
> >will have to redo your heap scan since the last restart point.
> 
> OK, so if I understand correctly, a vacuum of a table with 10 indexes on
> it can be interrupted 10 times, once after each index-cleaning pass? 
> That might have some value, especially breaking up the work required to 
> vacuum a large table. Or am I still not getting it?

It'd stop after scanning the heap, scanning all the indexes, and then
cleaning the heap. After that's done it no longer needs any of the
dead-TID info; anytime before that it does need that info, and Tom's
objection is that trying to store that info is a bad idea.
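
To make the stopping point concrete, here is a toy sketch of that cycle.
It is an illustrative model only, not PostgreSQL's vacuum code: the
function name, the dict/set data structures, and the `stop_requested`
callback are all assumptions invented for this example.

```python
from typing import Callable, Dict, List, Set


def vacuum_table(
    heap: Dict[int, bool],
    indexes: List[Set[int]],
    max_dead_tids: int,
    stop_requested: Callable[[], bool] = lambda: False,
) -> int:
    """Toy model of a vacuum (hypothetical, not PostgreSQL code).

    heap maps a tuple id (TID) to True if that tuple is dead.
    Each index is modelled as a set of the TIDs it points at.
    Returns the number of completed cycles.
    """
    tids = sorted(heap)        # fixed scan order over the heap
    pos = 0                    # restart point in the heap scan
    dead_tids: List[int] = []  # the in-memory dead-TID array
    cycles = 0

    while pos < len(tids):
        # Phase 1: scan the heap until the dead-TID array is full
        # or the end of the heap is reached.
        while pos < len(tids) and len(dead_tids) < max_dead_tids:
            if heap[tids[pos]]:
                dead_tids.append(tids[pos])
            pos += 1

        # Phase 2: scan every index in full, dropping dead entries.
        dead = set(dead_tids)
        for index in indexes:
            index.difference_update(dead)

        # Phase 3: clean the dead tuples out of the heap.
        for tid in dead_tids:
            del heap[tid]

        # The array is emptied here anyway, so nothing has to be
        # saved: this is the only point where stopping is free.
        dead_tids.clear()
        cycles += 1
        if stop_requested():
            break

    return cycles
```

Stopping anywhere inside phases 1 to 3 would mean either persisting
`dead_tids` or redoing the heap scan from the last restart point.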

The problem with this is that a complete vacuum cycle typically takes a
long time: minutes at least, and preferably longer. Shortening that
cycle greatly increases the I/O required to vacuum a table that has any
indexes, because every cycle has to read every index in full. That's
why I don't see stopping at cycle boundaries as useful for getting
autovac to work on hot tables: if you actually got the cycle time low
enough, you'd kill the system with all the extra index scanning.
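
A back-of-the-envelope sketch of that cost, under loudly simplified
assumptions: the heap is charged two full passes in total (one scan, one
cleaning) no matter how many cycles there are, and every cycle reads
every index in full. The function name and the page counts are made up
for illustration.

```python
from typing import Sequence


def vacuum_io_pages(
    heap_pages: int, index_pages: Sequence[int], cycles: int
) -> int:
    """Rough page reads to vacuum one table (simplified model).

    Assumption: heap work is split across cycles but totals two
    passes; index work is repeated in full on every cycle.
    """
    heap_io = 2 * heap_pages
    index_io = cycles * sum(index_pages)
    return heap_io + index_io


# Hypothetical table: 1000 heap pages, two indexes of 200 and 300 pages.
one_cycle = vacuum_io_pages(1000, [200, 300], cycles=1)    # 2500 pages
ten_cycles = vacuum_io_pages(1000, [200, 300], cycles=10)  # 7000 pages
```

In this model the heap cost stays flat while the index cost grows
linearly with the number of cycles, which is the extra index scanning
described above.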
-- 
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)