Thread: Re: autovacuum

Re: autovacuum

From: Chris Browne
matthew@zeut.net ("Matthew T. O'Connor") writes:
> Legit concern.  However one of the things that autovacuum is supposed to
> do is not vacuum tables that don't need it.  This can result in an overall
> reduction in vacuum overhead.  In addition, if you see that autovacuum is
> firing off vacuum commands during the day and they are impacting your
> response time, then you can play with the vacuum cost delay settings that
> are designed to throttle down the IO impact vacuum commands can have.  In
> addition if you use 8.1, you can set per table thresholds, per table
> vacuum cost delay settings, and autovacuum will respect the work done by
> non-autovacuum vacuum commands.  Meaning that if you manually vacuum
> tables at night during a maintenance window, autovacuum will take that
> into account.  Contrib autovacuum couldn't do this.
>
> Hope that helps.  Real world feed-back is always welcome.

I have a question/suggestion...

Something we found useful with Slony-I was the notion of checking the
eldest XID on the system to see if there was any point at all in
bothering to vacuum.  I don't see anything analogous in autovacuum.c;
this might well be a useful addition.

In the Slony-I cleanup thread loop, each iteration collects the
current earliest XID and checks whether it has changed since the
previous iteration.

- First time thru, the XID changes from 0 to 'some value', so the loop
  tries to do a vacuum.

- But supposing you have some long running transaction (say, a pg_dump
  that runs for 2h), it becomes pretty futile to bother trying to
  vacuum things for the duration of that transaction, because that
  long running transaction will, via MVCC, hold onto any old tuples.
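The check described above can be sketched roughly as follows.  This is
only an illustration in Python, not the Slony-I code: the class name
XidGate and the list of sample XIDs are made up for the example, and a
real implementation would ask the server for the oldest XID instead of
taking it from a list.

```python
class XidGate:
    """Remembers the oldest XID seen on the previous loop iteration."""

    def __init__(self) -> None:
        # 0 stands for "nothing seen yet", so the first real XID
        # always counts as a change.
        self.last_oldest_xid = 0

    def should_vacuum(self, current_oldest_xid: int) -> bool:
        # Vacuuming is only worth trying if the oldest XID has moved
        # since the last iteration; otherwise the same long-running
        # transaction is still holding on to the old tuples.
        changed = current_oldest_xid != self.last_oldest_xid
        self.last_oldest_xid = current_oldest_xid
        return changed


def run_iterations(oldest_xids, gate):
    """Return the vacuum/skip decision for each loop iteration."""
    return [gate.should_vacuum(xid) for xid in oldest_xids]


# Hypothetical sample: a long transaction pins the oldest XID at 1000
# for three iterations, then finishes and the oldest XID moves on.
decisions = run_iterations([1000, 1000, 1000, 1450], XidGate())
print(decisions)  # [True, False, False, True]
```

The first iteration vacuums because the remembered value goes from 0 to
1000; the next two are skipped; the last one vacuums again once the
oldest XID has advanced.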

It strikes me as a slick idea for autovacuum to take on that
behaviour.  If the daily backup runs for 2h, then it is quite futile
to bother vacuuming a table multiple times during that 2h period when
none of the tuples obsoleted during the 2h period will be able to be
cleaned out until the end.

Presumably this means that, during that 2h period, pg_autovacuum would
probably only issue ANALYZE statements...
--
let name="cbbrowne" and tld="ntlug.org" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/languages.html
Rules of  the Evil Overlord #51.  "If one of my  dungeon guards begins
expressing  concern over  the  conditions in  the beautiful  princess'
cell,  I  will immediately  transfer  him  to  a less  people-oriented
position." <http://www.eviloverlord.com/>

Re: autovacuum

From: Alvaro Herrera
Chris Browne wrote:

> It strikes me as a slick idea for autovacuum to take on that
> behaviour.  If the daily backup runs for 2h, then it is quite futile
> to bother vacuuming a table multiple times during that 2h period when
> none of the tuples obsoleted during the 2h period will be able to be
> cleaned out until the end.

Hmm, yeah, sounds useful.  There's one implementation issue to notice
however, and it's that the autovacuum process dies and restarts for each
iteration, so there's no way for it to remember previous state unless
it's saved somewhere permanent, as the stats info is.
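To make the restart issue concrete, here is a rough Python sketch of
keeping the previous oldest XID across process restarts.  It is an
assumption-laden illustration only: the JSON state file, its name, and
the helper functions are invented for the example, and autovacuum
itself would more plausibly keep such state alongside the stats info
than in a file like this.

```python
import json
import os
import tempfile


def load_last_xid(path):
    """Read the XID saved by the previous run; 0 if there is none."""
    try:
        with open(path) as f:
            return int(json.load(f)["oldest_xid"])
    except (FileNotFoundError, KeyError, ValueError):
        return 0


def save_last_xid(path, xid):
    """Write the XID to a temp file, then rename it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"oldest_xid": xid}, f)
    os.replace(tmp, path)


def one_iteration(path, current_oldest_xid):
    """One short-lived run: compare against the saved XID, then save."""
    previous = load_last_xid(path)
    save_last_xid(path, current_oldest_xid)
    return current_oldest_xid != previous


# Each call stands in for a separate process that starts, checks, and
# exits; only the state file carries anything from one run to the next.
with tempfile.TemporaryDirectory() as d:
    state = os.path.join(d, "autovac_state.json")  # hypothetical name
    results = [one_iteration(state, xid) for xid in (500, 500, 620)]

print(results)  # [True, False, True]
```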

However this seems at least slightly redundant with the "maintenance
window" feature -- you could set a high barrier to vacuum during the
daily backup period instead.  (Anybody up for doing this job?)

--
Alvaro Herrera                  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"No single strategy is always right (Unless the boss says so)"
                                                  (Larry Wall)

Re: autovacuum

From: "Matthew T. O'Connor"
Alvaro Herrera wrote:
> Chris Browne wrote:
>
>> It strikes me as a slick idea for autovacuum to take on that
>> behaviour.  If the daily backup runs for 2h, then it is quite futile
>> to bother vacuuming a table multiple times during that 2h period when
>> none of the tuples obsoleted during the 2h period will be able to be
>> cleaned out until the end.
>
> Hmm, yeah, sounds useful.  There's one implementation issue to notice
> however, and it's that the autovacuum process dies and restarts for each
> iteration, so there's no way for it to remember previous state unless
> it's saved somewhere permanent, as the stats info is.
>
> However this seems at least slightly redundant with the "maintenance
> window" feature -- you could set a high barrier to vacuum during the
> daily backup period instead.  (Anybody up for doing this job?)

I can't promise anything, but it's on my list of things to hopefully
find time for in the coming months.  No way I can start it in Feb, but
maybe sometime in March.  Anyone else?


Matt

Re: autovacuum

From: Chris Browne
alvherre@alvh.no-ip.org (Alvaro Herrera) writes:

> Chris Browne wrote:
>
>> It strikes me as a slick idea for autovacuum to take on that
>> behaviour.  If the daily backup runs for 2h, then it is quite futile
>> to bother vacuuming a table multiple times during that 2h period when
>> none of the tuples obsoleted during the 2h period will be able to be
>> cleaned out until the end.
>
> Hmm, yeah, sounds useful.  There's one implementation issue to notice
> however, and it's that the autovacuum process dies and restarts for each
> iteration, so there's no way for it to remember previous state unless
> it's saved somewhere permanent, as the stats info is.

Hmm.  It restarts repeatedly???  Hmmm...

> However this seems at least slightly redundant with the "maintenance
> window" feature -- you could set a high barrier to vacuum during the
> daily backup period instead.  (Anybody up for doing this job?)

In effect, this would be an alternative to the "window" feature.  You
open the window by starting pg_dump; pg_autovacuum would automatically
notice that as the eldest XID, and stop work until the pg_dump
actually finished.

In a way, it strikes me as more elegant; it would automatically notice
"backup windows," noticing *exact* start and end times...
-- 
select 'cbbrowne' || '@' || 'ntlug.org';
http://www.ntlug.org/~cbbrowne/linuxxian.html
"I'm sorry,  Mr.   Kipling, but you  just  don't know how to   use the
English Language."  -- Editor of the San Francisco Examiner, informing
Rudyard Kipling, who had one  article published in the newspaper, that
he needn't bother submitting a second, 1889


Re: autovacuum

From: Tom Lane
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Hmm, yeah, sounds useful.  There's one implementation issue to notice
> however, and it's that the autovacuum process dies and restarts for each
> iteration, so there's no way for it to remember previous state unless
> it's saved somewhere permanent, as the stats info is.

I think you'd really need to remember the previous oldest XID on a
per-table basis to get full traction out of the idea.  But weren't we
thinking of tracking something isomorphic to this for purposes of
minimizing anti-wraparound VACUUMs?

            regards, tom lane

Re: autovacuum

From: Chris Browne
tgl@sss.pgh.pa.us (Tom Lane) writes:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>> Hmm, yeah, sounds useful.  There's one implementation issue to notice
>> however, and it's that the autovacuum process dies and restarts for each
>> iteration, so there's no way for it to remember previous state unless
>> it's saved somewhere permanent, as the stats info is.
>
> I think you'd really need to remember the previous oldest XID on a
> per-table basis to get full traction out of the idea.  But weren't
> we thinking of tracking something isomorphic to this for purposes of
> minimizing anti-wraparound VACUUMs?

I think I'd like that even better :-).

In the Slony-I case, the tables being vacuumed are ones where the
deletion is taking place within the same thread, so that having one
XID is plenty enough because the only thing that should be touching
the tables is the cleanup thread, which is invoked every 10 minutes.
One XID is enough "protection" for that, at least as a reasonable
approximation.

Tracking just the one eldest XID is still quite likely to be
*reasonably* useful with autovacuum, assuming there isn't a by-table
option.  By-table would be better, though.
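For comparison, a by-table version might look something like the
Python sketch below.  Everything in it is hypothetical (the
PerTableGate class, the table names, the XID values); it only shows
the bookkeeping, namely remembering for each table which oldest XID
was in effect when that table was last vacuumed.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PerTableGate:
    # Oldest XID in effect when each table was last vacuumed.
    last_vacuum_xid: Dict[str, int] = field(default_factory=dict)

    def should_vacuum(self, table: str, oldest_xid: int) -> bool:
        # A table never vacuumed before defaults to 0, so it always
        # qualifies; otherwise vacuum only if the oldest XID has moved
        # since this particular table was last vacuumed.
        return self.last_vacuum_xid.get(table, 0) != oldest_xid

    def record_vacuum(self, table: str, oldest_xid: int) -> None:
        self.last_vacuum_xid[table] = oldest_xid


gate = PerTableGate()

# "orders" was vacuumed just after the backup pinned the oldest XID
# at 1000; "lineitems" has not been vacuumed yet.
gate.record_vacuum("orders", 1000)

# While the backup is still running, the oldest XID stays at 1000.
during_dump = {t: gate.should_vacuum(t, 1000) for t in ("orders", "lineitems")}

# The backup finishes and the oldest XID advances.
after_dump = gate.should_vacuum("orders", 1450)

print(during_dump)  # {'orders': False, 'lineitems': True}
print(after_dump)   # True
```

With a single remembered XID, "lineitems" would have been skipped for
the rest of the backup even though it had not yet been vacuumed against
the current oldest XID; the by-table record avoids that.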
-- 
(reverse (concatenate 'string "gro.mca" "@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/sgml.html
"Politics  is not a  bad  profession.  If  you succeed there  are many
rewards, if you disgrace yourself you can always write a book."
-- Ronald Reagan