Re: pg_autovacuum bug and feature request - Mailing list pgsql-hackers

From Matthew T. O'Connor
Subject Re: pg_autovacuum bug and feature request
Msg-id 1057597453.5150.18.camel@zeutrh9
In response to pg_autovacuum bug and feature request  (Vincent van Leeuwen <pgsql.spam@vinz.nl>)
List pgsql-hackers
On Fri, 2003-07-04 at 13:40, Vincent van Leeuwen wrote:
> I've been using pg_autovacuum for a couple of weeks now

Glad to hear it.

>  and have noticed one
> weird little bug: sometimes the daemon calculates it used a negative amount of
> time for the last vacuum it did, and waits no time at all before checking if
> it needs to run anything again. Sample output:
> 
> 2411 All DBs checked in: -717533400 usec, will sleep for 30 secs.

Strange, I have never seen this.  I run Red Hat and have tested on
RH 7.3, 8.0, and 9.  Christopher Browne has also worked on pg_autovacuum
and I have never heard of this problem from him either.

I would suggest upgrading to the version that is in cvs and seeing if
it's any better.
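
One possible explanation, offered only as a guess and not checked against
the pg_autovacuum source: if the elapsed time in microseconds is held in a
signed 32-bit integer, any run longer than about 35.8 minutes (2^31 usec)
wraps around to a negative number.  The sketch below only demonstrates that
arithmetic; the elapsed time of roughly 59.6 minutes is a hypothetical
value chosen because it wraps to exactly the number in your log.

```python
def wrap_int32(value):
    """Interpret an integer as a signed 32-bit two's-complement value."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


# Hypothetical true elapsed time: 3,577,433,896 usec, about 59.6 minutes.
elapsed_usec = 3_577_433_896

print(wrap_int32(elapsed_usec))  # -717533400
```

If that is what is happening, the negative figure would only show up after
a pass that ran for more than about 35 minutes, which would fit vacuums
that take half an hour per table.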

> The 30 secs is only because I ran it like this:
> pg_autovacuum -d 2 -s 30 -S 0 -t 250 -T 0.01 -U postgres
> 
> I'm using PostgreSQL 7.3.2 on Debian Linux, kernel 2.4.21-rc3.
> 
> 
> Also, I'd like to see a way to tell pg_autovacuum which tables it should
> monitor. I understand most setups would like to have all tables monitored, but
> on our setup pg_autovacuum is wasting most of its time (and a fair amount of
> serverload) vacuuming some large tables (several GBs of data, the vacuums
> regularly take half an hour per table or something in the very rough vicinity)
> which doesn't give a large win in performance anyway, while it should be
> focusing its efforts on a few intensively used small tables, where frequent
> vacuums are a much larger win for performance. I vacuum everything nightly
> anyway, so those large tables can be totally ignored by pg_autovacuum in my
> setup. As you can see from the weird -t and -T parameters I already tried to
> make it favor those smaller tables (which get about the same amount of updates
> as the large tables), but I'm not quite sure I'm doing it the right way.

First issue is that you are using an old version of pg_autovacuum;
please update.  Many of the command line options have also changed: the
threshold settings (-t, -T) have been split into independent settings
for separate vacuum and analyze thresholds (-v, -V and -a, -A).

If your large tables are being vacuumed too often, then your scaling
factor is too small.  The -V option says vacuum this table when the
number of updates / inserts / deletes reaches -V times the total number
of tuples in the table.  So -V = .01 says vacuum when 1% of the table
has been updated: if a table has 100k rows, it will get vacuumed every
1k updates.

I tried to address this problem by providing -v and -V.  pg_autovacuum
vacuums when (-v + -V*(num_rows_in_table)) updates occur (see
README.pg_autovacuum for more details on the calculations).  So I would
set your scaling factor higher.  The default settings in cvs are now
-v = 1000 and -V = 2.0.
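
To make the effect of the scaling factor concrete, here is a minimal
sketch of the threshold formula above.  The table names and row counts
are made up for illustration, and treating your old -t 250 -T 0.01 as a
base of 250 and a scale of 0.01 is an assumption about how the old
options map onto the new ones.

```python
def vacuum_threshold(base, scale, num_rows):
    """Number of changes before a vacuum: base + scale * num_rows."""
    return base + scale * num_rows


# Hypothetical tables: one small and heavily updated, one large.
tables = {"small_hot_table": 5_000, "large_table": 20_000_000}

# (label, base value, scaling factor)
settings = [
    ("base 250, scale 0.01", 250, 0.01),
    ("base 1000, scale 2.0 (cvs defaults)", 1000, 2.0),
]

for label, base, scale in settings:
    for name, rows in tables.items():
        threshold = vacuum_threshold(base, scale, rows)
        print(f"{label}: {name} ({rows:,} rows) "
              f"vacuumed after {threshold:,.0f} changes")
```

With a small scaling factor the thresholds for the two tables are
relatively close together, so the large table keeps getting picked up.
A larger factor pushes the large table's threshold far above the small
one's.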

Currently there is no way to tell pg_autovacuum specifically which
tables to check and which to ignore.  I have considered adding an option
to look in the current database for a pg_autovacuum table that would
provide a list of tables to check / ignore and allow for custom values
of scaling factors etc. on a per-database or per-table basis, but this
is not in cvs and won't be put in for 7.4.  Hopefully for 7.5 there will
be something integrated into the backend making this whole issue moot.

Good luck with this, and please email if you have any questions /
problems.

Matthew T. O'Connor


