Thread: pg_autovacuum bug and feature request

pg_autovacuum bug and feature request

From
Vincent van Leeuwen
Date:
Hi,

I've been using pg_autovacuum for a couple of weeks now, and have noticed one
weird little bug: sometimes the daemon calculates it used a negative amount of
time for the last vacuum it did, and waits no time at all before checking if
it needs to run anything again. Sample output:

2411 All DBs checked in: -717533400 usec, will sleep for 30 secs.

The 30 secs is only because I ran it like this:
pg_autovacuum -d 2 -s 30 -S 0 -t 250 -T 0.01 -U postgres

I'm using PostgreSQL 7.3.2 on Debian Linux, kernel 2.4.21-rc3.


Also, I'd like to see a way to tell pg_autovacuum which tables it should
monitor. I understand most setups would like to have all tables monitored, but
on our setup pg_autovacuum is wasting most of it's time (and a fair amount of
serverload) vacuuming some large tables (several GB's of data, the vacuums
regularly take half an hour per table or something in the very rough vicinity)
which doesn't give a large win in performance anyway, while it should be
focusing it's efforts on a few intensively used small tables, where frequent
vacuums are a much larger win for performance. I vacuum everything nightly
anyway, so those large tables can be totally ignored by pg_autovacuum in my
setup. As you can see from the weird -t and -T parameters I already tried to
make it favor those smaller tables (which get about the same amount of updates
as the large tables), but I'm not quite sure I'm doing it the right way.


Regards,

Vincent van Leeuwen
Media Design - http://www.mediadesign.nl/


Re: pg_autovacuum bug and feature request

From
"Matthew T. O'Connor"
Date:
Sorry for the slow response, I was away for the 4th.

On Fri, 2003-07-04 at 14:53, Christopher Browne wrote:
> Vincent Van Leeuwen wrote:
> > 2411 All DBs checked in: -717533400 usec, will sleep for 30 secs.
> 
> That sure looks like a 32 bit wraparound bug; 

Agreed, please check with pg_autovaccum from CVS and let us know if you
still see this problem.

> > Also, I'd like to see a way to tell pg_autovacuum which tables it
> > should monitor. I understand most setups would like to have all tables
> > monitored, but on our setup pg_autovacuum is wasting most of it's time
> > (and a fair amount of serverload) vacuuming some large tables 

See my other email about your choice of settings.  I believe that you
need to tweak your scaling factor settings to get pg_autovacuum to
behave in the manner you are looking for.  In fact I think the default
values used by pg_autovacuum (CVS version) will work for you.  If not,
please let me know.  In most instances -V < .5 will probably have a net
loss of performance, but it's there for testing purposes.

> The whole point of the architecture of pg_autovacuum is for it to be
> totally unnecessary to give any indication of which tables are to be
> vacuumed, and for the daemon to come up with reasonable answers all by
> itself.

Agreed this was a design principle for pg_autovacuum, and I think it
will get reasonable answers with reasonable settings.

> The FIRST approach I'd take would be to see if there are "tweaks" that
> might be made to the model it uses to determine when to vacuum.  Perhaps
> the formula should take account of table size, and thus vacuum less
> often for larger tables, 

It already does this.  That is what the scaling factor -V does, it is a
percentage of the table size that is added to -v (a base value) to
determine the threshold for vacuums.

> perhaps throwing in a "[-z] Table Pages
> Factor", where the calculation of "how often" would get added into it
> the value:
> 
>    pg_class.relpages * table_pages_factor
> 
> [Jan, Matthew; if you have thoughts on this, feel free to suggest
> further.]

There are lots of things we could try, but the next step I am planning
on taking, is to use the FSM as a guide to what should be vacuumed
when.  It has several benefits over the current setup.  The only thing I
don't know is if we can use it alone, or if we will still need to
monitor other sources of information such as the stats system.

> Of course, there is always the answer:  "Use the Source, Luke!"
> 
> The "local kludge" would be for you to customize pg_autovacuum to
> exclude your "not favorite" tables.  That oughtn't be too difficult to
> do, actually.  If you have several tables you don't want to deal with,
> you could do something like the following:
> 
> 
>   if ((strcmp(tbl->table_name, "table_i_dont_want")) ||
>       (strcmp(tbl->table_name, "another_table_i_dont_want")) ||
>       (strcmp(tbl->table_name, "still_another_table_i_dont_want"))) {
>     /* do nothing */
>   } else {
>     /* proceed with usual logic */
>     if ((tbl->curr_vacuum_count - tbl->CountAtLastVacuum) >=
>         tbl->vacuum_threshold)
> ...

A kludge for sure, but it is open source so...

> To provide the answer that you asked for, in a more general way, would
> require introducing either a data file parser (which is what made pgavd
> a serious pain to deploy) or that pg_autovacuum set up its own
> PostgreSQL tables and store data in them.  

I think a better setup would be to have the config information in the
database itself, or add a new system table that allows it.  A new system
table would require that pg_autovacuum be accecpted as a core component
of postgresql.  I don't see this happening, not as long as it's a libpq
based client app.  Using a table inside of a database makes it easy for
the settings to be tweaked live and reduces complexity.

> There _is_ merit to that; one present shortcoming of pg_autovacuum is
> that it can only talk to one postmaster.  If one were to, for instance,
> have _four_ backends (with 4 separate port numbers) on one server, you
> need four instances of pg_autovacuum, and they would be perfectly happy
> to trample the I/O bus if they each concurrently figure they need to
> vacuum some big tables in the respective instances.

Agreed, this was never built into the design, nor do I think it should
be.




Re: pg_autovacuum bug and feature request

From
"Matthew T. O'Connor"
Date:
On Fri, 2003-07-04 at 13:40, Vincent van Leeuwen wrote:
> I've been using pg_autovacuum for a couple of weeks now

Glad to hear it.

>  and have noticed one
> weird little bug: sometimes the daemon calculates it used a negative amount of
> time for the last vacuum it did, and waits no time at all before checking if
> it needs to run anything again. Sample output:
> 
> 2411 All DBs checked in: -717533400 usec, will sleep for 30 secs.

Strange, I have never seen this.  I run redhat and have tested with
RH7.3, 8.0, 9.  Christopher Browne has also worked on pg_autovacuum and
I have never heard of this problem from him either. 

I would suggest upgrading to the version that is in cvs and seeing if
it's any better.

> The 30 secs is only because I ran it like this:
> pg_autovacuum -d 2 -s 30 -S 0 -t 250 -T 0.01 -U postgres
> 
> I'm using PostgreSQL 7.3.2 on Debian Linux, kernel 2.4.21-rc3.
> 
> 
> Also, I'd like to see a way to tell pg_autovacuum which tables it should
> monitor. I understand most setups would like to have all tables monitored, but
> on our setup pg_autovacuum is wasting most of it's time (and a fair amount of
> serverload) vacuuming some large tables (several GB's of data, the vacuums
> regularly take half an hour per table or something in the very rough vicinity)
> which doesn't give a large win in performance anyway, while it should be
> focusing it's efforts on a few intensively used small tables, where frequent
> vacuums are a much larger win for performance. I vacuum everything nightly
> anyway, so those large tables can be totally ignored by pg_autovacuum in my
> setup. As you can see from the weird -t and -T parameters I already tried to
> make it favor those smaller tables (which get about the same amount of updates
> as the large tables), but I'm not quite sure I'm doing it the right way.

First issue is that you are using an old version of pg_autovaccum,
please update.  Also many of the command line options have changed, the
threshold settings (-t, -T) have been broken up into independent
settings for separate vacuum and analyze thresholds (-v -V and -a -A).

If your large tables are being vacuumed too often, then your scaling
factor is too small.  The -V option says vacuum this table when the
number of update / inserts / deletes = -T percent of the total tuples in
the table.  So, -V = .01 says vacuum when 1% of the tables has been
updated, so if a table has 100k rows, it will get vacuumed every 1k
updates.  

I tried to address this problem by providing -v and -V.  pg_autovacuum
vacuums when (-v + -V*(num_rows_in_table)) updates occur (See the
README.pg_autovacuum for more details on the calculations).  So I would
set your scaling factor higher.  The default settings in cvs are now -v
= 1000 and -V = 2.0

Currently there is no way to specifically tell pg_autovacuum what tables
to check and which to ignore.  I have considering adding an option of
looking in the current database for a pg_autovacuum table that would
provide a list of tables to check / ignore and allow for custom values
of scaling factors etc... on a per database or table basis, but this is
not in cvs and won't be put in for 7.4.  Hopefully for 7.5 there will be
something integrated into the backend making this whole issue moot.

Good luck with this, and please email if you have any questions /
problems.

Matthew T. O'Connor