Re: Autovacuum integration - Mailing list pgsql-patches

From Alvaro Herrera
Subject Re: Autovacuum integration
Msg-id 20050712233455.GB15464@alvh.no-ip.org
In response to Re: Autovacuum integration  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
On Fri, Jul 08, 2005 at 03:56:25PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Here is a second attempt at autovacuum integration.
>
> A few comments:

Ok, here is an updated patch.  Hopefully I have covered most of your
more important observations.  In particular, I changed the shutdown
sequence per your comments, and the pg_autovacuum tuple is now optional.

Additional comments:

> * I see you have an autovac_init function to "annoy the user", but
> shouldn't this be checked every time we are about to spawn an autovac
> process?

I didn't do anything about this (i.e. it only happens once).  Note that
if we annoy the user because of this, the autovacuum process is
disabled "forever."

> * I don't see any special checks for shared catalogs, which means they
> are probably going to be over-vacuumed; or possibly under-vacuumed if
> you fail to track the update stats for them in a single place rather
> than in each database.

I'm still not doing anything special about shared relations.  I think it
would be easy to treat them in a special way.

> * I have no objection to adding extra entry points to vacuum.c to
> simplify the calls to it.

I didn't do it, because it uglified the code.  Rather, I added a relid
member to VacuumStmt.

> If ANALYZE needs to send something to the stats system, make it do
> so.

It does, as does VACUUM.  I still think we should do something special
on TRUNCATE, maybe send a special message.


Note that I keep track of dead tuples directly in the stats for each
table, rather than keeping track of "last vacuum tuples", which was a
strange concept anyway.  It came out being much simpler this way.  The
only consideration is that it makes the vacuum case different from
analyze, but I don't see that as a problem.
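
A minimal sketch of what tracking dead tuples directly could look like,
under stated assumptions: the struct, the function names, and the
"base + scale * live tuples" threshold are all hypothetical and are not
taken from the patch.  The point is only that a completed VACUUM can
reset the dead-tuple counter itself, so no separate "last vacuum tuples"
snapshot is needed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-table stats entry; field names are illustrative only. */
typedef struct TableStats
{
    int64_t n_live_tuples;
    int64_t n_dead_tuples;   /* tracked directly, no "last vacuum" snapshot */
} TableStats;

/* Backend activity: inserts add live tuples. */
static void
count_insert(TableStats *t, int64_t n)
{
    t->n_live_tuples += n;
}

/* Backend activity: deletes turn live tuples into dead ones. */
static void
count_delete(TableStats *t, int64_t n)
{
    t->n_live_tuples -= n;
    t->n_dead_tuples += n;
}

/* Message sent by a completed VACUUM: dead tuples are gone. */
static void
report_vacuum(TableStats *t, int64_t live_found)
{
    t->n_live_tuples = live_found;
    t->n_dead_tuples = 0;
}

/* Assumed policy: vacuum once dead tuples exceed base + scale * live. */
static bool
needs_vacuum(const TableStats *t, int64_t base, double scale)
{
    double threshold = (double) base + scale * (double) t->n_live_tuples;

    return (double) t->n_dead_tuples > threshold;
}
```

With this shape the vacuum decision reads one counter, whereas an
analyze decision would presumably still need some record of the table's
state at the last ANALYZE, which is where the two cases differ.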

Also, there are tables for which ANALYZE refuses to run.  I'm not sure
what to do about them.  The problem is that since ANALYZE doesn't run to
completion, it doesn't emit the stat message, so we try to analyze the
table again the next time around.  I considered sending a special
message to the stats collector, telling it not to analyze the table in
the future (vacuum would still work as expected).  However, I don't see
how we would re-enable the auto-analyze feature in case something
happens to the table.  There are two cases: pg_statistic is one, and
the other is tables that don't have any analyzable columns.  (There is
at least one table of this kind in the regression tests, comprising a
single "box" column.)  This may turn out not to be a problem, since
ANALYZE will return very quickly in this case, but it annoys me anyway.
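
To illustrate the retry behaviour described above, here is a small
model, not the patch's code: all names and the simple change-count
threshold are assumptions made for the example.  A table whose ANALYZE
bails out early never resets its counter, so it is picked again on every
round.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical analyze-related state for one table. */
typedef struct AnlStats
{
    int64_t changes_since_analyze;  /* reset only by an ANALYZE stat message */
    int     analyzable_columns;     /* 0 for e.g. a table with one "box" column */
} AnlStats;

/* Stand-in for ANALYZE: reports to the stats only if it runs to completion. */
static bool
try_analyze(AnlStats *t)
{
    if (t->analyzable_columns == 0)
        return false;               /* returns early, no stat message sent */
    t->changes_since_analyze = 0;   /* effect of the stat message */
    return true;
}

/* Assumed policy: analyze once enough changes have accumulated. */
static bool
needs_analyze(const AnlStats *t, int64_t threshold)
{
    return t->changes_since_analyze > threshold;
}

/* Simulate several autovacuum rounds; return how often ANALYZE was tried. */
static int
count_analyze_attempts(AnlStats *t, int64_t threshold, int rounds)
{
    int attempts = 0;

    for (int i = 0; i < rounds; i++)
    {
        if (needs_analyze(t, threshold))
        {
            attempts++;
            (void) try_analyze(t);
        }
    }
    return attempts;
}
```

In this model an ordinary table is analyzed once and then left alone,
while a table with no analyzable columns is attempted on every round,
each attempt being cheap because it returns immediately.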

Finally: I didn't do anything special about TOAST tables yet.  I think
this is a separate problem.

--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
Thou shalt study thy libraries and strive not to reinvent them without
cause, that thy code may be short and readable and thy days pleasant
and productive. (7th Commandment for C Programmers)

