Re: Autovacuum in the backend - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Autovacuum in the backend
Msg-id 20050616035547.GA14519@surnet.cl
In response to Re: Autovacuum in the backend  ("Matthew T. O'Connor" <matthew@zeut.net>)
List pgsql-hackers
On Wed, Jun 15, 2005 at 11:42:17PM -0400, Matthew T. O'Connor wrote:
> Alvaro Herrera wrote:
> 
> >A question for interested parties.  I'm thinking of handling the
> >user/password issue by reading the flat files (the copies of pg_shadow,
> >pg_database, etc).
> >
> >The only thing that I'd need to modify is to add the datdba field to
> >pg_database, so we can figure out an appropriate user for vacuuming each
> >database.
> 
> I probably don't understand all the issues involved here, but reading
> pg_shadow by hand seems problematic.  Do you constantly re-read it?  
> What happens when a new user is added etc....

You don't read the pg_shadow table.  Rather, you read the pg_user file,
which is a plain-text file representing the information in pg_shadow.
It's kept up to date by backends that modify user information.  Likewise
for pg_database and pg_group.
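
To make the flat-file idea concrete, here is a rough Python sketch of the
reading side.  The line format (one "<datname> <owner>" pair per line) and
the name parse_flat_database_file are assumptions for illustration only;
this is not the real on-disk format of the flat files.

```python
from typing import Dict, Iterable


def parse_flat_database_file(lines: Iterable[str]) -> Dict[str, str]:
    """Map database name -> owner name from a hypothetical flat file.

    Assumed format (NOT the real one): "<datname> <owner>" per line,
    with blank lines ignored.
    """
    owners: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"malformed line: {raw!r}")
        datname, owner = fields
        owners[datname] = owner
    return owners


# Hypothetical file contents; since backends rewrite the file on change,
# a reader would simply re-parse it when it wants fresh data.
sample = ["template1 postgres\n", "\n", "sales alvherre\n"]
owners = parse_flat_database_file(sample)
```

Because the file is plain text, picking up a newly added user or database
amounts to parsing it again; no catalog access is needed.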

> Can't autovacuum run as a super-user that can vacuum anything?

That'd be another way to do it, maybe simpler.

Currently I'm working on separating this into two parts though, one being
a shlib and the other the standard postmaster-launched backend process.  So
I don't have to address this issue right now.  It just bothered me to
need a separate file with username and password, and the corresponding
code to read it.


One issue I do have to deal with right now is how many autovacuum
processes we want to be running.  The current approach is to have one
autovacuum process.  Two other options would be to have one per
database, or one per tablespace.  What do people think?

I'm leaning toward the simpler option myself but I'd like to hear more
opinions.  Particularly since one-per-database makes the code a lot
simpler as far as I can see, because the shlib only needs to worry about
issuing VACUUM commands; with the other approaches, the shlib has to
manage everything (keeping the pg_autovacuum table up to date, figuring
out when vacuums are needed, etc.).

The main problem with one-per-database is that we wouldn't have a
(simple) way of coordinating vacuums so that they don't compete for I/O.
That's why I thought of the one-per-tablespace approach, though that one
is the most complex of all.
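
As a rough illustration of the one-per-tablespace idea, the Python sketch
below groups pending vacuums into one queue per tablespace, so that a
single worker per tablespace would run them one after another instead of
competing for the same disks.  The function name plan_by_tablespace and the
(database, table, tablespace) tuples are hypothetical; nothing like this
exists in the code yet.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (database, table, tablespace) for each table that needs vacuuming.
Pending = Tuple[str, str, str]


def plan_by_tablespace(pending: List[Pending]) -> Dict[str, List[Tuple[str, str]]]:
    """Build one ordered work queue per tablespace.

    Tables sharing a tablespace land in the same queue, so one worker
    per queue never has two vacuums hitting the same tablespace at once.
    """
    queues: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for database, table, tablespace in pending:
        queues[tablespace].append((database, table))
    return dict(queues)


# Hypothetical workload spanning two databases and two tablespaces.
pending = [
    ("sales", "orders", "fast_disks"),
    ("sales", "archive", "slow_disks"),
    ("hr", "employees", "fast_disks"),
]
plan = plan_by_tablespace(pending)
```

Note that a queue can mix tables from different databases, which is part
of what makes this approach more complex: the worker for a tablespace has
to connect to several databases in turn.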

-- 
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Un poeta es un mundo encerrado en un hombre" (Victor Hugo)

