Re: Autovacuum in the backend - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Autovacuum in the backend
Date 2005-06-16
Msg-id 20050616150833.GB16044@surnet.cl
In response to Re: Autovacuum in the backend  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Autovacuum in the backend
Re: Autovacuum in the backend
List pgsql-hackers
On Thu, Jun 16, 2005 at 01:32:16AM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@surnet.cl> writes:
> > A question for interested parties.  I'm thinking of handling the
> > user/password issue by reading the flat files (the copies of pg_shadow,
> > pg_database, etc).
> 
> Er, what "user/password issue"?  Context please.
> 
> > The only thing that I'd need to modify is add the datdba field to
> > pg_database, so we can figure out an appropriate user for vacuuming each
> > database.
> 
> The datdba is not necessarily a superuser, and therefore is absolutely
> not the right answer for any question related to autovacuum.  But in
> any case, I would expect that an integrated-into-the-backend autovac
> implementation would be operating at a level below any permission checks
> --- so this question shouldn't be relevant anyway.

OK, it seems things are quite a bit out of context.  What I did was take
Matthew's patch for integrating contrib pg_autovacuum into the
postmaster.  That patch was posted several times around July and August
2004.  It had several issues: an incorrect shutdown sequence, forcing
libpq to be statically linked into the backend, incorrect use of
ereport(), and not using the backend's memory management
infrastructure.

There were several suggestions.  One was to separate it into two parts:
one a process launched by the postmaster, and the other a shared
library, loaded by that process, which would in turn load libpq and
issue SQL commands (including but not limited to VACUUM and ANALYZE) to
a regular backend over a regular connection.
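
For illustration only, here is a minimal sketch of what the libpq side
of that shared library could look like.  The function name, the conninfo
handling, and the error reporting are my own placeholders, not anything
from Matthew's patch:

#include <stdio.h>
#include "libpq-fe.h"

/*
 * Sketch: connect to one database and run a single maintenance command.
 * In the real thing the conninfo would be built from the flat files or
 * a password file; here it is just an opaque string.
 */
static int
autovac_run_command(const char *conninfo, const char *command)
{
    PGconn     *conn;
    PGresult   *res;

    conn = PQconnectdb(conninfo);
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return -1;
    }

    res = PQexec(conn, command);        /* e.g. "VACUUM ANALYZE" */
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "%s failed: %s", command, PQerrorMessage(conn));

    PQclear(res);
    PQfinish(conn);
    return 0;
}

The point is only that everything goes through a normal frontend
connection, with whatever privileges that connection's user has.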

Now, the user/password issue is which user and password combination is
used to connect to the regular backend.  Matthew had created a password
file, used in a similar fashion to libpq's password file.  This works,
but has the drawback that the user has to set up the file correctly.
What I'm proposing is to use the flat files for this instead.
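
To make that concrete: a password file in the libpq (~/.pgpass) style is
just lines of host:port:database:username:password, so Matthew's scheme
amounts to asking the admin to keep something like this up to date by
hand (the values here are invented):

localhost:5432:mydb:autovac:some-secret
localhost:5432:otherdb:autovac:some-secret

Reading the flat-file copies of pg_shadow and pg_database instead would
avoid that extra piece of manual configuration.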


Now, I'm hearing that people don't like using libpq.  That makes the
whole thing a lot more complicated; for one thing, the process would
need to "connect" to every database in some fashion.  Also, you want it
to skip normal permission checks, which is doable only if it's not
using libpq.  On the other hand, if there were multiple autovacuum
processes, one per database, it would all be much easier without using
libpq.

Could we clarify what scenario people are envisioning?  I don't want to
waste time fixing code that in the end is going to be declared
fundamentally flawed -- I'd rather work on shared dependencies.

Some people say "keep it simple and have one process per cluster."  I
think they don't realize that is actually the more complex option, not
the other way around.  With one process per database, the only
additional complexity is handling concurrent vacuuming; the code itself
turns out to be simpler, because each process has straightforward access
to the system catalogs and the standard backend infrastructure (see the
sketch below).
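
As a rough illustration of what that access buys, a per-database worker
running with normal backend infrastructure could scan pg_class directly
instead of issuing SQL over a frontend connection.  This is only a
sketch using 2005-era backend calls; the function name and the "decide
what to vacuum" logic are invented:

#include "postgres.h"

#include "access/heapam.h"
#include "catalog/pg_class.h"
#include "utils/tqual.h"

/*
 * Sketch: walk pg_class (OID 1259) and log each plain relation,
 * as a stand-in for deciding which tables need vacuuming.
 */
static void
autovac_scan_relations(void)
{
    Relation        rel;
    HeapScanDesc    scan;
    HeapTuple       tuple;

    rel = heap_open(1259, AccessShareLock);     /* pg_class */
    scan = heap_beginscan(rel, SnapshotNow, 0, NULL);

    while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
    {
        Form_pg_class classForm = (Form_pg_class) GETSTRUCT(tuple);

        if (classForm->relkind == RELKIND_RELATION)
            elog(LOG, "candidate table: %s (%.0f tuples)",
                 NameStr(classForm->relname), classForm->reltuples);
    }

    heap_endscan(scan);
    heap_close(rel, AccessShareLock);
}

No libpq, no passwords, and no SQL parsing involved -- the worker uses
the same catalog access routines any backend uses.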



A wholly separate question is what the autovacuum daemon should be
doing.  At present we only have "full vacuum", "vacuum" and "analyze".
In the future this can be extended so autovacuum can launch partial
vacuums, nappy vacuums, bitmapped vacuums, coffee-with-cream vacuums.
But we need to start somewhere.

-- 
Alvaro Herrera (<alvherre[a]surnet.cl>)
"¿Qué importan los años?  Lo que realmente importa es comprobar que
a fin de cuentas la mejor edad de la vida es estar vivo"  (Mafalda)

