Re: pg_autovacuum next steps - Mailing list pgsql-hackers

From Gavin Sherry
Subject Re: pg_autovacuum next steps
Date
Msg-id Pine.LNX.4.58.0403221929440.3239@linuxworld.com.au
In response to Re: pg_autovacuum next steps  (Gavin Sherry <swm@linuxworld.com.au>)
List pgsql-hackers
> > Ok, thanks for the offer to help, but I think I understated things above
> > when I said I'll need a "little" help :-)
> >
>
> I haven't looked at the code but...
>
> > I have a few big picture questions.  Once pg_autovacuum is launched as a
> > postmaster sub-process, what changes?  That is, currently pg_autovacuum
> > uses libpq to connect to a database and issue queries including a vacuum
> > / analyze command when needed.  After becoming a subprocess will
> > (should) it still use libpq to connect to the databases, I don't think
>
> It could use libpq but most definitely shouldn't.
>
> > so, is it even possible to do that?  If not, how will it check out the
> > stats of all the different databases?  I guess it should fork() a new
> > backend, connect to it somehow, and use it to query the database, but
> > I'm really not sure how this works.
>
> It can interact with the stats collector (a separate process) in the same
> way that existing backends do: through a domain socket.
>
> > I'm looking through the backend startup code to see how the stats
> > collector and the bgwriter work since they are probably two semi-close
> examples of what I'll have to do.  I think the checkpoint process does
> something similar in that it issues a checkpoint command.
>
> The vacuum backend will call vacuum() (or something very like it)
> directly. I imagine that when it gets called, and against which tables,
> will be determined by the existing algorithm.

One point is this: vacuum() assumes that you are running in a fully
fledged backend. There'd be a fair bit of work involved in allowing a
single process to call vacuum() against multiple databases. As such, I
think that a vacuum backend for a specific database should be forked upon
the first connect. Also, the backend might like to try and work out, every
so often, whether there are any active backends for its database and, if
not, perform a final vacuum (if necessary) and exit, so that we don't have
lots of idle processes sitting around.

Is there a better approach than this?

Gavin

