Re: autovauum integration patch: Attempt #4 - Mailing list pgsql-patches

From Matthew T. O'Connor
Subject Re: autovauum integration patch: Attempt #4
Date
Msg-id 1091506020.4901.31.camel@zedora2
Whole thread Raw
In response to Re: autovauum integration patch: Attempt #4  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: autovauum integration patch: Attempt #4
List pgsql-patches
On Mon, 2004-08-02 at 19:26, Tom Lane wrote:
> I looked over this patch (sorry for the delay), and found a number of
> problems.

Thanks for the feedback, hopefully we can still get something in place
for 7.5.

> Bigger problems:
>
> * I don't think you've thought through system shutdown at all.  The
> postmaster code as given will not try to shutdown the autovac daemon
> until the same time it shuts down the bgwriter, which is no good for
> a couple of reasons:

[ snip: lots of good reasons autovac should shutdown before bgwriter ]

I agree.  The thought had crossed my mind that autovac should shut down
first, but I'm really not sure how to make that happen.  I was only able
todo the backend integration at all by cribbing off the bgwriter example
(as it was was suggested I do).  So....

I'll take a stab at this, but some direction would be appreciated.

> * I hadn't quite focused before on the fact that this patch requires
> statically binding frontend libpq into the backend.

[ snip ]

Based on the other part of this thread, I think this is solved by
dynamic linking?  But as I said in the other email, I will need help
making this happen.

> * AFAICS there is no provision for setting pg_user or pg_user_password,
> which means that the daemon won't actually be able to connect in
> non-TRUST environments.  I don't know what we do about this: putting
> such a password in a GUC variable is right out (because any user could
> read it) and putting it on the postmaster command line is no better
> (anyone on the same machine can see it).  Right at the moment we do not
> have any secure place for postmaster parameters.

Right, I didn't want to use GUC vars for this.  As I said in the other
email, I will take a whack at using a new special .autovacpasswd file in
the $PGDATA dir.

BTW, is this really something that needs to get solved?  It could just
be considered a limitation of the 7.5 implementation that it requires
local trust or ident authentication.

> Smaller problems:
>
> * It still contains too much code copied-and-pasted from bgwriter,
> such as
>             ShutdownXLOG(0, 0);
>             DumpFreeSpaceMap(0, 0);
> autovac has *no* business doing that.  I don't have time to go through
> it line-by-line to see what else shouldn't have been copied.

Yeah, as I said above, I cribbed heavily off the bgwriter example.  I
left in most of the things that I was unsure about thinking that it
would be more obvious to remove the code then to readd it.  I'll remove
the above code and try to see if there is anything else that should go
away.

> * The patch reverts some recent changes in postmaster.c (write_stderr
> changes for instance).

I'm sure this is just due to the fact that the patch was submitted prior
to that work going in.  I'll update the patch.

> * Emitting this warning on every single iteration of the postmaster idle
> loop is excessive:
>     elog(WARNING, "pg_autovacuum: autovac is enabled, but requires stats_row_level which is not enabled");
> and this one even more so:
>     if (!autovacuum_start_daemon)
>         elog(DEBUG1, "pg_autovacuum: not enabled");

I left it this way on purpose as it didn't want the admin to miss it,
and I thought they might if it only emits the warning once on postmaster
startup.  I figure that the admin see this while setting up their server
initially and never see it again.   I'm more then happy to change this
if you still think I should.

> * Any message that's actually likely to be seen by users should be an
> ereport, not elog.  elog is okay for debugging messages and can't-happen
> errors, but not for messages that will be seen by ordinary users,
> because there's no support for message translation in elog.

I looked into changing my elog calls to ereport, but I thought it wasn't
necessary since ordinary users won't see these messages, they will only
be in the postmaster log file.

> * I don't think you've actually thought very carefully about elog-based
> error handling; for instance this bit:
>             dbs->conn = db_connect(dbs);
>             if (dbs->conn == NULL)
>             {                    /* Serious problem: We can't connect to
>                                  * template1 */
>                 elog(ERROR, "pg_autovacuum: Cannot connect to template1, exiting.");
>                 exit(1);
>             }
> Control won't make it to the exit(1)  (which is a darn good thing in
> itself, seeing that you are connected to shared memory and hence had
> better be using proc_exit()).  What *will* happen in the code as given
> is that control will bounce back to the sigsetjmp, which will recover
> and re-call AutoVacLoop, which would be okay except you just forgot
> any open backend connections you have (plus leaked all the memory in
> your datastructures).  Maybe it would be better if you did not have
> an error catcher but just aborted the process on any serious error,
> letting the postmaster start a new one.  (This ties into your FIXME
> note about how the postmaster should react to autovac crashes...)

Ok, some of this cruft is related to its roots as a stand-alone app.  I
will fix this too.  So the correct answer here is to skip the
elog/ereport and call proc_exit(), or should I just downgrade the elog
to NOTICE and then call proc_exit()?  Lemme know and I'll fix.

> * autovacuum.h exports way too much stuff that is not relevant to
> other modules.  (Rule #1: you don't declare static functions in a
> header file; these *will* provoke warnings.)

Yeah, someone made these changes to pg_autovacuum.h while it was still
in contrib.  I just assumed that who ever made the changes knew what
they were doing :-)  I'll change them back.

> I'm not sure what we do now.  I can't apply this in its current state,
> and I do not have time to fix it.  I don't really want to push it in
> and assume we can fix the problems during beta ...

I don't know either as it's not my decision to make :-)  As I said
though, I'm willing to make all the fixes I mentioned above if you tell
me there is a decent change it will get applied 7.5.

BTW, despite the simplicity of autovac, lots of people are using the
contrib version and it seems to be helping.  So while it certainly has
room for improvement, I think even in it's current state, it is
considered and important and useful tool by lots of users.

Thanks again for the code review, let me know what I should do, and I'll
get on it ASAP.

Matthew




pgsql-patches by date:

Previous
From: "Matthew T. O'Connor"
Date:
Subject: Re: autovauum integration patch: Attempt #4
Next
From: Zhenbang Wei
Date:
Subject: postgres-zh_TW.po for 7.5