Re: Vacuum daemon (pgvacuumd ?) - Mailing list pgsql-hackers

From Rod Taylor
Subject Re: Vacuum daemon (pgvacuumd ?)
Date
Msg-id 04e401c1c4bc$f7645cd0$b002000a@jester
In response to Vacuum daemon (pgvacuumd ?)  (mlw <markw@mohawksoft.com>)
List pgsql-hackers
> (for background, see conversation: "Postgresql backend to perform vacuum
> automatically" )
>
> In the idea phase 1, brainstorm
>
> Create a table for the defaults in template1
> Create a table in each database for state information.
>
> Should have a maximum duty cycle for vacuum vs non-vacuum on a per table
> basis. If a vacuum takes 3 minutes, and a duty cycle is no more than 10%,
> the next vacuum can not take place for another 30 minutes. Is this a table
> or database setting? I am thinking table. Anyone have good arguments for
> database?

I'd vote for database (or even system) settings personally, as those
tables which don't get updated simply won't have vacuum run on them.
Those that do will.  Vacuum anywhere will degrade performance as it's
additional disk work.  To top that off, if it's a per table duty cycle
you need to add additional checks to prevent vacuum from running on
all or several tables at the same time.  Duty cycle per DB (single
vacuum tracking per db) will limit to a single instance of vacuum.

I'm a little concerned about duty cycle.  Why limit?  If a table's
access speed could be increased enough to outweigh the cost of the
vacuum, it should always be done.  Perhaps a generic cost > 500 + (15%
tuples updated / deleted) would work.  That is, a %age dead tuples,
plus a base to keep it from constantly firing on nearly empty tables.

Do the table, and pick the next worst off (if more than one requires
vacuum).  Perhaps frequency of selects weighs in here too.
15% dead in a table receiving 99% selects is worse than 100% dead in a
table receiving 99% updates, as the former will see more long-term
effect from doing it now.  The table with updates is probably constantly
putting up requests anyway.
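
One way the "worst off" ordering could be sketched, purely as an
illustration.  The scoring rule here (dead fraction multiplied by the
share of statements that are selects) is an assumption chosen because it
reproduces the example above; the class and function names are
hypothetical and nothing like them exists in the backend.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    name: str
    dead_fraction: float    # dead tuples / total tuples, 0.0 to 1.0
    select_fraction: float  # selects / all statements against the table


def vacuum_priority(table):
    # Assumed score: dead space hurts most where reads dominate.
    return table.dead_fraction * table.select_fraction


def pick_next(tables):
    """Return the table with the highest score, or None if there are none."""
    candidates = list(tables)
    if not candidates:
        return None
    return max(candidates, key=vacuum_priority)


tables = [
    TableStats("read_mostly", dead_fraction=0.15, select_fraction=0.99),
    TableStats("write_heavy", dead_fraction=1.00, select_fraction=0.01),
]
print(pick_next(tables).name)
```

With these numbers the read-mostly table scores roughly 0.15 against 0.01
for the write-heavy one, so it is vacuumed first.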

I'd suggest making the base and %age dead tuple numbers GUCable rather
than stored in a system table.  It's probably not something we want
people playing with easily -- especially when they can still run
vacuum manually.


Finally, are the stats you're collecting based on completed transactions
or do they include ones that are rolled back as well?  100 updates
rolled back is just as evil as 100 that completed -- speed-wise
anyway.


