Shridhar,
> However I do not agree with this logic entirely. It pegs the next vacuum
> w.r.t current table size which is not always a good thing.
No, I think the logic's fine; it's the numbers that are wrong. We want to
vacuum when updates reach between 5% and 15% of total rows, NOT when
updates reach 110% of total rows ... that's much too late.
Hmmm ... I also think the threshold level needs to be lowered; I guess the
purpose was to prevent continuous re-vacuuming of small tables?
Unfortunately, in the current implementation, the result is that small tables
never get vacuumed at all.
So for defaults, I would peg -V at 0.1 and -v at 100, making the default
calculation for a table with 10,000 rows:
100 + ( 0.1 * 10,000 ) = 1,100 rows.
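To make the arithmetic concrete, here is a rough sketch of the proposed
threshold calculation (plain Python, with names I made up for illustration;
this is not pg_autovacuum's actual code):

    # Proposed vacuum threshold: base + (scale * current row count).
    # With -v 100 (base) and -V 0.1 (scale), a 10,000-row table would be
    # vacuumed once roughly 1,100 rows have been updated or deleted.
    def vacuum_threshold(ntuples, base=100, scale=0.1):
        return base + scale * ntuples

    def needs_vacuum(tuples_changed, ntuples, base=100, scale=0.1):
        return tuples_changed >= vacuum_threshold(ntuples, base, scale)

    print(vacuum_threshold(10000))      # 1100.0
    print(needs_vacuum(2000, 10000))    # True
    print(needs_vacuum(500, 10000))     # False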
> I would rather vacuum the table at 2000 updates, which is what you probably
> want.
Not necessarily. That would be painful if the table has 10,000,000 rows:
2000 updates is only 0.02% of such a table, so you would be re-vacuuming a
very large table almost constantly. It *should* be based on a % of rows.
> Furthermore analyze threshold depends upon inserts+updates. I think it
> should also depends upon deletes for obvious reasons.
Yes. Vacuum threshold is counting deletes, I hope?
> What did you expected in above example? It is not difficult to tweak
> pg_autovacuum calculations. For testing we can play around.
Can I set the settings to decimals, or are they integers?
> vacthresh = vacbase*vacscale
> anathresh = anabase + anascale*ntuples
Nope, see above; the vacuum threshold should scale with ntuples the same
way the analyze threshold does.
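To spell that out: the quoted vacthresh gives a fixed number that never
grows with the table, while anathresh already does. What I'm proposing is a
vacthresh with the same shape as anathresh. Another small sketch (again my
own names, not the actual pg_autovacuum source):

    # Quoted formulas from above, written out for comparison:
    def quoted_vacthresh(vacbase, vacscale):
        return vacbase * vacscale               # fixed, ignores table size

    def quoted_anathresh(anabase, anascale, ntuples):
        return anabase + anascale * ntuples     # grows with table size

    # What I'm proposing: give vacuum the same shape as analyze.
    def proposed_vacthresh(vacbase, vacscale, ntuples):
        return vacbase + vacscale * ntuples

    # With vacbase=100 and vacscale=0.1 this reproduces the 1,100-row
    # example above, and scales sensibly for a 10,000,000-row table.
    print(proposed_vacthresh(100, 0.1, 10000))      # 1100.0
    print(proposed_vacthresh(100, 0.1, 10000000))   # 1000100.0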
My comment about the frequency of vacuums vs. analyzes is that currently the
*default* is to analyze twice as often as you vacuum. Based on my
experience as a PG admin on a variety of databases, I believe that the
default should be to analyze half as often as you vacuum.
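In threshold terms, "analyze half as often as vacuum" just means the analyze
threshold should sit at roughly double the vacuum threshold. A throwaway
illustration with my own numbers, not actual defaults:

    # If both thresholds count tuple changes, doubling the analyze
    # threshold makes analyze fire about half as often as vacuum.
    def thresholds(ntuples, vacbase=100, vacscale=0.1):
        vacthresh = vacbase + vacscale * ntuples
        anathresh = 2 * vacthresh
        return vacthresh, anathresh

    print(thresholds(10000))   # (1100.0, 2200.0)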
> I am all for experimentation. If you have real life data to play with, I
> can give you some patches to play around.
I will have real data very soon .....
--
Josh Berkus
Aglio Database Solutions
San Francisco