Josh Berkus wrote:
> Shridhar,
>>However I do not agree with this logic entirely. It pegs the next vacuum
>>w.r.t current table size which is not always a good thing.
>
>
> No, I think the logic's fine, it's the numbers which are wrong. We want to
> vacuum when updates reach between 5% and 15% of total rows. NOT when
> updates reach 110% of total rows ... that's much too late.
Well, it looks like thresholds below 1 should be the norm rather than the exception.
> Hmmm ... I also think the threshold level needs to be lowered; I guess the
> purpose was to prevent continuous re-vacuuming of small tables?
> Unfortunately, in the current implementation, the result is that small tables
> never get vacuumed at all.
>
> So for defaults, I would peg -V at 0.1 and -v at 100, so our default
> calculation for a table with 10,000 rows is:
>
> 100 + ( 0.1 * 10,000 ) = 1100 rows.
I would say -V in the range 0.2-0.4 could work well too. The point to emphasize is
that thresholds less than 1 should be used.
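To make the arithmetic concrete, here is a rough sketch of the calculation above. The function name and parameter names are made up for illustration and are not taken from the pg_autovacuum source; only the formula (base value plus scaling factor times row count) comes from this thread.

```python
def vacuum_threshold(base, scale, reltuples):
    """Number of changed rows that triggers a vacuum.

    Hypothetical helper: base corresponds to -v, scale to -V,
    reltuples to the table's current row count.
    """
    return base + scale * reltuples


if __name__ == "__main__":
    # Proposed defaults from this thread: -v 100, -V 0.1
    for rows in (100, 10_000, 1_000_000):
        print(rows, vacuum_threshold(100, 0.1, rows))
```

With a scaling factor below 1 the threshold stays a fraction of the table size, and the base value dominates for small tables.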
>>Furthermore analyze threshold depends upon inserts+updates. I think it
>>should also depend upon deletes for obvious reasons.
> Yes. Vacuum threshold is counting deletes, I hope?
It does.
> My comment about the frequency of vacuums vs. analyze is that currently the
> *default* is to analyze twice as often as you vacuum. Based on my
> experience as a PG admin on a variety of databases, I believe that the default
> should be to analyze half as often as you vacuum.
OK.
>>I am all for experimentation. If you have real life data to play with, I
>>can give you some patches to play around.
> I will have real data very soon .....
I will submit a patch that accounts for deletes in the analyze threshold. Since you
want to delay the analyze, I would calculate the analyze count as
n = updates + inserts *-* deletes
rather than the current "n = updates + inserts". I will also update the README with
examples and a note on analyze frequency.
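As a sketch of the difference between the two counts (the function names are hypothetical, and the counters are assumed to be the per-table update, insert and delete totals since the last analyze):

```python
def analyze_count_current(updates, inserts):
    # Current behaviour as described above: deletes are ignored.
    return updates + inserts


def analyze_count_proposed(updates, inserts, deletes):
    # Proposed: subtract deletes, so the count grows more slowly
    # and the analyze threshold is reached later.
    return updates + inserts - deletes


if __name__ == "__main__":
    # Made-up activity numbers, only to show the effect.
    print(analyze_count_current(500, 300))
    print(analyze_count_proposed(500, 300, 200))
```

For the same activity the proposed count is never larger than the current one, which is what delays the analyze.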
What do the statistics gather, BTW? Just the number of rows, or something else as well?
I think I would put that on Hackers separately.
I am still wary of inverting the vacuum/analyze frequency. Do you think it is better to
set an inverted default rather than documenting it?
Shridhar