Thread: Figuring autovacuum settings

Figuring autovacuum settings

From
"Chris Hoover"
Date:
Ok, I'm still working on the possibility of turning autovacuum on.

Question, I have a tables that are multi GB.  When I run the threshold calculations using "normal" values, it seems that my large important tables would almost never be vacuumed.  Here is an example, in my largest database, my most important table has 1,855,970 tuples in it.  However, this is a table with 3 years of data.  It has data loaded into it every day and is key to the majority of our queries.  In the past 16 hours or so I have had the following activity (via pg_stat_user_indexes) - 2960 tuples inserted and 40389 tuples updated, and no tuples deleted.

If I understand the threshold calculations, then I need to run with autovacuum_vacuum_scale_factor set to between .005 and .001.  Does this seem to be to small?

Here is the math:

vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples

vacuum_threshold = 200 + (.001*1,855,970)
vacuum_threshold = 2,056

vacuum_threshold = 200 + (.005*1,855,970)
vacuum_threshold = 9,480

It would seem with the numbers given that I should set it somewhere in this range.  Do you agree?

If you don't agree, where would you recomment setting them.

This is my largest database and is over 110GB.  It has tables with 305,702,000 reltuples down to about 500,000 reltuples (for important tables).

I'm just not sure how to properly set the numbers. 

My servers have databases in the folowing ranges:
Server 1 66 GB to 100 MB
Server 2 117 GB to 1.70 GB.
Server 3 95 GB to 4 GB
Server 4 83 GB to 1.2 GB

While the sizes differ, the ratios should be the same since they are identical databases.

Any suggestions about recommendations for the autovacuum values?  Also, should I set the autovacuum just at the cluster level, or should I also set some of the larger tables with table specific values.

Thanks,

Chris


Re: Figuring autovacuum settings

From
"Jim C. Nasby"
Date:
On Wed, May 17, 2006 at 11:53:44AM -0400, Chris Hoover wrote:
> Ok, I'm still working on the possibility of turning autovacuum on.
>
> Question, I have a tables that are multi GB.  When I run the threshold
> calculations using "normal" values, it seems that my large important tables
> would almost never be vacuumed.  Here is an example, in my largest database,
> my most important table has 1,855,970 tuples in it.  However, this is a
> table with 3 years of data.  It has data loaded into it every day and is key
> to the majority of our queries.  In the past 16 hours or so I have had the
> following activity (via pg_stat_user_indexes) - 2960 tuples inserted and
> 40389 tuples updated, and no tuples deleted.
>
> If I understand the threshold calculations, then I need to run with
> autovacuum_vacuum_scale_factor set to between .005 and .001.  Does this seem
> to be to small?

Well, you have to ask yourself if there's much reason to vacuum a table
when well less than 1% of it has been updated (meaning well less than 1%
will be dead tuples). If you're really paranoid about keeping dead space
to a minimum, and can tolerate running vacuum so frequently, then sure,
set the thresholds that low. Otherwise, I suspect you're being too
agressive here.

Typically, I cut the thresholds in half, so a vacuum would happen when
roughly 20% of your table is dead space. I'll also crontab vacuums on
small tables that need to stay small (ie: queue tables). Though in your
case, that would equate to up to 20GB of dead space, which is a bit
much. So you might want to run closer to 10%, or maybe just run smaller
numbers on your larger tables.

Of course you also have to take into consideration what your maintenance
windows are. It may make the most sense to disable autovacuum on your
big tables and just vacuum them every night or so.

In a nutshell, autovacuum is meant to cover 'normal' use cases; most
people arent running 100+ GB databases, so you're going to have to spend
more time administrating something like that.

> Here is the math:
>
> vacuum threshold = vacuum base threshold + vacuum scale factor * number of
> tuples
>
> vacuum_threshold = 200 + (.001*1,855,970)
> vacuum_threshold = 2,056
>
> vacuum_threshold = 200 + (.005*1,855,970)
> vacuum_threshold = 9,480
>
> It would seem with the numbers given that I should set it somewhere in this
> range.  Do you agree?
>
> If you don't agree, where would you recomment setting them.
>
> This is my largest database and is over 110GB.  It has tables with
> 305,702,000 reltuples down to about 500,000 reltuples (for important
> tables).
>
> I'm just not sure how to properly set the numbers.
>
> My servers have databases in the folowing ranges:
> Server 1 66 GB to 100 MB
> Server 2 117 GB to 1.70 GB.
> Server 3 95 GB to 4 GB
> Server 4 83 GB to 1.2 GB
>
> While the sizes differ, the ratios should be the same since they are
> identical databases.
>
> Any suggestions about recommendations for the autovacuum values?  Also,
> should I set the autovacuum just at the cluster level, or should I also set
> some of the larger tables with table specific values.
>
> Thanks,
>
> Chris

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461