Thread: Vacuuming Questions

Vacuuming Questions

From
John Gardner
Date:
We have two PostgreSQL servers (8.2) running in a cluster.

We have autovacuum switched on on both servers and also we are running
the following as a cron job;

Server 1:
30 0,2,4,6,8,10,12,14,16,18,20,22 * * * /usr/bin/vacuumdb --all --analyze

Server 2:
30 1,3,5,7,9,11,13,15,17,19,21,23 * * * /usr/bin/vacuumdb --all --analyze

So, in other words we stagger the jobs one each server in the cluster.

Now, we're not seeing any problems with performance and we're not seeing
any bloat but I have a couple of questions which I hope some of you can
help me out on.

1) Is using the autovacuum daemon and running vacuumdb from a cron job
overkill?

2)  The vendor of the clustering software suggests that I regularly run
vacuum --all --analyze on a specific table.  Surely, if I'm running
vacuumdb --all --analyze isn't this just a wrapper for running the
vacuum --all --analyze command anyway?  I guess that I'm asking why
would I want to run vacuum --all --analyze on a specific table, when I
regularly run vacuumdb --all --analyze on all databases?

Any help would be grateful.

Thanks

John

Re: Vacuuming Questions

From
"Joshua D. Drake"
Date:
On Wed, 23 Apr 2008 16:27:33 +0100
John Gardner <john.gardner@tagish.co.uk> wrote:

> We have two PostgreSQL servers (8.2) running in a cluster.

Could you be a bit more specific about what you mean by: in a cluster?

> Now, we're not seeing any problems with performance and we're not
> seeing any bloat but I have a couple of questions which I hope some
> of you can help me out on.
>
> 1) Is using the autovacuum daemon and running vacuumdb from a cron
> job overkill?

Not necessarily. Autovacuum can get behind when managing larger tables
that are actively used. This is better in 8.3 because you can have more
than one autovacuum worker.

>
> 2)  The vendor of the clustering software suggests that I regularly
> run vacuum --all --analyze on a specific table.  Surely, if I'm
> running vacuumdb --all --analyze isn't this just a wrapper for
> running the vacuum --all --analyze command anyway?

Well you have stated two different things here.

1. The vendor is stating you should vacuum a specific table
2. You are vacuuming the whole database.

Do you have a specific table that they are talking about? Is it
particularly busy? Say a session table?


> I guess that I'm
> asking why would I want to run vacuum --all --analyze on a specific
> table, when I regularly run vacuumdb --all --analyze on all databases?
>

Yes those two are identical.

Joshua D. Drake


--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate



Re: Vacuuming Questions

From
"Scott Marlowe"
Date:
On Wed, Apr 23, 2008 at 9:27 AM, John Gardner <john.gardner@tagish.co.uk> wrote:
> We have two PostgreSQL servers (8.2) running in a cluster.
>
>  We have autovacuum switched on on both servers and also we are running the
> following as a cron job;
>
>  Server 1:
>  30 0,2,4,6,8,10,12,14,16,18,20,22 * * * /usr/bin/vacuumdb --all --analyze

You can change that to

30 0-59/2 * * * ...
and
30 1-59/2 * * *

(I think the second one does the odd minutes.  might need to test it.

>  1) Is using the autovacuum daemon and running vacuumdb from a cron job
> overkill?

Maybe.  It's probably better let autovacuum handle most of the db, and
use a single vacuum analyze on the one really busy table.

It's possible that you're using up too much of your I/O bandwidth
cleaning a bunch of table that don't need it.   As Joshua mentioned,
it's probably a good idea to look at 8.3 due to its ability to run >1
autovacuum thread at a time.

Re: Vacuuming Questions

From
John Gardner
Date:
Joshua D. Drake wrote:
> On Wed, 23 Apr 2008 16:27:33 +0100
> John Gardner <john.gardner@tagish.co.uk> wrote:
>
>> We have two PostgreSQL servers (8.2) running in a cluster.
>
> Could you be a bit more specific about what you mean by: in a cluster?

Well, we're using middleware technology to load balance and cluster the
PostgreSQL backends

>
>> Now, we're not seeing any problems with performance and we're not
>> seeing any bloat but I have a couple of questions which I hope some
>> of you can help me out on.
>>
>> 1) Is using the autovacuum daemon and running vacuumdb from a cron
>> job overkill?
>
> Not necessarily. Autovacuum can get behind when managing larger tables
> that are actively used. This is better in 8.3 because you can have more
> than one autovacuum worker.
>
>> 2)  The vendor of the clustering software suggests that I regularly
>> run vacuum --all --analyze on a specific table.  Surely, if I'm
>> running vacuumdb --all --analyze isn't this just a wrapper for
>> running the vacuum --all --analyze command anyway?
>
> Well you have stated two different things here.
>
> 1. The vendor is stating you should vacuum a specific table
> 2. You are vacuuming the whole database.
>
> Do you have a specific table that they are talking about? Is it
> particularly busy? Say a session table?

OK, well initially, the vendor told us to stagger the vacuumdb command
every couple of hours, but latterly they are telling us to vacuum a
specific table.  The table contains the transactions that are used for
rollback if any particular node in the cluster needs to be restarted.

>
>> I guess that I'm
>> asking why would I want to run vacuum --all --analyze on a specific
>> table, when I regularly run vacuumdb --all --analyze on all databases?
>>

Thanks

John