Monitoring for failed autovacuum - Mailing list pgsql-admin

From John Rouillard
Subject Monitoring for failed autovacuum
Date
Msg-id 20120514200238.GM20157@renesys.com
List pgsql-admin
Hello all:

I am running PostgreSQL 8.4.7 as supplied by CentOS 5.8.  I get a number
of messages like:

  ERROR:  canceling autovacuum task
  CONTEXT:  automatic vacuum of table "template0.pg_catalog.pg_shdepend"

for a number of different tables in my databases. We use Slony for
replication, and it tends to use some tables very heavily, which
results in the autovacuum being cancelled. So I am not concerned about
these until autovacuum has failed on the same table a certain number
of times in a row.

I have set up monitoring to detect 6 successive failures in a week
without any intervening successful vacuums (as reported with
log_autovacuum_min_duration = 0).
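
The check described above can be sketched roughly as follows. This is
a minimal illustration, not the monitoring actually in use: the
function names are hypothetical, it assumes each cancellation is
logged as an ERROR line followed by its CONTEXT line as shown earlier,
and it assumes a completed autovacuum is logged as a LOG line
beginning 'automatic vacuum of table "db.schema.table":'. It also
ignores the one-week window and only counts the current run of
cancellations per table.

```python
import re
from collections import defaultdict

# Patterns use search(), so any log_line_prefix in front is tolerated.
CANCEL_RE = re.compile(r'ERROR:\s+canceling autovacuum task')
CONTEXT_RE = re.compile(r'CONTEXT:\s+automatic vacuum of table "([^"]+)"')
# Assumed format of the completion message logged when
# log_autovacuum_min_duration = 0.
SUCCESS_RE = re.compile(r'LOG:\s+automatic vacuum of table "([^"]+)":')


def consecutive_cancellations(lines):
    """Return {table: number of cancellations since its last completed vacuum}."""
    streak = defaultdict(int)
    pending_cancel = False  # saw the ERROR line, waiting for its CONTEXT line
    for line in lines:
        if CANCEL_RE.search(line):
            pending_cancel = True
            continue
        match = CONTEXT_RE.search(line)
        if match and pending_cancel:
            streak[match.group(1)] += 1
            pending_cancel = False
            continue
        match = SUCCESS_RE.search(line)
        if match:
            # A completed vacuum resets the run of failures for that table.
            streak[match.group(1)] = 0
    return dict(streak)


def tables_over_threshold(lines, threshold=6):
    """List tables whose current run of cancellations has reached the threshold."""
    counts = consecutive_cancellations(lines)
    return sorted(table for table, n in counts.items() if n >= threshold)
```

Feeding it the server log, for example
tables_over_threshold(open(path), threshold=6) with path pointing at
the log file, would return the tables that need attention.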

Looking through the logs, I was expecting to see many more attempts to
vacuum a table after its autovacuum had been cancelled. Instead, the
retry seems to happen every few days rather than a few times a day.
I have one ongoing correlation that reports cancelled autovacuums at:

  Thu May 10 20:59:08 2012
  Sun May 13 08:30:59 2012

I assume my next one will arrive Wednesday the 16th or so.
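
The arithmetic behind that estimate can be sketched as below. It
simply adds the average gap between the observed cancellations to the
most recent one; the function name is hypothetical, English day and
month names are assumed for parsing, and two data points are a thin
basis for any projection.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the report above; English day/month names assumed.
FMT = "%a %b %d %H:%M:%S %Y"


def project_next(timestamps):
    """Project the next event as the last timestamp plus the mean gap so far."""
    times = [datetime.strptime(t, FMT) for t in timestamps]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    mean_gap = sum(gaps, timedelta(0)) / len(gaps)
    return times[-1] + mean_gap


observed = ["Thu May 10 20:59:08 2012", "Sun May 13 08:30:59 2012"]
print(project_next(observed))
```

With only the two timestamps listed, the gap is about two and a half
days, which puts the projection late on the 15th, within a day of the
estimate above.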

So what I am wondering is: why is the autovacuum not being rescheduled
more often? I would have expected an autovacuum that did not complete
to make the table more likely to be scheduled again, before things
start bloating.

If this retry interval is normal behavior, what should I set my
thresholds to:

  3 cancelled autovacuums/week?
  6 cancelled/2 weeks?

or something else?

Thanks.

--
                -- rouilj

John Rouillard       System Administrator
Renesys Corporation  603-244-9084 (cell)  603-643-9300 x 111
