Re: Track skipped tables during autovacuum and autoanalyze - Mailing list pgsql-hackers

From Yugo Nagata
Subject Re: Track skipped tables during autovacuum and autoanalyze
Date
Msg-id 20260325012847.e026ba1860c07288efe3e97d@sraoss.co.jp
Whole thread Raw
In response to Re: Track skipped tables during autovacuum and autoanalyze  (Sami Imseih <samimseih@gmail.com>)
Responses Re: Track skipped tables during autovacuum and autoanalyze
List pgsql-hackers
On Tue, 24 Mar 2026 09:58:48 -0500
Sami Imseih <samimseih@gmail.com> wrote:

Thank you for your comments!

> > The attached patch add the following fields to pg_stat_all_tables:
> >  - last_skipped_autovacuum
> >  - last_skipped_autoanalyze
> >  - skipped_autovacuum_count
> >  - skipped_autoanalyze_count
> >
> > Are there any concerns about exposing this in pg_stat_all_tables, or suggestions
> > for a better approach?
> 
> I am not sure about the timestamp columns. I am not saying they will
> not be useful,
> but I think it will be better to just start with counters for this.
> The way the views get
> used, a dashboard built for tracking the deltas of the counters can easily spot
> when there is a spike of skipped autovacuum/autoanalyze count.
> Also, for tables that are being autovacuumed and skipped quickly,
> the timestamps will just be overwritten.
> 
> So, I am +1 on the counters, -1 on the timestamps.

Although the timestamps are overwritten on each skipped autovacuum or
autoanalyze, they still indicate when the last attempt was made. This
can help users confirm that autovacuum is actively attempting to run,
and that the issue is due to repeated skips rather than inactivity.

While counters can indicate overall activity, they do not reveal when
the last skip occurred. With timestamps, users can immediately see the
most recent attempt, even without a separate dashboard or historical
tracking.

Therefore, counters are useful for monitoring overall activity, but
timestamps give additional, complementary information, so it seems
worthwhile to include them too.
 
> Out of scope for this patch, but I also wonder if we should add another counter,
> autovacuum_started_count. If there are other types of failure scenarios such as
> corrupt indexes, checksum failures, etc. which terminate the
> autovacuum in flight,
> we would be able to catch this by looking at the number of autovacuums
> started vs completed. The skipped counters in this patch and a started
> counter would capture different stages of the autovacuum lifecycle;
> skipped means
> "never started" (lock contention), while a started-minus-completed delta means
> "started but failed." Both are useful signals, but for different reasons.

That makes sense. I was considering adding a counter to track
"canceled" autovacuum or autoanalyze, but tracking the number of started
or attempted autovacuums might provide a more general information than counting
only the failed ones.

> In terms of the patch:
> 
> 1/
> 
> +                       if (AmAutoVacuumWorkerProcess())
> +                               pgstat_report_skipped_vacuum(relid);
> 
> Any reason why this should not also include manual vacuum/analyze?
> If someone has a vacuum/analyze script that uses SKIP_LOCKED, and
> the operation gets skipped, this should be included in the counter.
> this can be done with separate counter fields for autovacuum/autoanalyze and
> vacuum/analyze

For manual vacuum/analyze, an explicit WARNING is output when the
operation is skipped, so I initially thought that reporting it in the
stats view was not necessary. However, I now agree that it should be
included.

> 2/
> 
> +            pg_stat_get_skipped_autovacuum_count(C.oid) AS
> skipped_autovacuum_count,
> 
> How about a name like "autovacuum_lock_skip_count"?

I'm not sure this is a good fit, since it may be better to keep the
naming consistent with other fields whose names end with
"...vacuum_count" or "...analyze_count". Happy to hear other opinions.

Regards,
Yugo Nagata

-- 
Yugo Nagata <nagata@sraoss.co.jp>



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: index prefetching
Next
From: Hannu Krosing
Date:
Subject: Re: SQL Property Graph Queries (SQL/PGQ)