On Tue, Oct 28, 2025 at 12:16:28PM +1300, David Rowley wrote:
> I think it's reasonable to want to document how autovacuum prioritises
> tables, but maybe not in too much detail. Longer term, I think it
> would be good to have a pg_catalog view for this which showed the
> relid or schema/relname, and the output values of
> relation_needs_vacanalyze(). If we had that and we documented that
> autovacuum workers work from that list, but they just may have an
> older snapshot of it, then that might help make the score easier to
> document. It would also allow people to question the scores as I
> expect at least some people might not agree with the priorities. That
> would allow us to consider tuning the score calculation if someone
> points out a deficiency with the current calculation.
>
> Also, longer-term, it doesn't seem that unreasonable that the
> autovacuum worker might want to refresh the tables_to_process once it
> finishes a table and if autovacuum_naptime * $value units of time have
> passed since it was last checked. That would allow the worker to deal
> with and react accordingly when scores have changed significantly
> since it last checked. I mean, it might be days between when
> autovacuum calculates the scores and finally vacuums the table when
> the list is long, or if it was tied up with large tables. Other
> workers may have gotten to some of the tables too, so the score may
> have dropped, but again made its way above the threshold, but to a
> lesser extent.
Agreed on both points.
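
For the second point, the staleness check could be as small as the
sketch below. This is only an illustration, not existing autovacuum
code: the function name, the parameter names, and the use of whole
seconds are assumptions made up for the example, and "multiplier"
stands in for the unspecified $value above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the PostgreSQL source tree.
 *
 * Returns true when the worker's list of tables to process was built at
 * least naptime_secs * multiplier seconds ago, i.e. when the worker should
 * rebuild the list (and recompute scores) before picking its next table.
 * All times are plain seconds to keep the sketch self-contained.
 */
static bool
worklist_is_stale(int64_t last_built, int64_t now,
                  int naptime_secs, int multiplier)
{
    /* Widen before multiplying so large settings cannot overflow int. */
    int64_t max_age = (int64_t) naptime_secs * multiplier;

    return (now - last_built) >= max_age;
}
```

The worker would call something like this only after finishing a table,
so a refresh never interrupts a vacuum that is already in progress.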
--
nathan