Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Date
Msg-id 3b8edd50-6062-b3b8-4001-4f41c35e665f@amazon.com
Whole thread Raw
In response to Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations  (Andres Freund <andres@anarazel.de>)
Responses Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
List pgsql-hackers
On 4/3/22 12:05 PM, Andres Freund wrote:
> While I was writing the above I, again, realized that it'd be awfully nice to
> have some accumulated stats about (auto-)vacuum's effectiveness. For us to get
> feedback about improvements more easily and for users to know what aspects
> they need to tune.
>
> Knowing how many times a table was vacuumed doesn't really tell that much, and
> requiring to enable log_autovacuum_min_duration and then aggregating those
> results is pretty painful (and version dependent).
>
> If we just collected something like:
> - number of heap passes
> - time spent heap vacuuming
> - number of index scans
> - time spent index vacuuming
> - time spent delaying
The number of passes would let you know if maintenance_work_mem is too 
small (or to stop killing 187M+ tuples in one go). The timing info would 
give you an idea of the impact of throttling.
> - percentage of non-yet-removable vs removable tuples

This'd give you an idea how bad your long-running-transaction problem is.

Another metric I think would be useful is the average utilization of 
your autovac workers. No spare workers means you almost certainly have 
tables that need vacuuming but have to wait. As a single number, it'd 
also be much easier for users to understand. I'm no stats expert, but 
one way to handle that cheaply would be to maintain an 
engineering-weighted-mean of the percentage of autovac workers that are 
in use at the end of each autovac launcher cycle (though that would 
probably not work great for people that have extreme values for launcher 
delay, or constantly muck with launcher_delay).

>
> it'd start to be a heck of a lot easier to judge how well autovacuum is
> coping.
>
> If we tracked the related pieces above in the index stats (or perhaps
> additionally there), it'd also make it easier to judge the cost of different
> indexes.
>
> - Andres
>
>



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Intermittent buildfarm failures on wrasse
Next
From: Peter Geoghegan
Date:
Subject: Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations