Re: Vacuum statistics - Mailing list pgsql-hackers

From Alena Rybakina
Subject Re: Vacuum statistics
Date
Msg-id 74fcdd4c-5fcc-411b-8667-87a798d762e1@yandex.ru
Whole thread Raw
In response to Re: Vacuum statistics  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
Hi all,

I’ve prepared an extension that adds vacuum statistics [0] (master 
branch), and it’s working stably. The attached patch is a core patch 
that enables this extension to work.

Right now, I’m experimenting with a core patch. Specifically, in 
load_file I can detect whether vacuum_statistics is listed in 
shared_preload_libraries and, if so, start collecting vacuum statistics 
in the core.
However, I think it would be more reliable to simply add a dedicated 
hook for vacuum statistics collection in the core. In my view, an 
extension may be loaded but disabled for vacuum statistics collection - 
and in that case we shouldn’t gather them.

In general, I’m not entirely happy with the current organization. One 
issue that bothers me is having to scan the entire hash table to provide 
vacuum statistics for a database, with aggregation.

At the moment, the hash table uses (dboid, reloid, type) as the key.
This could be improved by introducing another hash table keyed by dboid, 
with entries containing arrays of the first table’s keys (dboid, reloid, 
type) (where dboid is either kept or omitted).
The idea is that we find the relevant array for a given database and 
then aggregate its statistics by iterating over the first table using 
those keys. I’ve started implementing this approach in the main branch 
of the same repository, but
I’m still working out the issues with dynamic memory management.

I also have an idea for effectively splitting statistics into “core” and 
“extra.”

Core statistics:
For databases (also collected for tables and indexes): delay_time, 
total_time
For tables: pages_scanned, pages_removed, tuples_deleted, 
vm_new_frozen_pages, vm_new_visible_pages
For indexes: tuples_deleted, pages_deleted

Extra statistics:
For databases (also collected for tables and indexes): total_blks_read, 
total_blks_dirtied, total_blks_written, blks_fetched, blks_hit, 
blk_read_time, blk_write_time
For tables: recently_dead_tuples, missed_dead_tuples, 
vm_new_visible_frozen_pages, missed_dead_pages, tuples_frozen

WAL statistics (separately for databases and relations):wal_records, 
wal_fpi, wal_bytes

I’ve already started drafting the first implementation, but I still need 
to carefully handle memory allocation.

Additionally, I’m considering letting users define which databases, 
schemas, or tables/relations should have vacuum statistics collected. I 
believe this could be valuable for large, high-load systems.
For example, the core statistics might show that a particular database 
is frequently vacuumed - so we could then focus on tracking only that 
one. Similarly, if certain tables are heavily updated by backends and
vacuumed often, we could target those specifically. Conceptually, this 
would act like a filter, but at this point, it’s just an idea for a 
future improvement.

This is the direction I’m planning to take with the patch. If you have 
alternative ideas about how to organize the code, I’d be glad to hear them!

On 25.09.2025 03:03, Bharath Rupireddy wrote:
> Hi,
>
> On Mon, May 12, 2025 at 5:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Fri, May 9, 2025 at 5:34 PM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
>>> I did a rebase and finished the part with storing statistics separately from the relation statistics - now it is
possibleto disable the collection of statistics for relationsh using gucs and
 
>>> this allows us to solve the problem with the memory consumed.
>>>
>> I think this patch is trying to collect data similar to what we do for
>> pg_stat_statements for SQL statements. So, can't we follow a similar
>> idea such that these additional statistics will be collected once some
>> external module like pg_stat_statements is enabled? That module should
>> be responsible for accumulating and resetting the data, so we won't
>> have this memory consumption issue.
>>
>> BTW, how will these new statistics be used to autotune a vacuum? And
>> do we need all the statistics proposed by this patch?
> Thanks for working on this. I agree with the general idea of having
> minimal changes to the core. I think a simple approach would be to
> have a hook in heap_vacuum_rel at the end, where vacuum stats are
> prepared in a buffer for emitting LOG messages. External modules can
> then handle storing, rotating, interpreting, aggregating (per
> relation/per database), and exposing the stats to end-users via SQL.
> The core can define a common data structure, fill it, and send it to
> external modules. I haven't had a chance to read the whole thread or
> review the patches; I'm sure this has been discussed.
>
As for how this may help databases in practice, I think it deserves a 
separate thread once the vacuum statistics patch is pushed.

In short, such statistics are essential to understand the real impact of 
vacuum on system load.

For example:
If vacuum runs very frequently on a table or index, this might point to 
table or index bloat, or to overly aggressive configuration.
Conversely, if vacuum freezes or removes very few tuples, it may suggest 
that vacuum is not aggressive enough, or that delays are set too high.
If missed_dead_pages and missed_dead_tuples are high compared to 
tuples_deleted, that may indicate vacuum can’t obtain a INDEX CLEANUP 
LOCK or doesn’t retry due to long delays.
Statistics related to wraparound activity can also hint that autovacuum 
settings require adjustment.

It’s also possible that this system could be made more automatic in the 
future, but I haven’t fully worked out how yet. I think that discussion 
belongs in a separate thread once vacuum statistics themselves are 
committed.

[0] https://github.com/Alena0704/vacuum_statistics/tree/master

-------------

Best regards,
Alena Rybakina
Postgres Professional
Attachment

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: Suggestion to add --continue-client-on-abort option to pgbench
Next
From: Michael Banck
Date:
Subject: Re: GNU/Hurd portability patches