Re: Vacuum statistics - Mailing list pgsql-hackers
From: Alena Rybakina
Subject: Re: Vacuum statistics
Msg-id: 74fcdd4c-5fcc-411b-8667-87a798d762e1@yandex.ru
In response to: Re: Vacuum statistics (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List: pgsql-hackers
Hi all,

I've prepared an extension that adds vacuum statistics [0] (master branch), and it's working stably. The attached patch is a core patch that enables this extension to work.

Right now, I'm experimenting with the core patch. Specifically, in load_file I can detect whether vacuum_statistics is listed in shared_preload_libraries and, if so, start collecting vacuum statistics in the core. However, I think it would be more reliable to simply add a dedicated hook for vacuum statistics collection in the core (I sketch what I mean at the end of this mail). In my view, an extension may be loaded but disabled for vacuum statistics collection, and in that case we shouldn't gather them.

In general, I'm not entirely happy with the current organization. One issue that bothers me is having to scan the entire hash table to provide vacuum statistics for a database, with aggregation. At the moment, the hash table uses (dboid, reloid, type) as the key. This could be improved by introducing another hash table keyed by dboid, with entries containing arrays of the first table's keys (dboid, reloid, type) (where dboid is either kept or omitted). The idea is that we find the relevant array for a given database and then aggregate its statistics by iterating over the first table using those keys. I've started implementing this approach in the main branch of the same repository, but I'm still working out the issues with dynamic memory management.
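To make the two-level layout concrete, here is a rough sketch. The type names (VacStatsKey, VacStatsEntry, DbIndexEntry) are placeholders, the counters are abbreviated, and the tables are backend-local HTABs with a palloc'd key array; the shared-memory version would need dshash or a DSA-backed array instead, which is exactly where the dynamic memory management gets tricky:

    #include "postgres.h"
    #include "utils/hsearch.h"

    /* first-level table: one entry per (dboid, reloid, type) */
    typedef struct VacStatsKey
    {
        Oid     dboid;
        Oid     reloid;
        int     type;           /* table or index */
    } VacStatsKey;

    typedef struct VacStatsEntry
    {
        VacStatsKey key;
        int64       tuples_deleted;     /* ... further counters ... */
    } VacStatsEntry;

    /* second-level table: one entry per database, listing first-level keys */
    typedef struct DbIndexEntry
    {
        Oid          dboid;
        int          nkeys;
        int          maxkeys;
        VacStatsKey *keys;      /* palloc'd; the hard part in shared memory */
    } DbIndexEntry;

    static HTAB *rel_stats;
    static HTAB *db_index;

    static void
    vacstats_init_tables(void)
    {
        HASHCTL ctl;

        MemSet(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(VacStatsKey);
        ctl.entrysize = sizeof(VacStatsEntry);
        rel_stats = hash_create("vacuum rel stats", 1024, &ctl,
                                HASH_ELEM | HASH_BLOBS);

        ctl.keysize = sizeof(Oid);
        ctl.entrysize = sizeof(DbIndexEntry);
        db_index = hash_create("vacuum db index", 64, &ctl,
                               HASH_ELEM | HASH_BLOBS);
    }

    /* aggregate one counter for a database without scanning rel_stats fully */
    static int64
    vacstats_db_tuples_deleted(Oid dboid)
    {
        DbIndexEntry *db;
        bool        found;
        int64       total = 0;

        db = (DbIndexEntry *) hash_search(db_index, &dboid, HASH_FIND, &found);
        if (found)
        {
            for (int i = 0; i < db->nkeys; i++)
            {
                VacStatsEntry *e = (VacStatsEntry *)
                    hash_search(rel_stats, &db->keys[i], HASH_FIND, NULL);

                if (e)
                    total += e->tuples_deleted;
            }
        }
        return total;
    }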
I also have an idea for effectively splitting statistics into "core" and "extra":

Core statistics:
- For databases (also collected for tables and indexes): delay_time, total_time
- For tables: pages_scanned, pages_removed, tuples_deleted, vm_new_frozen_pages, vm_new_visible_pages
- For indexes: tuples_deleted, pages_deleted

Extra statistics:
- For databases (also collected for tables and indexes): total_blks_read, total_blks_dirtied, total_blks_written, blks_fetched, blks_hit, blk_read_time, blk_write_time
- For tables: recently_dead_tuples, missed_dead_tuples, vm_new_visible_frozen_pages, missed_dead_pages, tuples_frozen
- WAL statistics (separately for databases and relations): wal_records, wal_fpi, wal_bytes

I've already started drafting the first implementation, but I still need to handle memory allocation carefully.

Additionally, I'm considering letting users define which databases, schemas, or tables/relations should have vacuum statistics collected. I believe this could be valuable for large, high-load systems. For example, the core statistics might show that a particular database is frequently vacuumed, so we could then focus on tracking only that one. Similarly, if certain tables are heavily updated by backends and vacuumed often, we could target those specifically. Conceptually, this would act like a filter, but at this point it's just an idea for a future improvement.

This is the direction I'm planning to take with the patch. If you have alternative ideas about how to organize the code, I'd be glad to hear them!

On 25.09.2025 03:03, Bharath Rupireddy wrote:
> Hi,
>
> On Mon, May 12, 2025 at 5:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Fri, May 9, 2025 at 5:34 PM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
>>> I did a rebase and finished the part with storing statistics separately
>>> from the relation statistics - now it is possible to disable the
>>> collection of statistics for relations using GUCs, and this allows us
>>> to solve the problem with the memory consumed.
>>>
>> I think this patch is trying to collect data similar to what we do for
>> pg_stat_statements for SQL statements. So, can't we follow a similar
>> idea such that these additional statistics will be collected once some
>> external module like pg_stat_statements is enabled? That module should
>> be responsible for accumulating and resetting the data, so we won't
>> have this memory consumption issue.
>>
>> BTW, how will these new statistics be used to autotune a vacuum? And
>> do we need all the statistics proposed by this patch?
>
> Thanks for working on this. I agree with the general idea of having
> minimal changes to the core. I think a simple approach would be to
> have a hook in heap_vacuum_rel at the end, where vacuum stats are
> prepared in a buffer for emitting LOG messages. External modules can
> then handle storing, rotating, interpreting, aggregating (per
> relation/per database), and exposing the stats to end-users via SQL.
> The core can define a common data structure, fill it, and send it to
> external modules. I haven't had a chance to read the whole thread or
> review the patches; I'm sure this has been discussed.

As for how this may help databases in practice, I think it deserves a separate thread once the vacuum statistics patch is pushed. In short, such statistics are essential for understanding the real impact of vacuum on system load. For example:

- If vacuum runs very frequently on a table or index, this might point to table or index bloat, or to an overly aggressive configuration.
- Conversely, if vacuum freezes or removes very few tuples, it may suggest that vacuum is not aggressive enough, or that delays are set too high.
- If missed_dead_pages and missed_dead_tuples are high compared to tuples_deleted, that may indicate vacuum can't obtain an INDEX CLEANUP LOCK or doesn't retry due to long delays.
- Statistics related to wraparound activity can also hint that autovacuum settings require adjustment.

It's also possible that this system could be made more automatic in the future, but I haven't fully worked out how yet. I think that discussion belongs in a separate thread once vacuum statistics themselves are committed.
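Finally, the promised sketch of the dedicated hook, which also fits Bharath's suggestion of a hook at the end of heap_vacuum_rel. The hook name vacuum_stats_hook and the VacuumStatsData struct are placeholders, not a settled API; the first part would live in a core header, the second in the extension. It follows the usual hook-chaining pattern, and the GUC lets the extension stay loaded while collection is disabled:

    /* --- in core, e.g. a header near heap_vacuum_rel (placeholder names) --- */
    typedef struct VacuumStatsData
    {
        Oid     dboid;
        Oid     reloid;
        bool    is_index;
        int64   tuples_deleted;
        /* ... the rest of the counters discussed above ... */
    } VacuumStatsData;

    typedef void (*vacuum_stats_hook_type) (const VacuumStatsData *stats);
    extern PGDLLIMPORT vacuum_stats_hook_type vacuum_stats_hook;

    /* --- in the extension --- */
    #include "postgres.h"
    #include "fmgr.h"

    PG_MODULE_MAGIC;

    static vacuum_stats_hook_type prev_vacuum_stats_hook = NULL;
    static bool vacstats_enabled = true;    /* a real GUC in practice */

    static void
    vacstats_hook(const VacuumStatsData *stats)
    {
        /* loaded but disabled: pass through without collecting anything */
        if (vacstats_enabled)
        {
            /* accumulate into the hash tables sketched earlier */
        }

        if (prev_vacuum_stats_hook)
            prev_vacuum_stats_hook(stats);
    }

    void
    _PG_init(void)
    {
        prev_vacuum_stats_hook = vacuum_stats_hook;
        vacuum_stats_hook = vacstats_hook;
    }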
[0] https://github.com/Alena0704/vacuum_statistics/tree/master

-------------
Best regards,
Alena Rybakina
Postgres Professional