Re: Vacuum statistics - Mailing list pgsql-hackers

From Alena Rybakina
Subject Re: Vacuum statistics
Date
Msg-id 0e84acdc-ba65-44fa-be1f-4d6a86bca2ac@yandex.ru
Whole thread Raw
In response to Re: Vacuum statistics  (Andrei Lepikhov <lepihov@gmail.com>)
Responses Re: Vacuum statistics
List pgsql-hackers
On 13.03.2026 14:31, Andrei Lepikhov wrote:

> On 12/3/26 19:10, Alena Rybakina wrote:
>> On 12.03.2026 18:28, Andrei Lepikhov wrote:
>>
>>>
>>> In addition, it makes sense to discuss how these parameters are 
>>> supposed to be used. I see the following use cases:
>>>
>>> 1. Which tables have the most VM churn? - monitoring 
>>> rev_all_visible_pages normalised on the table size and its average 
>>> tuple width might expose the most suspicious tables (in terms of 
>>> table statistics).
>>> 2. DML Skew. Dividing rev_all_visible_pages by the number of tuple 
>>> updates/deletes, normalised by the average table and tuple sizes, 
>>> might indicate whether changes are localised within the table.
>>> 3. IndexOnlyScan effectiveness. Considering the speed of 
>>> rev_all_visible_pages change, normalised to the value of the 
>>> relallvisible statistic, we may detect tables where Index-Only Scan 
>>> might be inefficiently used.
>>>
>>>
>> I agree with all these points and I think we can add it in the 
>> documentation.
> I've updated the status to 'Ready for Committer'. Here’s what that means:
> 1. Patch v30-0001-* is ready to be committed. It’s straightforward 
> enough for this late stage of development. It’s now separate from the 
> original 'vacuum statistics' idea, which helps keep things simpler.
> 2. Patches 0002 and 0003 are not part of this commit set. I’m not sure 
> they should be included in PG19, and we can keep working on them.
>
Completely agree, thank you)

As for patches 0002 and 0003, they still require additional testing.
I'd also like to finish the work on the GUC that allows partial 
statistics collection, as mentioned earlier.

This will require allocating or freeing memory depending on changes to 
the GUC value.

>>
>> In addition, it makes sense to discuss how these parameters are 
>> supposed to be used. I see the following use cases:
>>
>> 1. Which tables have the most VM churn? - monitoring 
>> rev_all_visible_pages normalised on the table size and its average 
>> tuple width might expose the most suspicious tables (in terms of 
>> table statistics).
>> 2. DML Skew. Dividing rev_all_visible_pages by the number of tuple 
>> updates/deletes, normalised by the average table and tuple sizes, 
>> might indicate whether changes are localised within the table.
>> 3. IndexOnlyScan effectiveness. Considering the speed of 
>> rev_all_visible_pages change, normalised to the value of the 
>> relallvisible statistic, we may detect tables where Index-Only Scan 
>> might be inefficiently used.
>
> With the parameter that was included before (pg_class_relallfrozen and 
> relallvisible 
> https://github.com/MasaoFujii/postgresql/commit/99f8f3fbbc8f743290844e8c676d39dad11c5d5d) 
> in the pg_stat_tables, I think I can provide isolation test to prove 
> it - I can use my isolation test 
> vacuum-extending-in-repetable-read.spec that I have added in the 
> extension (ext_vacuum_statistics). What do you think? 

I've prepared the test. Do you think it would make sense to include it 
in 0001?


Attachment

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: client_connection_check_interval default value
Next
From: Daniel Gustafsson
Date:
Subject: Re: Change copyObject() to use typeof_unqual