Re: Count and log pages set all-frozen by vacuum - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Count and log pages set all-frozen by vacuum
Date
Msg-id CAAKRu_Ze0eL6-O8PrUeW6jM7HCw8YCQbyrY7e+=Kw-ng_7iZnA@mail.gmail.com
Whole thread Raw
In response to Re: Count and log pages set all-frozen by vacuum  (Alastair Turner <minion@decodable.me>)
Responses Re: Count and log pages set all-frozen by vacuum
List pgsql-hackers
Thanks for taking a look, Alastair!

On Thu, Oct 31, 2024 at 5:47 AM Alastair Turner <minion@decodable.me> wrote:
>
> The values returned in a case pages are removed (cases where the empty pages are at the end of the table) are a bit
confusing.
>
> In an example similar to yours, but with a normal vacuum operation, since that seems to be the most useful case for
thisfeature: 
>
> CREATE TABLE the_table (first int, second int) WITH (autovacuum_enabled = false);
> INSERT INTO the_table SELECT generate_series(1,1000), 1;
> DELETE FROM the_table WHERE first > 50;
> VACUUM (VERBOSE) the_table;
>
> pages: 4 removed, 1 remain, 5 scanned (100.00% of total)
> tuples: 950 removed, 50 remain, 0 are dead but not yet removable
> removable cutoff: 763, which was 1 XIDs old when operation ended
> new relfrozenxid: 761, which is 1 XIDs ahead of previous value
> frozen: 0 pages from table (0.00% of total) had 0 tuples frozen. 4 pages set all-frozen in the VM
>
> It looks like 4 pages out of 1 are set all-frozen. So there are -3 to scan for the next aggressive vacuum? The four
pageswhich were set to all frozen are the same four which have been removed from the end of the table. 
>
> For an example where the empty pages which are marked all frozen are at the start of the table (deleting values < 950
inthe example), the results are more intuitive 
>
> pages: 0 removed, 5 remain, 5 scanned (100.00% of total)
> tuples: 949 removed, 51 remain, 0 are dead but not yet removable
> removable cutoff: 768, which was 0 XIDs old when operation ended
> new relfrozenxid: 766, which is 1 XIDs ahead of previous value
> frozen: 0 pages from table (0.00% of total) had 0 tuples frozen. 4 pages set all-frozen in the VM
>
> Can the pages which are removed from the end of the table not be counted towards those set all frozen to make the
arithmetica bit more intuitive for this edge case? 

This is a good point. It could be confusing to see that 1 page remains
but 4 were set all-frozen in the VM.
From the perspective of the code, this is because each page is set
all-frozen/all-visible in the VM after it is pruned or vacuumed.
Truncating of the end of the table happens at the end of vacuum --
after all pages have been processed. So, these pages are set
all-frozen in the VM.

I actually don't see a good way that we could accurately decrement the
count. We have LVRelState->removed_pages but we have no idea which of
those pages are all-frozen. We could have
visibilitymap_prepare_truncate() count and return to
RelationTruncate() how many of the truncated pages were all-frozen.
But we have no way of knowing how many of those pages were newly
frozen by this vacuum.

And if we try to track it from the other direction, when freezing
pages, we would have to keep track of all the block numbers of pages
in the relation that were empty and set frozen and then when
truncating the relation find the overlap. That sounds hard and
expensive.

It seems a better solution would be to find a way to document it or
phrase it clearly in the log. It is true that these pages were set
all-frozen in the VM. It is just that some of them were later removed.
And we don't know which ones those are. Is there a way to make this
clear?
And, if not, is it worse than not having the feature at all?

- Melanie



pgsql-hackers by date:

Previous
From: Aleksander Alekseev
Date:
Subject: Re: Having problems generating a code coverage report
Next
From: Tom Lane
Date:
Subject: Re: Test to dump and restore objects left behind by regression