Re: New criteria for autovacuum - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: New criteria for autovacuum
Date
Msg-id CAAKRu_Z100bhmwWP+WM+rutfL0KMzWuOTPCUf=Lqx8QEKgKiZw@mail.gmail.com
Whole thread Raw
In response to Re: New criteria for autovacuum  (Konstantin Knizhnik <knizhnik@garret.ru>)
Responses Re: New criteria for autovacuum
List pgsql-hackers
On Sat, Apr 5, 2025 at 2:02 AM Konstantin Knizhnik <knizhnik@garret.ru> wrote:
>
> A more targeted solution to your specific problem would be to update
> the visibility map on access. Then, the first time you have to fetch
> that heap page, you could mark it all-visible (assuming the long
> running transaction has ended) and then the next index-only scan
> wouldn't have to do the same heap fetch. It doesn't add any overhead
> in the case that the long running transaction has not ended, unlike
> trying to trigger another autovacuum.
>
> I really considered this alternative when thinking about the solution of the problem. It is more consistent with hint
bitapproach. 
> I declined it in favor of this solution because of the following reasons:
>
> 1. Index-only scan holds read-only lock on heap page. In order to update it, we need to upgrade this lock to
exclusive.
> 2. We need to check visibility for all elements on the page (actually do something like `heap_page_is_all_visible`)
butif there is large number elements at the page it can be quite expensive. And I afraid that it can slowdown speed of
index-onlyscan. Yes, only in "slow case" - when it has to access heap to perform visibility check. But still it may be
notacceptable. Also it is not clear how to mark page as already checked. Otherwise we will have to repeat this check
forall tids referring this page. 
> 3. `heap_page_is_all_visible` is local to lazyvaccum.c. So  to use it in index-only scan we either have to make it
global,either cut&paste it's code. Just removing "static" is not possible, because it is using local `LVRelState`, so
somerefactoring is needed in any case. 
> 4. We need to wal-log VM page and heap pages in case of setting all-visible bit. It is quite expensive operation.
Doingit inside index-only scan can significantly increase time of select. Certainly Postgres is not a real-time DBMS.
Butstill it is better to provide some predictable query execution time. This is why I think that it is better to do
suchworkt in background (in vaccum). 

I wasn't thinking about adding a new VM setting functionality to index
only scan in particular. heapam_index_fetch_tuple() already calls
heap_page_prune_opt() which will do pruning under certain conditions.
I was thinking that we start updating the VM after pruning in the
on-access case too (not just when pruning is invoked by vacuum).

If you look at the callers of heap_page_prune_opt(), it includes
bitmap heap scan and also heap_prepare_pagescan() which is invoked as
part of sequential scans and other operations.

- Melanie



pgsql-hackers by date:

Previous
From: Melanie Plageman
Date:
Subject: Re: Parallel heap vacuum
Next
From: Chapman Flack
Date:
Subject: Re: FmgrInfo allocation patterns (and PL handling as staged programming)