Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Date
Msg-id CAAKRu_bAR5uCfjuc06vc_xrZjNCJLs493NgHjTOUDso9qGdE0w@mail.gmail.com
In response to Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)  (Melanie Plageman <melanieplageman@gmail.com>)
List pgsql-hackers
On Thu, Jun 26, 2025 at 6:04 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Rebased in light of recent changes on master:

This needed another rebase, and, in light of the discussion in [1],
I've also removed the patch to add heap wrappers for setting pages
all-visible.

More notably, the final patch (0012) in attached v3 allows on-access
pruning to set the VM.

To do this, it plumbs some information down from the executor to the
table scan about whether or not the table is modified by the query. We
don't want to set the VM only to clear it while scanning pages for an
UPDATE or while locking rows in a SELECT FOR UPDATE.

Because we only do on-access pruning when pd_prune_xid is valid, we
shouldn't need much of a heuristic for deciding when to set the VM
on-access -- but I've included one anyway: we only do it if we are
actually pruning or if the page is already dirty and no FPI would be
emitted.
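
As a rough illustration of the conditions described above (not the patch's
actual code), the decision could be sketched as a small predicate. Every
name here (OnAccessVMInputs, should_set_vm_on_access, and the individual
fields) is hypothetical and chosen only for this sketch; the real patch
derives these facts from the scan state and the page itself.

```c
#include <stdbool.h>

/*
 * Hypothetical summary of what the scan knows about a page.
 * None of these names come from the patch set.
 */
typedef struct OnAccessVMInputs
{
    bool query_modifies_table; /* UPDATE/DELETE or SELECT FOR UPDATE on this table */
    bool prune_xid_valid;      /* pd_prune_xid is set, so on-access pruning is considered */
    bool will_prune;           /* this access actually prunes the page */
    bool page_dirty;           /* the buffer is already dirty */
    bool would_emit_fpi;       /* WAL-logging now would emit a full-page image */
} OnAccessVMInputs;

/* Return true if, under the assumptions above, the VM bit may be set on access. */
static bool
should_set_vm_on_access(const OnAccessVMInputs *in)
{
    /* Don't set the VM only to clear it again when the query modifies the table. */
    if (in->query_modifies_table)
        return false;

    /* On-access pruning is only attempted when pd_prune_xid is valid. */
    if (!in->prune_xid_valid)
        return false;

    /* We are pruning anyway, so setting the VM rides along with that work. */
    if (in->will_prune)
        return true;

    /* Otherwise only when it is cheap: page already dirty and no FPI needed. */
    return in->page_dirty && !in->would_emit_fpi;
}
```

The early returns mirror the order of the prose: first the executor-level
"is this table modified" check, then the pd_prune_xid gate, then the
heuristic itself.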

You can see it in action with the following:

create extension pg_visibility;
create table foo (a int, b int) with (autovacuum_enabled=false, fillfactor=90);
insert into foo select generate_series(1,300), generate_series(1,300);
create index on foo (a);
update foo set b = 51 where b = 50;
select * from foo where a = 50;
select * from pg_visibility_map_summary('foo');

The SELECT will set a page all-visible in the VM.

In this patch set, on-access pruning is enabled for sequential scans
and the underlying heap relation in index scans and bitmap heap scans.
This example can exercise any of the three if you toggle
enable_indexscan and enable_bitmapscan appropriately.

From a performance perspective, if you run a trivial pgbench, you can
see far more all-visible pages set in the pgbench_[x] relations with
no noticeable overhead. But I'm planning to do some performance
experiments to show how this affects our ability to choose index-only
scan plans in realistic workloads.

- Melanie

[1] https://www.postgresql.org/message-id/CAAKRu_Yj%3DyrL%2BgGGsqfYVQcYn7rDp6hDeoF1vN453JDp8dEY%2Bw%40mail.gmail.com
