Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Date
Msg-id CAAKRu_YD0ecXeAh+DmJpzQOJwcRzmMyGdcc5W_0pEF78rYSJkQ@mail.gmail.com
In response to Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)  (Kirill Reshke <reshkekirill@gmail.com>)
Responses Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
List pgsql-hackers
Thanks for the review! Updates are in attached v7.

One note on 0022 in the set, which sets pd_prune_xid on insert: the
recently added index-killtuples isolation test was failing with this
patch applied. With the patch, the "access" step reports more heap
page hits than before. After some analysis, it seems one of the heap
pages in the kill_prior_tuples table is now being pruned in an earlier
step. Somehow this leads to 4 hits being counted instead of 3 (even
though there are only 4 blocks in the relation). I recall Bertrand
mentioning in some other thread that hits can be double-counted with
AIO reads, so I'm going to try to dig that up. For now, I've modified
the test -- I believe the patch is only revealing an instrumentation
issue, not causing a bug.

On Tue, Aug 26, 2025 at 5:58 AM Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> visibilitymap_set_vmbyte is introduced in 0001 and removed in 0012.
> This is strange to me, maybe we can avoid visibilitymap_set_vmbyte in
> first place?

The reason for this is that the earlier patch introduces
visibilitymap_set_vmbyte() for one caller while the other callers
still use visibilitymap_set(). By the end of the set, all callers use
visibilitymap_set_vmbyte(), so I think it makes most sense for it to
be named visibilitymap_set() at that point. Until all callers are
converted and committed, the two functions both have to exist and need
different names.

> In 0001:
> should we add "Assert(LWLockHeldByMeInMode(BufferDescriptorGetContentLock(bufHdr),
> LW_EXCLUSIVE));" in visibilitymap_set_vmbyte?

I don't want any operations on the heap buffer (including asserts) in
visibilitymap_set_vmbyte(). The heap block is only provided to look up
the VM bits.
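
To spell that out, the heap block only feeds the usual mapping
arithmetic (a rough sketch based on the existing macros in
visibilitymap.c, not the patch itself; variable names are mine):

    uint8      *map = (uint8 *) PageGetContents(BufferGetPage(vmbuffer));
    BlockNumber mapBlock  = HEAPBLK_TO_MAPBLOCK(heapBlk); /* which VM page */
    uint32      mapByte   = HEAPBLK_TO_MAPBYTE(heapBlk);  /* byte within that page */
    uint8       mapOffset = HEAPBLK_TO_OFFSET(heapBlk);   /* bit pair within that byte */

    map[mapByte] |= (flags << mapOffset);

so the function never needs to touch the heap buffer itself.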

I think your idea is a good one for the existing visibilitymap_set(),
though, so I've added a new patch to the set (0002) which does this. I
also added a similar assertion for the vmbuffer to
visibilitymap_set_vmbyte().
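
For reference, the assertion shape I have in mind is roughly this (a
sketch, not the exact patch text -- spelling out the buffer-descriptor
lookup is my assumption; it needs storage/buf_internals.h and
storage/lwlock.h):

    Assert(BufferIsValid(buffer));
    Assert(LWLockHeldByMeInMode(BufferDescriptorGetContentLock(GetBufferDescriptor(buffer - 1)),
                                LW_EXCLUSIVE));

i.e. the check Kirill suggested, with the heap buffer checked in
visibilitymap_set() and the vmbuffer checked in
visibilitymap_set_vmbyte().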

> Also here  `Assert(visibilitymap_pin_ok(BufferGetBlockNumber(buffer),
> vmbuffer));` can be beneficial:

I had omitted this because the same logic is checked inside
visibilitymap_set_vmbyte(), with an error raised if the conditions are
not met. However, since the same is true of visibilitymap_set() and
heap_multi_insert() still asserted visibilitymap_pin_ok(), I've added
it back to my patch set.

> in heap_xlog_multi_insert:
> +
> + visibilitymap_pin(reln, blkno, &vmbuffer);
> + visibilitymap_set_vmbyte(....)
>
> Do we need to pin vmbuffer here? Looks like
> XLogReadBufferForRedoExtended already pins vmbuffer. I verified this
> with CheckBufferIsPinnedOnce(vmbuffer) just before visibilitymap_pin
> and COPY ... WITH (FREEZE true) test.

I thought the reason visibilitymap_set() did it was that the block of
the VM corresponding to the heap block could be different during
recovery than it was when the record was emitted, and thus we needed
the part of visibilitymap_pin() that releases the old vmbuffer and
gets the new one corresponding to the heap block.

I can't quite think of how this could happen though.

Assuming it can't happen, we can get rid of visibilitymap_pin() (and
add visibilitymap_pin_ok()) for both visibilitymap_set_vmbyte() and
visibilitymap_set(). I've done this for visibilitymap_set() in a
separate patch, 0001. I would like other opinions/confirmation,
though, that the block of the VM corresponding to the heap block
cannot differ during recovery from what it was when the record was
emitted during normal operation.
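
If we agree it can't differ, the redo-side callers could be reduced to
something like this (a sketch of the idea only; the arguments to
visibilitymap_set_vmbyte() are elided as in the quoted snippet, and
the pin is the one already taken by XLogReadBufferForRedoExtended()):

    /* vmbuffer was already read and pinned by XLogReadBufferForRedoExtended() */
    Assert(BufferIsValid(vmbuffer));
    Assert(visibilitymap_pin_ok(blkno, vmbuffer));

    visibilitymap_set_vmbyte(....);   /* no visibilitymap_pin() beforehand */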

> > +#ifdef TRACE_VISIBILITYMAP
> > + elog(DEBUG1, "vm_set %s %d", RelationGetRelationName(rel), heapBlk);
> > +#endif
>
> I can see this merely copy-pasted from visibilitymap_set, but maybe
> display "flags" also?

Done in attached.
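
(i.e. roughly the following, with the exact format string being my
guess rather than the patch text:)

    #ifdef TRACE_VISIBILITYMAP
        elog(DEBUG1, "vm_set %s %d flags 0x%02X",
             RelationGetRelationName(rel), heapBlk, flags);
    #endif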

> 4) visibilitymap_set receives  XLogRecPtr recptr parameters, which is
> set to WAL record lsn during recovery and to InvalidXLogRecPtr
> otherwise. visibilitymap_set manages VM page LSN bases on this recptr
> value (inside function logic). visibilitymap_set_vmbyte behaves
> vise-versa and makes its caller responsible for page LSN management.
> Maybe we should keep these two functions akin to each other?

So, with visibilitymap_set_vmbyte(), the whole idea is to just update
the VM and leave the WAL logging and other changes (marking the buffer
dirty, setting the page LSN, etc.) to the caller. The series of
operations needed to make a persistent change is up to the caller;
visibilitymap_set_vmbyte() is only meant to make sure the correct bits
in the VM are set for the given heap block.
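
In other words, the intended caller-side pattern is the standard
buffer-modification sequence, roughly (a sketch of the convention, not
the actual patch code; the WAL record construction is elided):

    START_CRIT_SECTION();

    visibilitymap_set_vmbyte(....);    /* only flips bits on the VM page */
    MarkBufferDirty(vmbuffer);

    if (RelationNeedsWAL(rel))
    {
        XLogRecPtr recptr = XLogInsert(....);   /* caller's own record */

        PageSetLSN(BufferGetPage(vmbuffer), recptr);
    }

    END_CRIT_SECTION();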

I looked at ways of making the current visibilitymap_set() API cleaner
-- like setting the heap page LSN with the VM recptr in the caller of
visibilitymap_set() instead. There wasn't a way of doing it that
seemed like enough of an improvement to merit the change.

Not to mention, the goal of the patchset is to remove the current
visibilitymap_set(), so I'm not too worried about parity between the
two functions. They may coexist for a while, but hopefully today's
visibilitymap_set() will eventually be removed.

- Melanie

Attachment
