Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) - Mailing list pgsql-hackers
| From | Chao Li |
|---|---|
| Subject | Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) |
| Date | |
| Msg-id | 6BC5DBAB-6084-4BB8-8450-52E9648AB021@gmail.com Whole thread Raw |
| In response to | Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) (Melanie Plageman <melanieplageman@gmail.com>) |
| List | pgsql-hackers |
> On Dec 20, 2025, at 05:09, Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> Attached v29 addresses some feedback and also corrects a small error
> with the assertion I had added in the previous version's 0009.
>
> On Thu, Dec 18, 2025 at 10:38 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>>
>> I’ve done a basic review of patches 1 and 2. Here are some comments
>> which may be somewhat immature, as this is a fairly large change set
>> and I’m new to some parts of the code.
>>
>> 1) Potential stale old_vmbits after VM repair n v2
>
> Good catch! I've fixed this in attached v29.
>
>> 2) Add Assert(BufferIsDirty(buf))
>>
>> Since the patch's core claim is "buffer must be dirty before WAL
>> registration", an assertion encodes this invariant. Should we add:
>>
>> Assert(BufferIsValid(buf));
>> Assert(BufferIsDirty(buf));
>>
>> right before the visibilitymap_set() call?
>
> There are already assertions that will trip in various places -- most
> importantly in XLogRegisterBuffer(), which is the one that inspired
> this refactor.
>
>> The comment at lines:
>>> "The only scenario where it is not already dirty is if the VM was removed…"
>>
>> This phrasing could become misleading after future refactors. Can we
>> make it more direct like:
>>
>>> "We must mark the heap buffer dirty before calling visibilitymap_set(), because it may WAL-log the buffer and
XLogRegisterBuffer()requires it."
>
> I see your point about future refactors missing updating comments like
> this. But, I don't think we are going to refactor the code such that
> we can have PD_ALL_VISIBLE set without the VM bits set more often.
> Also, it is common practice in Postgres to describe very specific edge
> cases or odd scenarios in order to explain code that may seem
> confusing without the comment. It does risk that comment later
> becoming stale, but it is better that future developers understand why
> the code is there.
>
> That being said, I take your point that the comment is confusing. I
> have updated it in a different way.
>
>>> "Even if PD_ALL_VISIBLE is already set, we don't need to worry about unnecessarily dirtying the heap buffer, as it
mustbe marked dirty before adding it to the WAL chain. The only scenario where it is not already dirty is if the VM was
removed..."
>>
>> In this test we now call MarkBufferDirty() on the heap page even when
>> only setting the VM, so the comments claiming “does not need to modify
>> the heap buffer”/“no heap page modification” might be misleading. It
>> might be better to say the test doesn’t need to modify heap
>> tuples/page contents or doesn’t need to prune/freeze.
>
> The point I'm trying to make is that we have to dirty the buffer even
> if we don't modify the page because of the XLOG sub-system
> requirements. And, it may seem like a waste to do that if not
> modifying the page, but the page will rarely be clean anyway. I've
> tried to make this more clear in attached v29.
>
> - Melanie
>
<v29-0001-Combine-visibilitymap_set-cases-in-lazy_scan_pru.patch><v29-0002-Eliminate-use-of-cached-VM-value-in-lazy_scan_pr.patch><v29-0003-Refactor-lazy_scan_prune-VM-clear-logic-into-hel.patch><v29-0004-Set-the-VM-in-heap_page_prune_and_freeze.patch><v29-0005-Move-VM-assert-into-prune-freeze-code.patch><v29-0006-Eliminate-XLOG_HEAP2_VISIBLE-from-vacuum-phase-I.patch><v29-0007-Eliminate-XLOG_HEAP2_VISIBLE-from-empty-page-vac.patch><v29-0008-Remove-XLOG_HEAP2_VISIBLE-entirely.patch><v29-0009-Simplify-heap_page_would_be_all_visible-visibili.patch><v29-0010-Use-GlobalVisState-in-vacuum-to-determine-page-l.patch><v29-0011-Unset-all_visible-sooner-if-not-freezing.patch><v29-0012-Track-which-relations-are-modified-by-a-query.patch><v29-0013-Pass-down-information-on-table-modification-to-s.patch><v29-0014-Allow-on-access-pruning-to-set-pages-all-visible.patch><v29-0015-Set-pd_prune_xid-on-insert.patch>
A few more comments on v29:
1 - 0002 - Looks like since 0002, visibilitymap_set()’s return value is no longer used, so do we need to update the
functionand change return type to void? I remember in some patches, to address Coverity alerts, people had to do
“(void)function_with_a_return_value()”.
2 - 0003
```
+ * Helper to correct any corruption detected on an heap page and its
```
Nit: “an” -> “a”
3 - 0003
```
+static bool
+identify_and_fix_vm_corruption(Relation rel, Buffer heap_buffer,
+ BlockNumber heap_blk, Page heap_page,
+ int nlpdead_items,
+ Buffer vmbuffer,
+ uint8 vmbits)
+{
+ Assert(visibilitymap_get_status(rel, heap_blk, &vmbuffer) == vmbits);
```
Right before this function is called:
```
old_vmbits = visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer);
+ if (identify_and_fix_vm_corruption(vacrel->rel, buf, blkno, page,
+ presult.lpdead_items, vmbuffer,
+ old_vmbits))
```
So, the Assert() is checking if old_vmbits is newly returned from visibilitymap_get_status(), in that case,
identify_and_fix_vm_corruption()can take vmbits as a pointer , and it calls visibilitymap_get_status() to get vmbits
itselfand returns vmbits via the pointer, so that we don’t need to call visibilitymap_get_status() twice.
4 - 0004
```
+ * These are only set if the HEAP_PAGE_PRUNE_UPDATE_VM option is set and
+ * we have attempted to update the VM.
+ */
+ uint8 new_vmbits;
+ uint8 old_vmbits;
```
The comment feels a little confusing to me. "HEAP_PAGE_PRUNE_UPDATE_VM option is set” is a clear indication, but how to
decide"we have attempted to update the VM”? By reading the code:
```
+ prstate->attempt_update_vm =
+ (params->options & HEAP_PAGE_PRUNE_UPDATE_VM) != 0;
```
It’s just the result of HEAP_PAGE_PRUNE_UPDATE_VM being set. So, maybe we don’t the “and” part.
5 - 0004
```
+ * Returns true if one or both VM bits should be set, along with returning the
+ * current value of the VM bits in *old_vmbits and the desired new value of
+ * the VM bits in *new_vmbits.
+ */
+static bool
+heap_page_will_set_vm(PruneState *prstate,
+ Relation relation,
+ BlockNumber heap_blk, Buffer heap_buffer, Page heap_page,
+ Buffer vmbuffer,
+ int nlpdead_items,
+ uint8 *old_vmbits,
+ uint8 *new_vmbits)
+{
+ if (!prstate->attempt_update_vm)
+ return false;
```
old_vmbits and new_vmbits are purely output parameters. So, maybe we should set them to 0 inside this function instead
ofrelying on callers to initialize them.
I think this is a similar case where I raised a comment earlier about initializing presult to {0} in the callers, and
youonly wanted to set presult in heap_page_prune_and_freeze().
6 - 0004
```
@@ -823,13 +975,19 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
MultiXactId *new_relmin_mxid)
{
Buffer buffer = params->buffer;
+ Buffer vmbuffer = params->vmbuffer;
Page page = BufferGetPage(buffer);
+ BlockNumber blockno = BufferGetBlockNumber(buffer);
PruneState prstate;
bool do_freeze;
bool do_prune;
bool do_hint_prune;
+ bool do_set_vm;
bool did_tuple_hint_fpi;
int64 fpi_before = pgWalUsage.wal_fpi;
+ uint8 new_vmbits = 0;
+ uint8 old_vmbits = 0;
+
/* Initialize prstate */
```
Nit: an extra empty line is added.
7 - 0005
```
- * HEAP_PAGE_PRUNE_FREEZE indicates that we will also freeze tuples, and
- * will return 'all_visible', 'all_frozen' flags to the caller.
+ * HEAP_PAGE_PRUNE_FREEZE indicates that we will also freeze tuples
```
Nit: a tailing dot is needed in the end of the comment line.
8 - 0005
```
@@ -978,6 +1003,7 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
Buffer vmbuffer = params->vmbuffer;
Page page = BufferGetPage(buffer);
BlockNumber blockno = BufferGetBlockNumber(buffer);
+ TransactionId vm_conflict_horizon = InvalidTransactionId;
```
I guess the variable name “vm_conflict_horizon” comes from the old "presult->vm_conflict_horizon”. But in the new
logic,this variable is used more generic, for example Assert(debug_cutoff == vm_conflict_horizon). I see 0006 has
renamedto “conflict_xid”, so it’s up to you if or not rename it. But to make the commit self-contained, I’d suggest
renamingit.
9 - 0006
```
@@ -3537,6 +3537,7 @@ heap_page_would_be_all_visible(Relation rel, Buffer buf,
{
ItemId itemid;
HeapTupleData tuple;
+ TransactionId dead_after = InvalidTransactionId;
```
This initialization seems to not needed, as HeapTupleSatisfiesVacuumHorizon() will always set a value to it.
10 - 0010
```
+ * there is any snapshot that still consider the newest xid on
```
Nit: consider -> considers
11 - 0011
```
+ * page. If we won't attempt freezing, just unset all-visible now, though.
*/
+ if (!prstate->attempt_freeze)
+ {
+ prstate->all_visible = false;
+ prstate->all_frozen = false;
+ }
```
The comment says “just unset all-visible”, but the code actually also unset all_frozen.
12 - 0012
```
+ /*
+ * RT indexes of relations modified by the query either through
+ * UPDATE/DELETE/INSERT/MERGE or SELECT FOR UPDATE
+ */
+ Bitmapset *es_modified_relids;
```
As we intentionally only want indexes, does it make sense to just name the field es_modified_rtindexes to make it more
explicit.
13 - 0012
```
+ /* If it has a rowmark, the relation is modified */
+ estate->es_modified_relids = bms_add_member(estate->es_modified_relids,
+ rc->rti);
```
I think this comment is a little misleading, because SELECT FOR UPDATE/SHARE doesn’t always modify tuples of the
relation.If a reader not associating this code with this patch, he may consider the comment is wrong. So, I think we
shouldmake the comment more explicit. Maybe rephrase like “If it has a rowmark, the relation may modify or lock heap
pages”.
14 - 0015 - commit message
```
Setting pd_prune_xid on insert can cause a page to be dirtied and
written out when it previously would not have been, affetcting the
```
Typo: affetcting -> affecting
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
pgsql-hackers by date: