Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) - Mailing list pgsql-hackers

From Chao Li
Subject Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Date
Msg-id C3AB3F5B-626E-4AAA-9529-23E9A20C727F@gmail.com
Whole thread Raw
In response to Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
List pgsql-hackers

> On Mar 3, 2026, at 23:52, Melanie Plageman <melanieplageman@gmail.com> wrote:
>
>
>> Otherwise, if prstate->pagefrz.FreezePageConflictXid is still possibly be InvalidTransactionId, then the Assert
shouldbe changed to something like: 
>>
>> Assert(prstate->pagefrz.FreezePageConflictXid == InvalidTransactionId ||
>>  TransactionIdPrecedesOrEquals(prstate->pagefrz.FreezePageConflictXid, prstate->cutoffs->OldestXmin)
>
> This is covered by TransactionIdPrecedesOrEquals because
> InvalidTransactionId is 0. We assume that in many places throughout
> the code.
>

I understood that TransactionIdPrecedesOrEquals(InvalidTransactionId, prstate->cutoffs->OldestXmin) is true, but that
wouldleave an impression to code readers that prstate->pagefrz.FreezePageConflictXid could not be InvalidTransactionId.
ThusI think my version explicitly tells that prstate->pagefrz.FreezePageConflictXid could be InvalidTransactionId at
thepoint. 


>> I will continue with 0005 tomorrow.
>

4 - 0005
```
  * Caller must have pin on the buffer, and must *not* have a lock on it.
  */
 void
-heap_page_prune_opt(Relation relation, Buffer buffer)
+heap_page_prune_opt(Relation relation, Buffer buffer, Buffer *vmbuffer)
```

I don’t see why vmbuffer has to be of pointer type. Buffer type is underlying int, I checked the last commit, vmbuffer
onlypasses in data into the function without passing out anything. 

As we add the new parameter vmbuffer, though it’s not used in this commit, I think it’d be better to update the header
committo explain what this parameter will do. 

5  - 0006
```
+ *
+ * heap_fix_vm_corruption() makes changes to the VM and, potentially, the heap
+ * page, but it does not need to be done in a critical section because
+ * clearing the VM is not WAL-logged.
+ */
+static void
+heap_fix_vm_corruption(PruneState *prstate, OffsetNumber offnum)
```

Nit: why the last paragraph of the header comments uses the function name instead of “this function”? Looks like a
copy-pasto.

6 - 0006
```
+        if (prstate->lpdead_items > 0)
+        {
+            ereport(WARNING,
+                    (errcode(ERRCODE_DATA_CORRUPTED),
+                     errmsg("LP_DEAD item found on page marked as all-visible"),
+                     errdetail("relation \"%s\", page %u, tuple %u",
+                               RelationGetRelationName(prstate->relation),
+                               prstate->block, offnum)));
+        }
+        else
+        {
+            ereport(WARNING,
+                    (errcode(ERRCODE_DATA_CORRUPTED),
+                     errmsg("tuple not visible to all found on page marked as all-visible"),
+                     errdetail("relation \"%s\", page %u, tuple %u",
+                               RelationGetRelationName(prstate->relation),
+                               prstate->block, offnum)));
+        }
```

I recently just learned that a detail message should use complete sentences, and end each with a period, and capitalize
thefirst word of sentences. See https://www.postgresql.org/docs/current/error-style-guide.html. 

7 - 0006
```
+    else if (prstate->vmbits & VISIBILITYMAP_VALID_BITS)
+    {
+        /*
+         * As of PostgreSQL 9.2, the visibility map bit should never be set if
+         * the page-level bit is clear.  However, it's possible that the bit
+         * got cleared after heap_vac_scan_next_block() was called, so we must
+         * recheck with buffer lock before concluding that the VM is corrupt.
+         */
+        ereport(WARNING,
+                (errcode(ERRCODE_DATA_CORRUPTED),
+                 errmsg("page %u in \"%s\" is not marked all-visible but visibility map bit is set",
+                        prstate->block,
+                        RelationGetRelationName(prstate->relation))));
+    }
```

The comment says “we must recheck with buffer lock before…”, but it only log a warning message. Is the comment stale?

8 - 0007
```
+static void
+heap_page_bypass_prune_freeze(PruneState *prstate, PruneFreezeResult *presult)
+{
+    OffsetNumber maxoff = PageGetMaxOffsetNumber(prstate->page);
+    Page        page = prstate->page;
+
+    Assert(prstate->vmbits & VISIBILITYMAP_ALL_FROZEN ||
+           (prstate->vmbits & VISIBILITYMAP_ALL_VISIBLE &&
+            !prstate->attempt_freeze));
+
+    /* We'll fill in presult for the caller */
+    memset(presult, 0, sizeof(PruneFreezeResult));
+
+    /*
+     * Since the page is all-visible, a count of the normal ItemIds on the
+     * page should be sufficient for vacuum's live tuple count.
+     */
+    for (OffsetNumber off = FirstOffsetNumber;
+         off <= maxoff;
+         off = OffsetNumberNext(off))
+    {
+        if (ItemIdIsNormal(PageGetItemId(page, off)))
+            prstate->live_tuples++;
+    }
+
+    presult->live_tuples = prstate->live_tuples;
+
+    /* Clear any stale prune hint */
+    if (TransactionIdIsValid(PageGetPruneXid(page)))
+    {
+        PageClearPrunable(page);
+        MarkBufferDirtyHint(prstate->buffer, true);
+    }
+
+    presult->vmbits = prstate->vmbits;
+
+    if (!PageIsEmpty(page))
+        presult->hastup = true;
+}
```

* Given this function has done PageIsEmpty(page), that that is true, we don’t need to count live_tuples, right? That
couldbe a tiny optimization. 
* I see heap_page_bypass_prune_freeze() is only called in one place and immediately after prune_freeze_setup() and
heap_fix_vm_corruption(),so prstate->vmbits must be 0, so do we need to do presult->vmbits = prstate->vmbits;? 
* Do we need to set all_visible and all_frozen to presult?

0008 LGTM

I will continue with 0009 tomorrow.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







pgsql-hackers by date:

Previous
From: Xuneng Zhou
Date:
Subject: Re: astreamer_lz4: fix bug of output pointer advancement in decompressor
Next
From: Alexander Lakhin
Date:
Subject: Re: BUG: Former primary node might stuck when started as a standby