Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) - Mailing list pgsql-hackers

From Chao Li
Subject Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Date
Msg-id EEA21113-8A6B-4CB3-BE49-6299F513469A@gmail.com
Whole thread Raw
In response to Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)  (Melanie Plageman <melanieplageman@gmail.com>)
List pgsql-hackers

> On Jan 7, 2026, at 01:31, Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> On Tue, Jan 6, 2026 at 4:40 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>>
>>> <v32-0014-Pass-down-information-on-table-modification-to-s.patch>
>>
>> I've tried to take an attempt to review some patches of this patchset. It's huge and mostly polished.
>
> I've added attributed your review on the patches you specifically
> mention here (and from previous emails you sent). Let me know if there
> are other patches you reviewed that you did not mention.
>
>> In a step "Pass down information on table modification to scan node" you pass SO_HINT_REL_READ_ONLY flag in
IndexNext()and BitmapTableScanSetup(), but not in IndexNextWithReorder() and IndexOnlyNext(). Is there a reason why
indexscans with ordering cannot use on-access VM setting? 
>
> Great point, I simply hadn't tested those cases and didn't think to
> add them. I've added them in attached v33.
>
> While looking at other callers of index_beginscan(), I was wondering
> if systable_beginscan() and systable_beginscan_ordered() should ever
> pass SO_HINT_REL_READ_ONLY. I guess we would need to pass if the
> operation is read-only above the index_beginscan() -- I'm not sure if
> we always know in the caller of systable_beginscan() whether this
> operation will modify the catalog. That seems like it could be a
> separate project, though, so maybe it is better to say this feature is
> just for regular tables.
>
> As for the other cases: We don't have the relation range table index
> in check_exclusion_or_unique_constraints(), so I don't think we can do
> it there.
>
> And I think that the other index scan cases like in replication code
> or get_actual_variable_endpoint() are too small to be worth it, don't
> have the needed info, or don't do on-access pruning (bc of the
> snapshot type they use).
>
>> Also, comment about visibilitymap_set() says "Callers that log VM changes separately should use visibilitymap_set()"
asif visibilitymap_set() is some other function. 
>
> Ah, yes, I forgot to remove that when I removed the old
> visibilitymap_set() and made visibilitymap_set_vmbits() into
> visiblitymap_set(). Done in v33.
>
> - Melanie
>
<v33-0001-Combine-visibilitymap_set-cases-in-lazy_scan_pru.patch><v33-0002-Eliminate-use-of-cached-VM-value-in-lazy_scan_pr.patch><v33-0003-Refactor-lazy_scan_prune-VM-clear-logic-into-hel.patch><v33-0004-Set-the-VM-in-heap_page_prune_and_freeze.patch><v33-0005-Move-VM-assert-into-prune-freeze-code.patch><v33-0006-Eliminate-XLOG_HEAP2_VISIBLE-from-vacuum-phase-I.patch><v33-0007-Eliminate-XLOG_HEAP2_VISIBLE-from-empty-page-vac.patch><v33-0008-Remove-XLOG_HEAP2_VISIBLE-entirely.patch><v33-0009-Simplify-heap_page_would_be_all_visible-visibili.patch><v33-0010-Remove-table_scan_analyze_next_tuple-unneeded-pa.patch><v33-0011-Use-GlobalVisState-in-vacuum-to-determine-page-l.patch><v33-0012-Unset-all_visible-sooner-if-not-freezing.patch><v33-0013-Track-which-relations-are-modified-by-a-query.patch><v33-0014-Pass-down-information-on-table-modification-to-s.patch><v33-0015-Allow-on-access-pruning-to-set-pages-all-visible.patch><v33-0016-Set-pd_prune_xid-on-insert.patch>

I see the same problem in 0009 and 0010:

0009
```
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -3570,6 +3570,7 @@ heap_page_would_be_all_visible(Relation rel, Buffer buf,
     {
         ItemId        itemid;
         HeapTupleData tuple;
+        TransactionId dead_after;

         /*
          * Set the offset number so that we can display it along with any
@@ -3609,12 +3610,14 @@ heap_page_would_be_all_visible(Relation rel, Buffer buf,

         /* Visibility checks may do IO or allocate memory */
         Assert(CritSectionCount == 0);
-        switch (HeapTupleSatisfiesVacuum(&tuple, OldestXmin, buf))
+        switch (HeapTupleSatisfiesVacuumHorizon(&tuple, buf, &dead_after))
         {
             case HEAPTUPLE_LIVE:
                 {
                     TransactionId xmin;

+                    Assert(!TransactionIdIsValid(dead_after));
+
                     /* Check comments in lazy_scan_prune. */
                     if (!HeapTupleHeaderXminCommitted(tuple.t_data))
                     {
@@ -3647,8 +3650,10 @@ heap_page_would_be_all_visible(Relation rel, Buffer buf,
                 }
                 break;

-            case HEAPTUPLE_DEAD:
             case HEAPTUPLE_RECENTLY_DEAD:
+                Assert(TransactionIdIsValid(dead_after));
+                /* FALLTHROUGH */
```

0010:
```
 static bool
-heapam_scan_analyze_next_tuple(TableScanDesc scan, TransactionId OldestXmin,
+heapam_scan_analyze_next_tuple(TableScanDesc scan,
                                double *liverows, double *deadrows,
                                TupleTableSlot *slot)
 {
@@ -1047,6 +1047,7 @@ heapam_scan_analyze_next_tuple(TableScanDesc scan, TransactionId OldestXmin,
         ItemId        itemid;
         HeapTuple    targtuple = &hslot->base.tupdata;
         bool        sample_it = false;
+        TransactionId dead_after;

         itemid = PageGetItemId(targpage, hscan->rs_cindex);

@@ -1069,16 +1070,20 @@ heapam_scan_analyze_next_tuple(TableScanDesc scan, TransactionId OldestXmin,
         targtuple->t_data = (HeapTupleHeader) PageGetItem(targpage, itemid);
         targtuple->t_len = ItemIdGetLength(itemid);

-        switch (HeapTupleSatisfiesVacuum(targtuple, OldestXmin,
-                                         hscan->rs_cbuf))
+        switch (HeapTupleSatisfiesVacuumHorizon(targtuple,
+                                                hscan->rs_cbuf,
+                                                &dead_after))
         {
             case HEAPTUPLE_LIVE:
                 sample_it = true;
                 *liverows += 1;
                 break;

-            case HEAPTUPLE_DEAD:
             case HEAPTUPLE_RECENTLY_DEAD:
+                Assert(TransactionIdIsValid(dead_after));
+                /* FALLTHROUGH */
```

I believe the reason why we add Assert(TransactionIdIsValid(dead_after)) under HEAPTUPLE_RECENTLY_DEAD is to ensure
thatwhen HeapTupleSatisfiesVacuumHorizon() returns HEAPTUPLE_RECENTLY_DEAD, dead_after must be set. So the goal of the
assertis to catch bugs of HeapTupleSatisfiesVacuumHorizon(). 

From this perspective, I now feel dead_after should be initialized to InvalidTransactionId. Otherwise, say
HeapTupleSatisfiesVacuumHorizon()has a bug and miss to set dead_after, then the assert mostly like won’t be fired,
becauseit holds a random value, most likely not be 0. 

I know this comment conflicts to one of my previous comments, sorry about that. As I read this patch once and again, I
amgetting more understanding to it.  

0014
```
+    /* set if the query doesn't modify the rel */
+    SO_HINT_REL_READ_ONLY = 1 << 10,
```

Nit: I think it’s better to replace “rel” to “relation”. For a function comment, if there is a parameter named “rel”,
thenwe can use it to refer to the parameter, without such a context, I guess here a while word is better. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







pgsql-hackers by date:

Previous
From: Michael Banck
Date:
Subject: Re: [PATCH] Expose checkpoint reason to completion log messages.
Next
From: Neil Chen
Date:
Subject: Re: Is this a memory leak in libpqrcv_processTuples()?