On 06/03/2026 20:55, Robert Haas wrote:
> On Tue, Mar 3, 2026 at 11:50 PM Amul Sul <sulamul@gmail.com> wrote:
>> I think the fix will be to correct the wal summary entry that records
>> an incorrect truncation limit for the VM fork. Attached are the
>> patches: 0001 is a refactoring patch that moves the necessary macro
>> definitions from visibilitymap.c to visibilitymap.h to correctly
>> calculate the VM fork limit recorded in the wal summary file, and 0002
>> provides the actual fix.
>
> I don't like exposing those macros to the rest of the system, because
> they're not named very generically.
>
> Also, I don't think that using HEAPBLK_TO_MAPBLOCK() directly is
> correct. That macro returns the visibility map page that contains the
> VM bit for the indicated heap block number, but that's not what we
> need here. For example, if we truncate the heap to a length of 1,
> HEAPBLK_TO_MAPBLOCK() will return 0, but if we set the limit block for
> the VM to zero, that means the VM was truncated away entirely, which
> in this scenario wouldn't be the case. So, instead of computing too
> large a value for the VM's limit block, I think your patch would cause
> us to compute too small a value for the VM's limit block.
>
> The question we need to answer here is: if we entire remove all heap
> blocks >= N, then for what value of M do we remove all visibility map
> blocks >= M? The answer is that if the the N (the heap block limit) is
> a multiple of HEAPBLOCKS_PER_PAGE, then it's just
> N/HEAPBLOCKS_PER_PAGE. Otherwise, it's one more, because we need an
> extra VM page as soon as it's necessary to use at least one bit on
> that page. So, basically, we need to divide by HEAPBLOCKS_PER_PAGE and
> round up.
>
> So here's my attempt at a patch. Instead of exposing
> HEAPBLK_TO_MAPBLOCK() etc., I added a new helper function,
> visibilitymap_truncation_length. It's just a wrapper around a new
> macro HEAPBLK_TO_MAPBLOCK_LIMIT(), but I think keeping the exposed
> symbols having visibilitymap in the name is a good idea. Also, I added
> a test case, and I have verified that this test case passes with the
> fix and fails without it.
For 'master', I wonder if we should change the 'xl_smgr_create' record
format to directly include the new sizes of all of the truncated forks.
It's always felt error-prone to me that the redo function recomputes
those. It's a little more complicated for the redo function, as it needs
to also clear out part of the last remaining page, but that information
could also be included in the WAL record directly.
- Heikki