Re: Two issues leading to discrepancies in FSM data on the standby server - Mailing list pgsql-hackers

From Alexey Makhmutov
Subject Re: Two issues leading to discrepancies in FSM data on the standby server
Msg-id 60b79f39-69e5-4c73-a708-6ef1fd5e7980@postgrespro.ru
In response to Re: Two issues leading to discrepancies in FSM data on the standby server  (Andrey Borodin <x4mmm@yandex-team.ru>)
List pgsql-hackers
Hi Andrey!

Thank you for the attention to this patch!

> Originally in e981653 was used MarkBufferDirty() but 96ef3b8 flipped to MarkBufferDirtyHint().
> Neither of these commits provided a comment on why this version was chosen. I think if we fix it we must comment things.

I think the reason for the change in 96ef3b8 (from 'MarkBufferDirty' to 
'MarkBufferDirtyHint') may be described in the next commit (9df56f6), 
in its README update:
> New WAL records cannot be written during recovery, so hint bits set
> during recovery must not dirty the page if the buffer is not already
> dirty, when checksums are enabled.  Systems in Hot-Standby mode may
> benefit from hint bits being set, but with checksums enabled, a page
> cannot be dirtied after setting a hint bit (due to the torn page risk).
> So, it must wait for full-page images containing the hint bit updates to
> arrive from the master.

So it seems logical that any change to data not protected by WAL 
(which includes VM and FSM as well) should use MarkBufferDirtyHint, 
which does not set the dirty flag during recovery when checksums are 
enabled. However, since FSM blocks can simply be zeroed in case of a 
checksum mismatch, I think it's perfectly fine to use the regular 
MarkBufferDirty here.
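
To make the difference concrete, here is a toy model of the two calls. 
It is not PostgreSQL code: the 'ToyBuffer' and 'ToyServerState' types 
and the 'toy_*' functions are hypothetical names invented for this 
sketch, and the real MarkBufferDirtyHint checks more conditions than 
shown. It only illustrates the rule quoted from the README above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a shared buffer; only the dirty flag is modeled. */
typedef struct
{
    bool dirty;
} ToyBuffer;

/* Hypothetical stand-in for the server state relevant to the README rule. */
typedef struct
{
    bool in_recovery;        /* true on a standby replaying WAL */
    bool checksums_enabled;  /* true if data checksums are on */
} ToyServerState;

/* Models MarkBufferDirty: the buffer is dirtied unconditionally. */
static void
toy_mark_dirty(ToyBuffer *buf)
{
    buf->dirty = true;
}

/*
 * Models the README rule for hint-style updates: with checksums enabled,
 * a clean buffer must not be dirtied during recovery, because no WAL
 * record (and so no full-page image) can be written to protect it.
 */
static void
toy_mark_dirty_hint(ToyBuffer *buf, const ToyServerState *state)
{
    if (buf->dirty)
        return;             /* already dirty, nothing changes */

    if (state->checksums_enabled && state->in_recovery)
        return;             /* change stays in memory only */

    buf->dirty = true;
}
```

In this model, an FSM update made on a standby through 
'toy_mark_dirty_hint' leaves a clean buffer clean, so the change may 
never be written out, while 'toy_mark_dirty' always schedules it for 
writing.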

I've updated the first patch by adding a comment that explains why 
MarkBufferDirty is used here instead of MarkBufferDirtyHint.

As for the second issue and its patch: it seems to be resolved in 
current master by a881cc9, which removed the entire 'heap_xlog_visible' 
function, since all-visibility information is now sent with the 
XLOG_HEAP2_PRUNE_VACUUM_CLEANUP record and its handler already uses 
PageGetHeapFreeSpace. The problem is still relevant for pre-19 
versions, so I will probably move it to a separate thread on the bugs 
list.

Thanks,
Alexey