Hi hackers!
heap_xlog_visible does not bump the heap page LSN when setting the
all-visible flag on it.
There is a long comment explaining why:
/*
 * We don't bump the LSN of the heap page when setting the visibility
 * map bit (unless checksums or wal_hint_bits is enabled, in which
 * case we must), because that would generate an unworkable volume of
 * full-page writes. This exposes us to torn page hazards, but since
 * we're not inspecting the existing page contents in any way, we
 * don't care.
 *
 * However, all operations that clear the visibility map bit *do* bump
 * the LSN, and those operations will only be replayed if the XLOG LSN
 * follows the page LSN. Thus, if the page LSN has advanced past our
 * XLOG record's LSN, we mustn't mark the page all-visible, because
 * the subsequent update won't be replayed to clear the flag.
 */
But it is still not clear to me that not bumping the LSN here is
correct when wal_log_hints is set.
In that case the VM page will have a larger LSN than the heap page,
because visibilitymap_set bumps the LSN of the VM page. This means
that, in theory, after recovery we may have a page that is marked
all-visible in the VM but does not have PD_ALL_VISIBLE set in its page
header. That would violate the VM constraint:
 * When we *set* a visibility map during VACUUM, we must write WAL. This may
 * seem counterintuitive, since the bit is basically a hint: if it is clear,
 * it may still be the case that every tuple on the page is visible to all
 * transactions; we just don't know that for certain. The difficulty is that
 * there are two bits which are typically set together: the PD_ALL_VISIBLE bit
 * on the page itself, and the visibility map bit. If a crash occurs after the
 * visibility map page makes it to disk and before the updated heap page makes
 * it to disk, redo must set the bit on the heap page. Otherwise, the next
 * insert, update, or delete on the heap page will fail to realize that the
 * visibility map bit must be cleared, possibly causing index-only scans to
 * return wrong answers.
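
To make the LSN states concrete, here is a toy model of the replay rule
described in the first comment. It is only a sketch under my own
assumptions: ToyPage, toy_needs_redo and toy_redo_visible are
hypothetical names, not PostgreSQL code, and the model ignores buffers,
full-page images and checksums. It only shows that replay sets the heap
flag without stamping the record's LSN on the heap page, while the VM
page does get the record's LSN.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these types and functions are hypothetical. */
typedef uint64_t ToyLSN;

typedef struct ToyPage
{
    ToyLSN lsn;         /* LSN of the last change stamped on the page */
    bool   all_visible; /* PD_ALL_VISIBLE on a heap page, or the VM bit */
} ToyPage;

/* A record is replayed against a page only if the page LSN precedes it. */
static bool
toy_needs_redo(const ToyPage *page, ToyLSN record_lsn)
{
    return page->lsn < record_lsn;
}

/* Simplified replay of a "set all-visible" record on both pages. */
static void
toy_redo_visible(ToyPage *heap, ToyPage *vm, ToyLSN record_lsn)
{
    /*
     * Heap page: set the flag, but deliberately leave the LSN alone.
     * If the heap page LSN is already past the record, skip the update.
     */
    if (toy_needs_redo(heap, record_lsn))
        heap->all_visible = true;

    /* VM page: set the bit and stamp the record's LSN on the page. */
    if (toy_needs_redo(vm, record_lsn))
    {
        vm->all_visible = true;
        vm->lsn = record_lsn;
    }
}
```

In this model, replaying a record with LSN 200 against a heap page and a
VM page that both have LSN 100 leaves both flags set, with the heap page
still at LSN 100 and the VM page at LSN 200. If the heap page is already
at LSN 300, the heap flag is left untouched by this record.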