Hi,
We currently do not set pd_prune_xid (the oldest prunable XID) when
replaying XLOG_HEAP2_PRUNE* records. We've never done this, AFAICT.
Since 8.3, this comment has been in the pruning redo function:
  * Note: we don't worry about updating the page's prunability hints.
  * At worst this will cause an extra prune cycle to occur soon.
During normal operation, when a page has no prunable tuples, we set
pd_prune_xid to InvalidTransactionId. But during recovery, the old
value is left behind.
When the page is then set all-visible in the VM, we end up with a page
marked all-visible whose prune hint still claims there are prunable
tuples. On the standby, this triggers an unnecessary prune cycle on
almost every all-visible page the next time it is accessed. However, I
think the page being in this confusing state is the bigger problem.
It's not incorrect, but it seems like it could mask actual page
corruption (e.g. when there are dead tuples and we mistakenly set the
page all-visible).
Fixing this would require adding the prune xid to the WAL record.
UPDATE/DELETE WAL records don't have to include the new prune xid
because they set the page prune hint to the xlog record's transaction
ID.
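
To make the difference between the two record types concrete, here is a
rough standalone sketch. Everything in it is a mock: MockPage,
MockPruneRecord, new_prune_xid, and both mock_*_redo functions are
hypothetical names for illustration, not the real page header, WAL
record layout, or redo routines. It only shows that the prune record
would need to carry the hint explicitly, while an UPDATE/DELETE redo can
derive it from the record's own XID.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Mock page header: only the field relevant to this discussion. */
typedef struct MockPage
{
    TransactionId pd_prune_xid;
} MockPage;

/* Hypothetical prune record that carries the new hint explicitly. */
typedef struct MockPruneRecord
{
    TransactionId new_prune_xid;    /* assumed extra field, not in today's record */
} MockPruneRecord;

/* Prune redo: with the extra field, the hint can simply be copied over. */
static void
mock_prune_redo(MockPage *page, const MockPruneRecord *rec)
{
    page->pd_prune_xid = rec->new_prune_xid;
}

/*
 * UPDATE/DELETE redo: no extra field is needed because the hint comes
 * from the record's own transaction ID. Mock simplification: keep the
 * older hint using a plain integer comparison, with no XID wraparound
 * handling.
 */
static void
mock_modify_redo(MockPage *page, TransactionId record_xid)
{
    assert(record_xid != InvalidTransactionId);

    if (page->pd_prune_xid == InvalidTransactionId ||
        record_xid < page->pd_prune_xid)
        page->pd_prune_xid = record_xid;
}
```
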
If we don't think the overhead of the extra transaction ID in the WAL
record is worth it, we could instead set the prune hint to
InvalidTransactionId during recovery if the page is all-visible. This
would at least avoid that confusing page state.
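
A minimal sketch of that cheaper alternative, again with mock types
only: MockHintPage, MOCK_ALL_VISIBLE, mock_hint_state_confusing, and
mock_redo_fix_prune_hint are hypothetical names, and the flag value is
made up rather than taken from the real page header. The idea is just
that redo clears the hint whenever the page ends up all-visible, so the
"all-visible but prunable" combination can't survive replay.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Mock flag bit; the value is illustrative only. */
#define MOCK_ALL_VISIBLE 0x0001

/* Mock page header with just the two pieces of state involved. */
typedef struct MockHintPage
{
    uint16_t      flags;
    TransactionId pd_prune_xid;
} MockHintPage;

/* True for the confusing state: all-visible, yet hinting at prunable tuples. */
static bool
mock_hint_state_confusing(const MockHintPage *page)
{
    return (page->flags & MOCK_ALL_VISIBLE) != 0 &&
           page->pd_prune_xid != InvalidTransactionId;
}

/*
 * Fallback applied at the end of redo: if the page is all-visible,
 * reset the prune hint. Pages that are not all-visible keep whatever
 * stale hint they had, which is the existing behavior.
 */
static void
mock_redo_fix_prune_hint(MockHintPage *page)
{
    if (page->flags & MOCK_ALL_VISIBLE)
        page->pd_prune_xid = InvalidTransactionId;

    assert(!mock_hint_state_confusing(page));
}
```

This doesn't restore an accurate hint for pages that are not
all-visible; it only rules out the contradictory combination.
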
- Melanie