Understanding when VM record needs snapshot conflict horizon - Mailing list pgsql-hackers

From Melanie Plageman
Subject Understanding when VM record needs snapshot conflict horizon
Date
Msg-id CAAKRu_bWK8dk-ttmLnkAc05v=7HEbQVkCVWKsnWX6Rj2t5A-aw@mail.gmail.com
List pgsql-hackers
Hi,

I'm trying to understand when the visibility map WAL record
(xl_heap_visible) needs to include a snapshot conflict horizon.
Currently, when emitting an xl_heap_visible record after phase I of
vacuum, we include a snapshot conflict horizon if the page is being
newly set all-visible in the VM.

We do not include a snapshot conflict horizon in the xl_heap_visible
record if we are newly setting an already all-visible page all-frozen.
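
To make the current behavior concrete, here is a minimal sketch of the
rule as I understand it. This is a toy model, not the actual vacuum
code: xid_t, INVALID_XID, the VM_* flags, and
vm_record_conflict_horizon() are hypothetical names made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not the PostgreSQL definitions. */
typedef uint32_t xid_t;
#define INVALID_XID ((xid_t) 0)

#define VM_ALL_VISIBLE 0x01
#define VM_ALL_FROZEN  0x02

/*
 * Pick the snapshot conflict horizon for a VM-update record, following
 * the behavior described above: a horizon is included only when the
 * page is newly set all-visible in the VM. When an already all-visible
 * page is merely being set all-frozen, no horizon is included.
 */
static xid_t
vm_record_conflict_horizon(uint8_t old_vm_bits, uint8_t new_vm_bits,
                           xid_t visibility_cutoff_xid)
{
    bool newly_all_visible;

    /* Only the two known VM bits may be requested. */
    assert((new_vm_bits & ~(VM_ALL_VISIBLE | VM_ALL_FROZEN)) == 0);

    newly_all_visible = (new_vm_bits & VM_ALL_VISIBLE) != 0 &&
                        (old_vm_bits & VM_ALL_VISIBLE) == 0;

    if (newly_all_visible)
        return visibility_cutoff_xid;

    /* Already all-visible; only all-frozen is being added. */
    return INVALID_XID;
}
```

In this model, calling the function with old bits VM_ALL_VISIBLE and
new bits VM_ALL_VISIBLE | VM_ALL_FROZEN returns INVALID_XID no matter
what cutoff is passed in, which is the case I'm asking about below.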

I thought this was because if we are setting a page all-visible in the
VM, then we are likely also setting the page level hint PD_ALL_VISIBLE
and thus are likely modifying the page (and perhaps doing so without
emitting WAL), so we should include a conflict horizon in the
subsequent xl_heap_visible record to avoid recovery conflicts. There
is no page-level hint about being all-frozen.

However, there is a comment in the code that says we don't need to
include a conflict horizon when setting an already all-visible page
all-frozen because the snapshot conflict horizon sufficient to make
everything safe for REDO was logged when the page's tuples were
frozen.

That doesn't make sense to me because:
1) Isn't it possible that a page was entirely frozen but, for some
reason or other, not set all-frozen in the VM, so that we didn't
actually freeze any tuples in order to set the page all-frozen in the
VM?
2) If our inclusion of a cutoff_xid when freezing tuples is what makes
it safe to omit it from the VM update, then wouldn't the same be true
if we included a cutoff_xid when pruning a page in a way that rendered
it all-visible?

For context, I'm writing a patch to add VM update redo to the
xl_heap_prune record. In some cases, the record will contain only an
update to the VM, and I'm trying to determine when I need a snapshot
conflict horizon in the record.

- Melanie


