On Mon, May 19, 2025 at 2:01 PM Tomas Vondra <tomas@vondra.me> wrote:
> For index-only scans, yes.
Great.
> The regular index scan however still have this issue, although it's not
> as visible as for IOS.
We can do somewhat better with plain index scans than my initial v1
prototype, without any major difficulties. There's more low-hanging
fruit.
We could also move the call to BufferGetLSNAtomic (that currently
takes place inside _bt_readpage) over to _bt_drop_lock_and_maybe_pin.
That way we'd only need to call BufferGetLSNAtomic for those leaf
pages that will actually need to have some index tuples returned to
the scan (through the btgettuple interface). In other words, we only
need to call BufferGetLSNAtomic for pages where _bt_readpage returns
"true". There are plenty of leaf pages that _bt_readpage will return
"false" for, especially during large range scans and skip scans.
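
To illustrate the saving, here is a rough standalone sketch (a toy
model, not actual nbtree code). Every name in it (ToyPage,
ToyCounters, toy_readpage, toy_drop_lock_and_maybe_pin, toy_scan) is
made up for illustration. It only counts how many LSN fetches happen
when the fetch lives in the readpage step versus when it is deferred
to the drop-lock step, assuming the drop-lock step runs only for pages
that had matching tuples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a leaf page; not a PostgreSQL struct. */
typedef struct ToyPage
{
    bool has_matches;   /* would the readpage step return "true"? */
} ToyPage;

/* Counters that let us compare the two placements of the LSN fetch. */
typedef struct ToyCounters
{
    int pages_read;     /* leaf pages visited by the scan */
    int lsn_fetches;    /* simulated BufferGetLSNAtomic-style calls */
} ToyCounters;

/* Models the readpage step: visit a page, report whether it matched. */
static bool
toy_readpage(const ToyPage *page, ToyCounters *c, bool fetch_lsn_here)
{
    c->pages_read++;
    if (fetch_lsn_here)
        c->lsn_fetches++;   /* current placement: paid on every page */
    return page->has_matches;
}

/* Models the drop-lock step, reached only for pages with matches. */
static void
toy_drop_lock_and_maybe_pin(ToyCounters *c, bool fetch_lsn_here)
{
    if (fetch_lsn_here)
        c->lsn_fetches++;   /* proposed placement: paid only when needed */
}

/* Walk all pages once, with the LSN fetch in one place or the other. */
static ToyCounters
toy_scan(const ToyPage *pages, int npages, bool lsn_in_readpage)
{
    ToyCounters c = {0, 0};

    assert(pages != NULL || npages == 0);

    for (int i = 0; i < npages; i++)
    {
        if (toy_readpage(&pages[i], &c, lsn_in_readpage))
            toy_drop_lock_and_maybe_pin(&c, !lsn_in_readpage);
    }
    return c;
}
```

With six toy pages of which only two have matches, calling
toy_scan(pages, 6, true) counts six LSN fetches, while
toy_scan(pages, 6, false) counts two. The model says nothing about
what each fetch costs in practice; it only shows that the number of
calls tracks the number of pages that return tuples.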
It's easy to see why this extension to my v1 POC is correct: the whole
point of dropping the leaf page pin is that we don't block VACUUM when
btgettuple returns -- but btgettuple isn't going to return until the
next call to _bt_readpage that returns "true" actually takes place (or
until the whole scan ends).
--
Peter Geoghegan