On Wed, 5 Mar 2025 at 10:04, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> On 28/02/2025 03:53, Peter Geoghegan wrote:
> > On Sat, Feb 8, 2025 at 8:47 AM Michail Nikolaev
> > <michail.nikolaev@gmail.com> wrote:
> >> Just some commit messages + few cleanups.
> >
> > I'm worried about this:
> >
> > +These longer pin lifetimes can cause buffer exhaustion with messages like "no
> > +unpinned buffers available" when the index has many pages that have similar
> > +ordering; but future work can figure out how to best work that out.
> >
> > I think that we should have some kind of upper bound on the number of
> > pins that can be acquired at any one time, in order to completely
> > avoid these problems. Solving that problem will probably require GiST
> > expertise that I don't have right now.
>
> +1. With no limit, it seems pretty easy to hold thousands of buffer pins
> with this.
>
> The index can set IndexScanDesc->xs_recheck to indicate that the quals
> must be rechecked. Perhaps we should have a similar flag to indicate
> that the visibility must be rechecked.
>
> Matthias's earlier patch
> (https://www.postgresql.org/message-id/CAEze2Wg1kbpo_Q1%3D9X68JRsgfkyPCk4T0QN%2BqKz10%2BFVzCAoGA%40mail.gmail.com)
> had a more complicated mechanism to track the pinned buffers. Later
> patch got rid of that, which simplified things a lot. I wonder if we
> need something like that, after all.
I dropped that because it effectively duplicates the current
per-backend pin tracking system. Adding it back in will probably
complicate matters by a lot again.
> Here's a completely different line of attack: Instead of holding buffer
> pins for longer, what if we checked the visibility map earlier? We could
> check the visibility map already when we construct the
> GISTSearchHeapItem, and set a flag in IndexScanDesc to tell
> IndexOnlyNext() that we have already done that. IndexOnlyNext() would
> have three cases:
I don't like integrating a heap-specific thing like VM_ALL_VISIBLE()
to indexes, but given that IOS code already uses that exact code my
dislike is not to the point of a -1. I'd like it better if we had a
TableAM API for higher-level visibility checks (e.g.
table_tids_could_be_invisible?()) which gives us those responses
instead; dropping the requirement to maintain VM in pg's preferred
format to support efficient IOS.
I am a bit worried about even more random IO happening before we've
returned even a single tuple, but that's probably much less of an
issue than "unlimited pins".
With VM-checking in the index, we would potentially have another
benefit: By checking all tids on the page at once, we can deduplicate
and reduce the VM lookups. The gains might not be all that impressive,
but could be significant in certain hot cases.
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)