Re: index prefetching - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: index prefetching
Date
Msg-id CAH2-WzkqnVGLEQ31W1vm8T_uzy-ma-6A8QL-C56=0QUqs12b=Q@mail.gmail.com
In response to Re: index prefetching  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: index prefetching
List pgsql-hackers
On Mon, Nov 11, 2024 at 1:33 PM Robert Haas <robertmhaas@gmail.com> wrote:
> That makes sense from the point of view of working with the btree code
> itself, but from a system-wide perspective, it's weird to pretend like
> the pins don't exist or don't matter just because a buffer lock is
> also held.

I can see how that could cause confusion. If you're working on nbtree
all day long, it becomes natural, though. Both points are true, and
relevant to the discussion.

I prefer to over-communicate when discussing these points -- it's too
easy to talk past each other here. I think that the precise reasons
why the index AM does things with buffer pins will need to be put on a
more rigorous and formalized footing with Tomas' patch. The different
requirements/safety considerations will have to be carefully teased
apart.

> I had actually forgotten that the btree code tends to
> pin+lock together; now that you mention it, I remember that I knew it
> at one point, but it fell out of my head a long time ago...

The same thing appears to be largely true of hash, which mostly uses
_hash_getbuf + _hash_relbuf (hash's idiosyncratic use of cleanup locks
notwithstanding).

To be fair, it does look like GiST's gistdoinsert function holds onto
multiple buffer pins at a time, for its own reasons -- index AM
reasons. But this looks to be more or less an optimization to deal
with navigating the tree with a loose index order, where multiple
descents and ascents are absolutely expected. (This makes it a bit
like the nbtree "drop lock but not pin" case that I mentioned in my
last email.)

It's not as if these gistdoinsert buffer pins persist across calls to
amgettuple, though, so for the purposes of this discussion about the
new batch API to replace amgettuple they are not relevant -- they
don't actually undermine my point. (Though to be fair their existence
does help to explain why you found my characterization of buffer pins
as irrelevant to index AMs confusing.)

The real sign that what I said is generally true of index AMs is that
you'll see so few calls to
LockBufferForCleanup/ConditionalLockBufferForCleanup. Only hash calls
ConditionalLockBufferForCleanup at all (which I find a bit weird).
Neither GiST nor SP-GiST calls either function -- even during VACUUM. So
GiST and SP-GiST make clear that index AMs (that support only MVCC
snapshot scans) can easily get by without any use of cleanup locks
(and with no externally significant use of buffer pins).
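
To make the pin-versus-lock distinction concrete, here is a minimal
sketch of a toy, single-process model. It is not PostgreSQL code:
ToyBuffer and every toy_* function are hypothetical names invented for
illustration, the content lock is reduced to a single flag, and nothing
ever blocks. The one rule it borrows from the real buffer manager is
that a cleanup lock can only be taken when the caller's own pin is the
only pin on the buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared buffer; NOT PostgreSQL's BufferDesc. */
typedef struct ToyBuffer
{
    int  pin_count;   /* pins held by all (imaginary) backends combined */
    bool locked;      /* content lock held? (no shared/exclusive modes) */
} ToyBuffer;

static void
toy_pin(ToyBuffer *buf)
{
    buf->pin_count++;
}

static void
toy_unpin(ToyBuffer *buf)
{
    assert(buf->pin_count > 0);
    buf->pin_count--;
}

static void
toy_unlock(ToyBuffer *buf)
{
    assert(buf->locked);
    buf->locked = false;
}

/* Pin and lock as one unit, loosely in the spirit of _hash_getbuf. */
static void
toy_getbuf(ToyBuffer *buf)
{
    toy_pin(buf);
    assert(!buf->locked);   /* the toy model never waits for a lock */
    buf->locked = true;
}

/* Drop lock and pin as one unit, loosely in the spirit of _hash_relbuf. */
static void
toy_relbuf(ToyBuffer *buf)
{
    toy_unlock(buf);
    toy_unpin(buf);
}

/*
 * Non-blocking cleanup lock attempt.  The caller must already hold its
 * own pin.  Succeeds only if that pin is the sole pin on the buffer and
 * the content lock is free.
 */
static bool
toy_conditional_cleanup_lock(ToyBuffer *buf)
{
    assert(buf->pin_count > 0);
    if (buf->locked || buf->pin_count != 1)
        return false;
    buf->locked = true;
    return true;
}
```

In this model, a toy_getbuf/toy_relbuf pair leaves no pin behind, so a
later toy_conditional_cleanup_lock by the sole pinner succeeds. Calling
toy_getbuf followed only by toy_unlock (the "drop lock but not pin"
pattern) leaves a second pin in place, and the cleanup lock attempt
fails until toy_unpin is called. That lingering pin is the only kind
that is visible to anybody outside the index AM.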

> > I think that this is exactly what I propose to do, said in a different
> > way. (Again, I wouldn't have expressed it in this way because it seems
> > obvious to me that buffer pins don't have nearly the same significance
> > to an index AM as they do to heapam -- they have no value in
> > protecting the index structure, or helping an index scan to reason
> > about concurrency that isn't due to a heapam issue.)
> >
> > Does that make sense?
>
> Yeah, it just really throws me for a loop that you're using "pin" to
> mean "pin at a time when we don't also hold a lock."

I'll try to be more careful about that in the future, then.

> The fundamental
> purpose of a pin is to prevent a buffer from being evicted while
> someone is in the middle of looking at it, and nothing that uses
> buffers can possibly work correctly without that guarantee. Everything
> you've written in parentheses there is, AFAICT, 100% wrong if you mean
> "any pin" and 100% correct if you mean "a pin held without a
> corresponding lock."

I agree.

--
Peter Geoghegan


