Re: index prefetching - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: index prefetching
Msg-id CAH2-Wz=HJc+QV2AZ9mUY43aKL+n+a1JQ-7OGE=MOkqSAtoKJug@mail.gmail.com
In response to Re: index prefetching  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Tue, Mar 17, 2026 at 1:37 PM Peter Geoghegan <pg@bowt.ie> wrote:
> v14 has already bitrot, so I'm posting this new v15 to keep the
> patchset applying cleanly.

And now v15 has bitrot. So here's a new v16. There are 2 notable updates here.

Most notably, I decided to take a more conservative approach to fixing
the issues with the selfuncs.c mechanism that gives up when
get_actual_variable_range does too much work during planning. Rather
than inventing a new approach that bases the decision to give up on
the number of leaf pages accessed, which has certain advantages [1], I
have decided to cut scope and more or less reimplement the existing
VISITED_PAGES_LIMIT mechanism on the heapam side. This was a pragmatic
choice; there is no clear consensus on whether the design I proposed
for this will be acceptable, and we lack the time to resolve those
issues now.

To recap, selfuncs.c can no longer be the place where
VISITED_PAGES_LIMIT is implemented, because the new slot-based
interface won't return until it finds a visible tuple. The new
higher-level interface simply cannot support callers that want to
count the number of heap pages examined themselves.
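As a rough illustration of the point above, here is a toy model (not code
from the patch) of a "get next visible tuple" routine that enforces its own
page budget. All names here (DemoEntry, DemoScan, demo_scan_init,
demo_getnext_visible, page_limit) are hypothetical. The caller only regains
control once a visible tuple is found or the scan stops, so the counting has
to happen inside the callee:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an index entry that points at a heap page. */
typedef struct DemoEntry
{
    int         heap_page;      /* heap page the entry points to */
    bool        visible;        /* is the tuple visible to our snapshot? */
} DemoEntry;

/* Hypothetical scan state; the page budget lives on the callee side. */
typedef struct DemoScan
{
    const DemoEntry *entries;
    size_t      nentries;
    size_t      next;           /* next entry to examine */
    int         last_page;      /* last heap page visited, -1 if none */
    int         pages_visited;  /* heap page visits (page changes) so far */
    int         page_limit;     /* give up after this many; 0 = no limit */
    bool        gave_up;        /* set when the budget was exhausted */
} DemoScan;

static void
demo_scan_init(DemoScan *scan, const DemoEntry *entries, size_t nentries,
               int page_limit)
{
    assert(page_limit >= 0);
    scan->entries = entries;
    scan->nentries = nentries;
    scan->next = 0;
    scan->last_page = -1;
    scan->pages_visited = 0;
    scan->page_limit = page_limit;
    scan->gave_up = false;
}

/*
 * Returns true and sets *result_page once a visible tuple is found.
 * Returns false when the entries are exhausted, or when visiting another
 * heap page would exceed page_limit (gave_up is set in that case).
 */
static bool
demo_getnext_visible(DemoScan *scan, int *result_page)
{
    while (scan->next < scan->nentries)
    {
        const DemoEntry *e = &scan->entries[scan->next];

        if (e->heap_page != scan->last_page)
        {
            /* About to visit a different heap page: check the budget. */
            if (scan->page_limit > 0 &&
                scan->pages_visited >= scan->page_limit)
            {
                scan->gave_up = true;
                return false;
            }
            scan->last_page = e->heap_page;
            scan->pages_visited++;
        }

        scan->next++;
        if (e->visible)
        {
            *result_page = e->heap_page;
            return true;
        }
    }
    return false;
}
```

In this sketch a planner-side caller would pass a nonzero page_limit and
treat gave_up as "fall back to the existing estimate", while an ordinary
scan would pass 0 and never give up.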

The other notable change aims to improve performance for cached
workloads through more aggressive specialization of callback routines
like heapam_index_getnext_slot. There are now a total of 4 versions:

1. One for index-only scans + amgetbatch.
2. One for index-only scans + amgettuple.
3. One for plain index scans + amgetbatch.
4. One for plain index scans + amgettuple.

I've also aggressively used pg_attribute_always_inline (without forced
inlining, performance ends up worse rather than better).
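For readers unfamiliar with the technique, here is a minimal standalone
sketch of this kind of specialization. It is not the patch's code: the
DEMO_ALWAYS_INLINE macro merely stands in for pg_attribute_always_inline,
and every demo_* name is hypothetical. One always-inline body takes its
flags as compile-time constants from four thin wrappers, so each wrapper can
be compiled with the untaken branches removed:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for pg_attribute_always_inline (assumption: GCC or Clang). */
#if defined(__GNUC__) || defined(__clang__)
#define DEMO_ALWAYS_INLINE static inline __attribute__((always_inline))
#else
#define DEMO_ALWAYS_INLINE static inline
#endif

/* Hypothetical counters so the effect of each variant is observable. */
typedef struct DemoCounters
{
    int         heap_fetches;
    int         batch_reads;
    int         tuple_reads;
} DemoCounters;

/*
 * Shared body. Every caller passes constant flags, so after inlining the
 * compiler can drop the branches that a given variant never takes.
 * (Toy simplification: an index-only scan never touches the heap here.)
 */
DEMO_ALWAYS_INLINE int
demo_getnext_impl(DemoCounters *c, bool index_only, bool use_batch)
{
    if (use_batch)
        c->batch_reads++;
    else
        c->tuple_reads++;

    if (!index_only)
        c->heap_fetches++;

    /* Encode which variant ran: bit 1 = index-only, bit 0 = batch. */
    return (index_only ? 2 : 0) + (use_batch ? 1 : 0);
}

/* The four specialized entry points, one per combination. */
static int
demo_getnext_ios_batch(DemoCounters *c)
{
    return demo_getnext_impl(c, true, true);
}

static int
demo_getnext_ios_tuple(DemoCounters *c)
{
    return demo_getnext_impl(c, true, false);
}

static int
demo_getnext_plain_batch(DemoCounters *c)
{
    return demo_getnext_impl(c, false, true);
}

static int
demo_getnext_plain_tuple(DemoCounters *c)
{
    return demo_getnext_impl(c, false, false);
}

typedef int (*DemoGetNext) (DemoCounters *c);

/* Pick the variant once, at scan setup, rather than branching per tuple. */
static DemoGetNext
demo_choose_getnext(bool index_only, bool use_batch)
{
    if (index_only)
        return use_batch ? demo_getnext_ios_batch : demo_getnext_ios_tuple;
    return use_batch ? demo_getnext_plain_batch : demo_getnext_plain_tuple;
}
```

The trade-off in this sketch is the same one mentioned below: four copies of
the body in exchange for fewer per-tuple branches.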

There might be some concerns about the distributed costs of increased
code size. But this revision is the fastest so far across all
available metrics (though of course the specialization work is
focussed on cached query performance specifically).

[1] https://www.postgresql.org/message-id/flat/CAH2-Wzkt1WkKp4VRJu3qHfmKXc8W%2BXYv1RXg5d2d3fSvAeO%3Drg%40mail.gmail.com
--
Peter Geoghegan
