Re: index prefetching - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: index prefetching
Date
Msg-id 77cba7d2-5951-4f16-bfb8-f02669017747@garret.ru
Whole thread Raw
In response to Re: index prefetching  (Tomas Vondra <tomas@vondra.me>)
Responses Re: index prefetching
List pgsql-hackers
On 17/12/2025 9:54 PM, Tomas Vondra wrote:
> On 12/17/25 20:30, Andres Freund wrote:
>> Hi,
>>
>> On 2025-12-17 13:49:43 -0500, Peter Geoghegan wrote:
>>> On Wed, Dec 17, 2025 at 12:19 PM Konstantin Knizhnik <knizhnik@garret.ru> wrote:
>>>> Moreover with `enable_indexscan_prefetch=off` results are the same.
>>> It's quite unlikely that the current heuristics that trigger
>>> prefetching would have ever allowed any prefetching, for queries such
>>> as these.
>>>
>>> The exact rule right now is that we don't even begin prefetching until
>>> we've already read at least one index leaf page, and have to read
>>> another one. So it's impossible to use prefetching with a LIMIT of 1,
>>> with queries such as these. It's highly unlikely that you'd see any
>>> benefits from prefetching even with LIMIT 100 (usually we wouldn't
>>> even begin prefetching).
>> Note that due to the tuple size and fillfactor in Konstantin's workload, there
>> will be one tuple per page... That should allow for some prefetching.
>>
> Yes, but that's in the heap. The mechanism Peter described is about leaf
> pages in the index, and the index has the usual fillfactor. So there'll
> be many index entries per leaf.
>
I slightly change my benchmark setup:

create table t (pk integer, sk integer, payload text default repeat('x', 
1000)) with (fillfactor=10);
insert into t values (generate_series(1,10000000),random()*10000000);
create index on t(sk);

select.sql:

\set sk random(1, 10000000)
select * from t where sk >= :sk order by sk limit N;

You are right. There is almost no effect of prefetch for limit=100, but 
~2x times improvement for limit=1000:

eio\limit       1      100   1000
  10          11102    142    28
   0           11419    137    14

master:
limit              1     100   1000
                11480   130      13

One of the motivation of my experiments was to check that there is no 
degrade of performance because of batching.
And it is nice that there is no performance penalty here.
Still it is not quite clear to me why there is no any positive effect 
for LIMIT 100.
Reading 100 random heap pages definitely should take advantages of AIO.
We have also implemented prefetching for index only scan in Neon and 
here effect for similar query is quite noticeable (~3x times).
But in Neon architecture prices of IO is much higher because requires 
network communication with page server.




pgsql-hackers by date:

Previous
From: Bertrand Drouvot
Date:
Subject: Don't cast away const where possible
Next
From: Heikki Linnakangas
Date:
Subject: Re: Fix possible 'unexpected data beyond EOF' on replica restart