Re: index prefetching - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: index prefetching
Msg-id 18334df6-4b9c-4f1b-b8f2-fda58eb428d5@vondra.me
In response to Re: index prefetching  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers

On 12/18/25 15:45, Andres Freund wrote:
> Hi,
> 
> On 2025-12-18 15:40:59 +0100, Tomas Vondra wrote:
>> The technical reason is that batch_getnext() does this:
>>
>>   /* Delay initializing stream until reading from scan's second batch */
>>   if (priorbatch && !scan->xs_heapfetch->rs && !batchqueue->disabled &&
>>       enable_indexscan_prefetch)
>>       scan->xs_heapfetch->rs =
>>           read_stream_begin_relation(READ_STREAM_DEFAULT, NULL,
>>                                      ....);
>>
>> which means we only create the read_stream (which is what enables the
>> prefetching) only when creating the second batch. And with LIMIT 100 we
>> likely read just a single leaf page (=batch) most of the time, which
>> means no read_stream and thus no prefetching.
> 
> Why is the logic tied to the number of batches, rather the number of items in
> batches? It's not hard to come up with scenarios where having to wait for ~100
> random pages will be the majority of the queries IO wait... It makes sense to
> not initialize readahead if we just fetch an entry or two, but after that?
> 

Because the number of items in a batch does not tell you much about
prefetching either. It does not say how many of the TIDs (or rather the
heap pages they point to) are already in cache, and it does not say what
the access pattern is. It also does not say what distance the
read_stream will converge to (maybe it drops to 1 or 2).

Maybe it's too defensive, of course. I recall we discussed various other
heuristics, but our #1 goal was to not cause regressions against master
(or at least not too many). It doesn't mean we can't improve this later.
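
To make the LIMIT 100 case concrete, here is a simplified standalone
sketch of the "second batch" rule. It is not the patch code: SketchScan
and the sketch_* functions are hypothetical stand-ins, the per-batch item
counts are made-up inputs, and a boolean replaces the actual
read_stream_begin_relation() call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the scan state (not PostgreSQL's structs). */
typedef struct SketchScan
{
    bool        has_stream;     /* stands in for scan->xs_heapfetch->rs != NULL */
    bool        queue_disabled; /* stands in for batchqueue->disabled */
    bool        guc_enabled;    /* stands in for enable_indexscan_prefetch */
    int         batches_read;   /* batches (leaf pages) loaded so far */
} SketchScan;

/* Same shape as the quoted condition: a prior batch must already exist. */
static bool
sketch_should_start_stream(const SketchScan *scan)
{
    bool        priorbatch = scan->batches_read > 0;

    return priorbatch && !scan->has_stream &&
        !scan->queue_disabled && scan->guc_enabled;
}

/*
 * Simulate a scan that stops after "limit" items, assuming every batch
 * holds "items_per_batch" items. Returns true if the stream was ever
 * started, i.e. if prefetching could have happened at all.
 */
static bool
sketch_scan_uses_stream(int limit, int items_per_batch)
{
    SketchScan  scan = {false, false, true, 0};
    int         returned = 0;

    assert(limit > 0 && items_per_batch > 0);

    while (returned < limit)
    {
        /* The check runs when the next batch is requested. */
        if (sketch_should_start_stream(&scan))
            scan.has_stream = true;

        scan.batches_read++;
        returned += items_per_batch;
    }

    return scan.has_stream;
}
```

With, say, 200 items per leaf page, sketch_scan_uses_stream(100, 200)
never starts the stream, because the scan ends within the first batch.
With 50 items per page the second batch is requested and the stream gets
created. Neither outcome depends on whether the heap pages were cached,
which is the part an item-count threshold would not capture either.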


regards

-- 
Tomas Vondra



