Re: index prefetching - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: index prefetching
Date
Msg-id e91e1952-3867-4af6-8dc1-c0b7c2506488@vondra.me
Whole thread Raw
In response to Re: index prefetching  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: index prefetching
List pgsql-hackers

On 7/23/25 02:39, Peter Geoghegan wrote:
> On Tue, Jul 22, 2025 at 8:08 PM Andres Freund <andres@anarazel.de> wrote:
>> My response was specific to Tomas' comment that for many queries, which tend
>> to be more complicated than the toys we are using here, there will be CPU
>> costs in the query.
> 
> Got it. That makes sense.
> 
>>                     cheaper query       expensive query
>> simple readahead    8723.209 ms         10615.232 ms
>> complex readahead   5069.438 ms          8018.347 ms
>>
>> Obviously the CPU overhead in this example didn't completely eliminate the IO
>> bottleneck, but sure reduced the difference.
> 
> That's a reasonable distinction, of course.
> 
>> If your assumption is that real queries are more CPU intensive that the toy
>> stuff above, e.g. due to joins etc, you can see why the really attained IO
>> depth is lower.
> 
> Right.
> 
> Perhaps I was just repeating myself. Tomas seemed to be suggesting
> that cases where we'll actually get a decent and completely worthwhile
> improvement with the complex patch would be naturally rare, due in
> part to these effects with CPU overhead. I don't think that that's
> true at all.
> 
>> Btw, something with the batching is off with the complex patch.  I was
>> wondering why I was not seing 100% CPU usage while also not seeing very deep
>> queues - and I get deeper queues and better times with a lowered
>> INDEX_SCAN_MAX_BATCHES and worse with a higher one.
> 
> I'm not at all surprised that there'd be bugs like that. I don't know
> about Tomas, but I've given almost no thought to
> INDEX_SCAN_MAX_BATCHES specifically just yet.
> 

I think I mostly picked a value high enough to make it unlikely to hit
it in realistic cases, while also not using too much memory, and 64
seemed like a good value.

But I don't see why would this have any effect on the prefetch distance,
queue depth etc. Or why decreasing INDEX_SCAN_MAX_BATCHES should improve
that. I'd have expected exactly the opposite behavior.

Could be bug, of course. But it'd be helpful to see the dataset/query.


regards

-- 
Tomas Vondra




pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: simple patch for discussion
Next
From: Michael Paquier
Date:
Subject: Re: Custom pgstat support performance regression for simple queries