Re: index prefetching - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: index prefetching
Date
Msg-id ffcc1472-604e-48cf-afd1-25a245d3df4a@vondra.me
In response to Re: index prefetching  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers

On 8/28/25 23:50, Thomas Munro wrote:
> On Fri, Aug 29, 2025 at 7:52 AM Andres Freund <andres@anarazel.de> wrote:
>> On 2025-08-28 19:08:40 +0200, Tomas Vondra wrote:
>>> From the 2x regression (compared to master) it might seem like that, but
>>> even with the increased distance it's still slower than master (by 25%). So
>>> maybe the "error" is to use AIO in these cases, instead of just switching to
>>> I/O done by the backend.
>>
>> If it's slower at a higher distance, we're missing something.
> 
> Enough io_workers?  What kind of I/O concurrency does it want?  Does
> wait_event show any backends doing synchronous IO?  How many does [1]
> want to run for that test workload and does it help?
> 

I'm not sure how to determine what concurrency it "wants". All I know is
that for "warm" runs [1], the basic index prefetch patch uses distance
~2.0 on average, and is ~2x slower than master. And with the patches the
distance is ~270, and it's 30% slower than master. (IIRC there's about
30% misses, so 270 is fairly high. Can't check now, the machine is
running other tests.)

Not sure about wait events, but I don't think any backends are doing
synchronous I/O. There's only that one query running, and it's using AIO
(except for the index, which is still read synchronously).

Likewise, I don't think the number of workers is insufficient. I've
tried with 3 and 12 workers, and there's virtually no difference between
those. IIRC when watching "top", I've never seen more than 1 or maybe 2
workers active (using CPU).

[1] https://www.postgresql.org/message-id/attachment/180630/ryzen-warm.pdf

[2]
https://www.postgresql.org/message-id/293a4735-79a4-499c-9a36-870ee9286281%40vondra.me

> FWIW there's a very simple canned latency test in a SQL function in
> the first message in that thread (0005-XXX-read_buffer_loop.patch),
> just on the off-chance that it's useful as a starting point for other
> ideas.  There I was interested in IPC overheads, latch collapsing and
> other effects, so I was deliberately stalling on/evicting a single
> block repeatedly without any readahead distance, so I wasn't letting
> the stream "hide" IPC overheads.
> 
> [1]
> https://www.postgresql.org/message-id/flat/CA%2BhUKG%2Bm4xV0LMoH2c%3DoRAdEXuCnh%2BtGBTWa7uFeFMGgTLAw%2BQ%40mail.gmail.com

Interesting, I'll give it a try tomorrow. Do you recall if the results
were roughly in line with results of my signal IPC test?


regards

-- 
Tomas Vondra



