On Wed, Sep 3, 2025 at 8:16 PM Andres Freund <andres@anarazel.de> wrote:
> > I don't see that level of improvement with DIO. For me it's 6054.921
> > ms with prefetching, 8766.287 ms without it.
>
> I guess your SSD has lower latency than mine...

It's nothing special: a 4-year-old Samsung 980 PRO.

> This actually might be the thing to tackle to avoid this and other similar
> regressions: If we were able to issue combined IOs for interspersed patterns
> like we have in this query, we'd easily win back the overhead. And it'd make
> DIO much, much better.

That sounds very plausible to me. I don't think it's at all unusual
for index scans to do this (that particular aspect of the test case
query wasn't unrealistic). In general this seems important to me.

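To make the idea concrete, here is a minimal standalone sketch (not
PostgreSQL code) of combining IOs for an interspersed pattern: take a
lookahead window of requested block numbers, sort it, and coalesce
adjacent blocks into ranges that could each be issued as one larger
read. BlockNum, BlockRange and combine_window are hypothetical names
made up for illustration. The sketch ignores the order in which
buffers must be handed back to the scan, which a real implementation
would have to preserve.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only, not PostgreSQL types. */
typedef uint32_t BlockNum;

typedef struct BlockRange
{
    BlockNum    start;      /* first block of the combined read */
    int         nblocks;    /* number of contiguous blocks */
} BlockRange;

static int
cmp_block(const void *a, const void *b)
{
    BlockNum    x = *(const BlockNum *) a;
    BlockNum    y = *(const BlockNum *) b;

    return (x > y) - (x < y);
}

/*
 * Coalesce a window of requested block numbers into contiguous ranges.
 * Sorts "blocks" in place and drops duplicate requests.  "out" must have
 * room for n entries.  Returns the number of ranges written.
 */
static int
combine_window(BlockNum *blocks, int n, BlockRange *out)
{
    int         nranges = 0;

    if (n == 0)
        return 0;
    assert(blocks != NULL && out != NULL);

    qsort(blocks, (size_t) n, sizeof(BlockNum), cmp_block);

    for (int i = 0; i < n; i++)
    {
        if (nranges > 0)
        {
            BlockRange *last = &out[nranges - 1];
            BlockNum    next = last->start + (BlockNum) last->nblocks;

            if (blocks[i] < next)
                continue;           /* duplicate of a block already covered */
            if (blocks[i] == next)
            {
                last->nblocks++;    /* extends the current range */
                continue;
            }
        }
        out[nranges].start = blocks[i];
        out[nranges].nblocks = 1;
        nranges++;
    }
    return nranges;
}
```

With a window such as 10, 500, 11, 501, 12, 502 (two interleaved
sequential runs), this yields two ranges of three blocks each instead
of six single-block reads.
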
> I don't quite know if this is best done as an optional feature for read
> streams, a layer atop read stream or something dedicated.

My guess is that it would work best as an optional feature for read
streams, perhaps enabled by a flag like READ_STREAM_REPEAT_READS
that's passed to read_stream_begin_relation.

> For now I'll go back to working on read stream test infrastructure. That's the
> prerequisite for testing the "don't synchronously wait for in-progress IO"
> improvement.

The "don't synchronously wait for in-progress IO" improvement is also
very important to this project. Thanks for your help with that.

> And if we want to have more complicated merging, that also seems
> like something much easier to develop with some testing infra.

Great.

--
Peter Geoghegan