On Tue, Mar 31, 2026 at 12:02 PM Andres Freund <andres@anarazel.de> wrote:
>
> 0001+0002: Return whether WaitReadBuffers() needed to wait
>
> The first patch allows pgaio_wref_check_done() to work more reliably with
> io_uring. Until now it only was able to return true if userspace already
> had consumed the kernel's completion event, but returned false otherwise.
> That's not really incorrect, just suboptimal.
>
> The second patch returns whether WaitReadBuffers() needed to wait for
> IO. This is useful for a) instrumentation like in [2] and b) to provide
> information to the read_stream heuristics to control how aggressive to
> perform read ahead.
These both look good to me except in 0001 you left an XXX in
pgaio_wref_check_done() that I think is the very thing that commit
does.
> 0003: read_stream: Issue IO synchronously while in fast path
LGTM
> 0004: read_stream: Prevent distance from decaying too quickly
>
> There are two minor questions here:
> - Should read_stream_pause()/read_stream_resume() restore the "holdoff"
> counter? I doubt it matters for the prospective user, since it will
> only be used when the lookahead distance is very large.
I don't really understand this. We have to do this with distance
because we set it to 0 and use distance == 0 to indicate stream end.
read_stream_pause() doesn't set the distance_decay_holoff to 0. If you
mean, should we reset holdoff to its initial value, then I don't think
so. I imagine that users doing a lot of pause and resume may not have
high distance.
> - For how long to hold off distance reductions? Initially I was torn
> between using "max_pinned_buffers" (Min(max_ios * io_combine_limit,
> cap)) and "max_ios" ([maintenance_]effective_io_concurrency). But I
> think the former makes more sense, as we otherwise won't allow for far
> enough readahead when doing IO combining, and it does seem to make sense
> to hold off decay for long enough that the maximum lookahead could not
> theoretically allow us to start an IO.
I agree. 0004 LGTM otherwise.
- Melanie