Re: index prefetching - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: index prefetching
Msg-id CA+hUKGKFcy2WUWwU+M=SB7LiMA97av6VxN2LRBJPFg1wwh-psQ@mail.gmail.com
In response to Re: index prefetching  (Tomas Vondra <tomas@vondra.me>)
List pgsql-hackers
On Thu, Aug 14, 2025 at 9:19 AM Tomas Vondra <tomas@vondra.me> wrote:
> I did investigate this, and I don't think there's anything broken in
> read_stream. It happens because ReadStream has a concept of "ungetting"
> a block, which can happen after hitting some I/O limits.
>
> In that case we "remember" the last block (in read_stream_look_ahead
> calls read_stream_unget_block), and we return it again. It may seem as
> if read_stream_get_block() produced the same block twice, but it's
> really just the block from the last round.

Yeah, it's a bit of a tight corner in the algorithm, and I haven't
found any better solution.  It arises from this circularity:

* we need a block number from the callback before we can decide if it
can be combined with the pending read
* if we can't combine it, we need to start the pending read to get it
out of the way, so we can start a new one
* we entered this path knowing that we are allowed to start one more
IO, but if doing so reports a split then we've only made the pending
read smaller, i.e. the tail portion remains, so we still can't combine
with it, and the only way to make progress is to loop and start
another IO, and so on
* while doing that we might hit the limits on pinned buffers (only for
tiny buffer pools) or (more likely) running IOs, and then what are you
going to do with that block number?
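
To make the circularity concrete, here is a toy model of it.  This is
only a sketch and not the code in read_stream.c: every name in it
(ToyStream, toy_look_ahead, max_io_blocks, ...) is made up for
illustration, and a "split" is crudely simulated by capping each IO at
max_io_blocks blocks.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_INVALID_BLOCK (-1)

/* Toy model only: none of these names or fields come from read_stream.c. */
typedef struct ToyStream
{
    const int  *blocks;         /* block numbers the "callback" hands out */
    int         nblocks;
    int         next;           /* callback position */
    int         callback_calls; /* how often the callback really ran */
    int         unget;          /* stashed block, or TOY_INVALID_BLOCK */
    int         pending_start;  /* pending read: first block */
    int         pending_len;    /* pending read: length, 0 = none */
    int         ios_running;    /* IOs started and not yet completed */
    int         max_ios;        /* limit on running IOs */
    int         max_io_blocks;  /* cap per IO; stands in for a "split" */
} ToyStream;

void
toy_init(ToyStream *s, const int *blocks, int nblocks,
         int max_ios, int max_io_blocks)
{
    s->blocks = blocks;
    s->nblocks = nblocks;
    s->next = 0;
    s->callback_calls = 0;
    s->unget = TOY_INVALID_BLOCK;
    s->pending_start = 0;
    s->pending_len = 0;
    s->ios_running = 0;
    s->max_ios = max_ios;
    s->max_io_blocks = max_io_blocks;
}

/* Return the stashed block if there is one, else ask the callback. */
int
toy_get_block(ToyStream *s)
{
    if (s->unget != TOY_INVALID_BLOCK)
    {
        int         b = s->unget;

        s->unget = TOY_INVALID_BLOCK;
        return b;
    }
    if (s->next >= s->nblocks)
        return TOY_INVALID_BLOCK;
    s->callback_calls++;
    return s->blocks[s->next++];
}

/* Remember a block we could not use yet; only one slot is needed. */
void
toy_unget_block(ToyStream *s, int b)
{
    assert(s->unget == TOY_INVALID_BLOCK);
    s->unget = b;
}

/* Start one IO for the head of the pending read; the tail may remain. */
void
toy_start_pending(ToyStream *s)
{
    int         n = s->pending_len < s->max_io_blocks ?
        s->pending_len : s->max_io_blocks;

    printf("start IO: blocks %d..%d\n",
           s->pending_start, s->pending_start + n - 1);
    s->pending_start += n;
    s->pending_len -= n;
    s->ios_running++;
}

void
toy_look_ahead(ToyStream *s)
{
    while (s->ios_running < s->max_ios)
    {
        int         b = toy_get_block(s);

        if (b == TOY_INVALID_BLOCK)
        {
            /* End of stream: flush what we can and stop. */
            while (s->pending_len > 0 && s->ios_running < s->max_ios)
                toy_start_pending(s);
            return;
        }
        /* Consecutive with the pending read?  Then just extend it. */
        if (s->pending_len > 0 && b == s->pending_start + s->pending_len)
        {
            s->pending_len++;
            continue;
        }
        /* Can't combine: push the pending read out, possibly in pieces. */
        while (s->pending_len > 0)
        {
            if (s->ios_running >= s->max_ios)
            {
                /* Out of IO budget while still holding b: stash it. */
                toy_unget_block(s, b);
                return;
            }
            toy_start_pending(s);
        }
        s->pending_start = b;
        s->pending_len = 1;
    }
}
```

With blocks 10, 11, 12, 13, 50, a limit of one running IO and IOs
capped at two blocks, the first toy_look_ahead() call starts an IO for
10..11, is left with 12..13 still pending, and has to stash 50.  Once
that IO completes (ios_running reset to 0), the next call gets 50 back
from the stash without calling the callback again, which is what looks
like "the same block twice" from the outside.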


