Re: AIO / read stream heuristics adjustments for index prefetching - Mailing list pgsql-hackers

From Andres Freund
Subject Re: AIO / read stream heuristics adjustments for index prefetching
Msg-id kwzdd2tiow5ai25ehbrsoo6wmiokw5vckjfxle643k6dzskdv6@c2ti7opcwsiv
In response to Re: AIO / read stream heuristics adjustments for index prefetching  (Melanie Plageman <melanieplageman@gmail.com>)
List pgsql-hackers
Hi,

On 2026-04-01 10:52:03 -0400, Melanie Plageman wrote:
> On Tue, Mar 31, 2026 at 12:02 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > 0008: WIP: read stream: Split decision about look ahead for AIO and combining
> >
> >     Until now read stream has used a single look-ahead distance to control
> >     lookahead for both IO combining and read-ahead. That's sub-optimal, as we
> >     want to do IO combining even when we don't need to do any readahead, as
> >     avoiding the syscall overhead is important to reduce CPU overhead when
> >     data is in the kernel page cache.
> >
> >     This is a prototype for what it could look like to split those
> >     decisions, thereby fixing the regression mentioned in 0006.
>
> I wonder if we need to keep the combine_limit member in the read
> stream. Could we just use io_combine_limit without ramping up and
> down? This is mainly for code complexity reasons.

I thought so at first too, but it unfortunately leads to substantial
regressions with index prefetching, due to reading ahead unnecessarily far in
cases where we really just needed one block.
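
To illustrate the difference, here is a toy sketch, not the code from the
patch. All names (ToyStream, toy_next_io_size, TOY_IO_COMBINE_LIMIT, ...) are
made up, the doubling ramp-up is an assumption about what a ramp could look
like, and ramping down is left out entirely. It only shows that a ramped
combine distance makes the first IO cover a single block, whereas always using
the full limit makes even the first IO as large as allowed:

```c
#include <assert.h>

/* Hypothetical stand-in for io_combine_limit; the value is arbitrary. */
#define TOY_IO_COMBINE_LIMIT 16

/* Toy model of a stream that ramps up its combine distance. */
typedef struct ToyStream
{
	int			combine_distance;	/* max blocks merged into the next IO */
} ToyStream;

/* Start small: the first IO covers exactly one block. */
static void
toy_stream_init(ToyStream *s)
{
	s->combine_distance = 1;
}

/*
 * Size of the next IO when blocks_available consecutive blocks could be
 * combined.  Afterwards the combine distance doubles, capped at the limit.
 */
static int
toy_next_io_size(ToyStream *s, int blocks_available)
{
	int			nblocks;

	assert(blocks_available > 0);

	nblocks = blocks_available < s->combine_distance ?
		blocks_available : s->combine_distance;

	s->combine_distance *= 2;
	if (s->combine_distance > TOY_IO_COMBINE_LIMIT)
		s->combine_distance = TOY_IO_COMBINE_LIMIT;

	return nblocks;
}

/* Alternative without ramping: always combine up to the full limit. */
static int
toy_fixed_io_size(int blocks_available)
{
	assert(blocks_available > 0);

	return blocks_available < TOY_IO_COMBINE_LIMIT ?
		blocks_available : TOY_IO_COMBINE_LIMIT;
}
```

If the consumer stops after the first block, the ramped variant has read one
block, while the fixed variant has already read TOY_IO_COMBINE_LIMIT blocks.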


> Perhaps to allow fast path reentry, we could use distance_decay_holdoff == 0
> and ios_in_progress == 0 instead of combine_distance == 0.

Somewhat orthogonal: I really dislike the code to re-enter the fast path. I've
now broken it a few times without noticing. Especially when using a lower
distance, it's easy for the gating conditions to be fulfilled even though
read_stream_look_ahead() merely decided to not *yet* do look ahead, because
there's still a pinned buffer and the distance is low.

ISTM that fast path re-entry really should only be checked after we did a
lookahead and found the block to be a buffer hit.

Greetings,

Andres Freund


