Re: Don't synchronously wait for already-in-progress IO in read stream - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Don't synchronously wait for already-in-progress IO in read stream
Date
Msg-id CAAKRu_ZM1epxTdt2=4-g4ff6UC09zne+0xg_gNv3d7LEcxEvNA@mail.gmail.com
Whole thread
In response to Re: Don't synchronously wait for already-in-progress IO in read stream  (Nazir Bilal Yavuz <byavuz81@gmail.com>)
Responses Re: Don't synchronously wait for already-in-progress IO in read stream
Re: Don't synchronously wait for already-in-progress IO in read stream
List pgsql-hackers
Thanks for the review!

On Thu, Feb 5, 2026 at 11:56 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>
> On Sat, 24 Jan 2026 at 00:04, Melanie Plageman
> <melanieplageman@gmail.com> wrote:
>
> > - In the 004_read_stream tests, I wonder if there is a way to test
> > that we don't wait for foreign IO until WaitReadBuffers(). We have
> > tests for the stream accessing the same block, which in some cases
> > will exercise the foreign IO path. But it doesn't distinguish between
> > the old behavior -- waiting for the IO to complete when starting read
> > IO on it -- and the new behavior -- not waiting until
> > WaitReadBuffers(). That may not be possible to test, though.
>
> Won't 'stream accessing the same block test' almost always test the
> new behavior (not waiting until WaitReadBuffers())? Having dedicated
> tests for both cases would be helpful, though.

Yea, I was thinking something like testing that if session A is
blocked completing read of block 2 and session B is requesting blocks
2-4 that buffers containing blocks 3 and 4 are valid when session B is
waiting on block 2 to finish.

I started working on something but it needed some new infrastructure
to check if the buffer is valid, and I wanted to see what others
thought first.

I did finally review Andres' test patches and have included my review
feedback here as well.

"aio: Refactor tests in preparation for more tests" (v4-0001) looks
good to me as well. I included one tiny idea AI suggested to me in a
follow-on patch (v4-0002).

> diff --git a/src/test/modules/test_aio/t/004_read_stream.pl
> b/src/test/modules/test_aio/t/004_read_stream.pl
> +foreach my $method (TestAio::supported_io_methods())
> +{
> +    $node->adjust_conf('postgresql.conf', 'io_method', 'worker');
> +    $node->start();
> +    test_io_method($method, $node);
> +    $node->stop();
> +}
>
> This seems wrong, we always test io_method=worker. I think we need to
> replace 'worker' with the $method. Also, we can add check below to the
> test_io_method function in the 004_read_stream.pl:
> ```
>     is($node->safe_psql('postgres', 'SHOW io_method'),
>         $io_method, "$io_method: io_method set correctly");

Good catch. Fixed. I also found a few other small things in this patch
(v4-0003) which I put in v4-0004.

Some ideas I had that I didn't include in v4-0003 because its Andres
patch and is subjective:

For test_repeated_blocks, the first test:

    # test miss of the same block twice in a row
    $psql->query_safe(
        qq/
SELECT evict_rel('largeish');
/);
    $psql->query_safe(
        qq/
SELECT * FROM read_stream_for_blocks('largeish', ARRAY[0, 2, 2, 4, 4]);
/);
    ok(1, "$io_method: stream missing the same block repeatedly");

It says that it will miss the same block repeatedly, is that because
we won't start a read for any of the blocks until after
read_stream_get_block has returned all of them? If so, could be
clearer in the comment. Not everyone understands all the read stream
internals.

I know a lot of other tests do this, but I find it so hard to read the
test with the SQL is totally left-aligned like that -- especially with
comments interspersed. You can easily flow the queries on multiple
lines and indent it more.

For test_repeated_blocks, the second test:

    # test hit of the same block twice in a row
    $psql->query_safe(
        qq/
SELECT evict_rel('largeish');
/);
    $psql->query_safe(
        qq/
SELECT * FROM read_stream_for_blocks('largeish', ARRAY[0, 1, 2, 3, 4,
5, 6, 5, 4, 3, 2, 1, 0]);
/);
    ok(1, "$io_method: stream accessing same block");

I assume that the second access of 2 is a hit because we actually did
IO for the first one (unlike in the earlier case)?

For test_inject_foreign:

In general, I am not ramped up enough on injection point stuff to know
if the actual new injection point stuff you added in test_aio.c is is
correct and optimal, but I did review the read stream additions to
test_aio.c and the tests added to 004_read_stream.pl.

For test_inject_foreign, the 3rd test:

    # Test read stream encountering two buffers that are undergoing the same
    # IO, started by another backend

I see that psql_b is requesting 3 blocks which can be combined into 1
IO, which makes it different than the 1st foreign IO test case:

    ###
    # Test read stream encountering buffers undergoing IO in another backend,
    # with the other backend's reads succeeding.
    ###

where psql_b only requests 1 but I don't really see how these are
covering different code. Maybe if the read stream one (psql_a) is
doing a combined IO it might exercise slightly different code, but
otherwise I don't get it.

> Nitpick but I prefer something like TrackBufferHit() or
> CountBufferHit() as a function name instead of ProcessBufferHit().
> ProcessBufferHit() gives the impression that the function is doing a
> job more than it currently does. Other than that, 0004 LGTM.

I changed this in attached v4.

- Melanie

Attachment

pgsql-hackers by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: pg_plan_advice (now with transparent SQL plan performance overrides - pg_stash_advice)
Next
From: Masahiko Sawada
Date:
Subject: Re: postgres_fdw: Use COPY to speed up batch inserts