Re: Don't synchronously wait for already-in-progress IO in read stream - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Don't synchronously wait for already-in-progress IO in read stream
Date
Msg-id CAAKRu_YV0D_9wWgGkHdWgQsooqmcc08Uqb3tqxzrwcDXHajDtA@mail.gmail.com
Whole thread
In response to Re: Don't synchronously wait for already-in-progress IO in read stream  (Nazir Bilal Yavuz <byavuz81@gmail.com>)
Responses Re: Don't synchronously wait for already-in-progress IO in read stream
List pgsql-hackers
Thanks for the continued review!

Attached v5 adds some comments to the tests, fixes a few nits in the
actual code, and adds a commit to fix what I think is an existing
off-by-one error in TRACE_POSTGRESQL_BUFFER_READ_DONE.

On Fri, Mar 6, 2026 at 8:18 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>
> > For test_repeated_blocks, the first test:
> >
> >     # test miss of the same block twice in a row
> >     $psql->query_safe(
> >         qq/
> > SELECT evict_rel('largeish');
> > /);
> >     $psql->query_safe(
> >         qq/
> > SELECT * FROM read_stream_for_blocks('largeish', ARRAY[0, 2, 2, 4, 4]);
> > /);
> >     ok(1, "$io_method: stream missing the same block repeatedly");
> >
> > It says that it will miss the same block repeatedly, is that because
> > we won't start a read for any of the blocks until after
> > read_stream_get_block has returned all of them? If so, could be
> > clearer in the comment. Not everyone understands all the read stream
> > internals.
>
> I think we start a read of blocks because we hit stream->distance but
> it doesn't affect any consecutive same block numbers. What I
> understood is:
>
> Since io_combine_limit is 1, there won't be any IO combining.
>
> 0th block (0), miss, distance is 1; StartReadBuffersImpl() and
> WaitReadBuffers() are called for 0th block.
> 1th block (2), miss, distance is 2, StartReadBuffersImpl() is called.
> 2th block (2), miss, distance is 2, StartReadBuffersImpl() and
> WaitReadBuffers() are called 1th block.
> 3th block (4), miss, distance is 4, StartReadBuffersImpl() is called.
> 4th block (4), miss, distance is 4, StartReadBuffersImpl() and
> WaitReadBuffers() are called 2, 3 and 4th blocks.

Makes sense. I've tried to add a comment to this effect.

> > I know a lot of other tests do this, but I find it so hard to read the
> > test with the SQL is totally left-aligned like that -- especially with
> > comments interspersed. You can easily flow the queries on multiple
> > lines and indent it more.
>
> I agree with you.

I did reflow the SQL. It does mean there will be a bunch of extra
whitespace sent to the server. Other tests do this, though. I wonder
how much it affects performance...

> > For test_repeated_blocks, the second test:
> >
> >     # test hit of the same block twice in a row
> >     $psql->query_safe(
> >         qq/
> > SELECT evict_rel('largeish');
> > /);
> >     $psql->query_safe(
> >         qq/
> > SELECT * FROM read_stream_for_blocks('largeish', ARRAY[0, 1, 2, 3, 4,
> > 5, 6, 5, 4, 3, 2, 1, 0]);
> > /);
> >     ok(1, "$io_method: stream accessing same block");
> >
> > I assume that the second access of 2 is a hit because we actually did
> > IO for the first one (unlike in the earlier case)?
>
> I think so but to clarify, all second access of [2, 1, 0] blocks are hit; right?

Yes. I tried expanding the comment to elaborate, but it just came out
awkward, so I left it the way it is.

> > For test_inject_foreign, the 3rd test:
> >
> >     # Test read stream encountering two buffers that are undergoing the same
> >     # IO, started by another backend
> >
> > I see that psql_b is requesting 3 blocks which can be combined into 1
> > IO, which makes it different than the 1st foreign IO test case:
> >
> >     ###
> >     # Test read stream encountering buffers undergoing IO in another backend,
> >     # with the other backend's reads succeeding.
> >     ###
> >
> > where psql_b only requests 1 but I don't really see how these are
> > covering different code. Maybe if the read stream one (psql_a) is
> > doing a combined IO it might exercise slightly different code, but
> > otherwise I don't get it.
>
> I think the main difference is that:
>
> >     ###
> >     # Test read stream encountering buffers undergoing IO in another backend,
> >     # with the other backend's reads succeeding.
> >     ###
>
> SELECT array_agg(blocknum) FROM read_stream_for_blocks('largeish',
> ARRAY[0, 2, 5, 7]);
>
> We need to join waiting block number 5 and then start another IO for
> block number 7.
>
> >     # Test read stream encountering two buffers that are undergoing the same
> >     # IO, started by another backend
>
> SELECT array_agg(blocknum) FROM read_stream_for_blocks('largeish',
> ARRAY[0, 2, 4]);
>
> We need to join waiting block number 2 but after waiting for an IO, IO
> for block number 4 should be already completed too. We don't need to
> start IO like the other case.

Ah, makes sense. Thanks!

- Melanie

Attachment

pgsql-hackers by date:

Previous
From: Matthias van de Meent
Date:
Subject: Re: Adding REPACK [concurrently]
Next
From: Michael Paquier
Date:
Subject: Re: Add starelid, attnum to pg_stats and leverage this in pg_dump