Re: Streaming I/O, vectored I/O (WIP) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Streaming I/O, vectored I/O (WIP)
Date
Msg-id 56bc4ac4-9a9e-4f28-af1e-b6d537bd9a5e@iki.fi
Whole thread Raw
In response to Re: Streaming I/O, vectored I/O (WIP)  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Streaming I/O, vectored I/O (WIP)
List pgsql-hackers
Some quick comments:

On 12/03/2024 15:02, Thomas Munro wrote:
> src/backend/storage/aio/streaming_read.c
> src/include/storage/streaming_read.h

Standard file header comments missing.

It would be nice to have a comment at the top of streaming_read.c, 
explaining at a high level how the circular buffer, lookahead and all 
that works. Maybe even some diagrams.

For example, what is head and what is tail? Before reading the code, I 
assumed that 'head' was the next block range to return in 
pg_streaming_read_buffer_get_next(). But I think it's actually the other 
way round?

> /*
>  * Create a new streaming read object that can be used to perform the
>  * equivalent of a series of ReadBuffer() calls for one fork of one relation.
>  * Internally, it generates larger vectored reads where possible by looking
>  * ahead.
>  */
> PgStreamingRead *
> pg_streaming_read_buffer_alloc(int flags,
>                                void *pgsr_private,
>                                size_t per_buffer_data_size,
>                                BufferAccessStrategy strategy,
>                                BufferManagerRelation bmr,
>                                ForkNumber forknum,
>                                PgStreamingReadBufferCB next_block_cb)

I'm not a fan of the name, especially the 'alloc' part. Yeah, most of 
the work it does is memory allocation. But I'd suggest something like 
'pg_streaming_read_begin' instead.

Do we really need the pg_ prefix in these?

> Buffer
> pg_streaming_read_buffer_get_next(PgStreamingRead *pgsr, void **per_buffer_data)

Maybe 'pg_streaming_read_next_buffer' or just 'pg_streaming_read_next', 
for a shorter name.


> 
>     /*
>      * pgsr->ranges is a circular buffer.  When it is empty, head == tail.
>      * When it is full, there is an empty element between head and tail.  Head
>      * can also be empty (nblocks == 0), therefore we need two extra elements
>      * for non-occupied ranges, on top of max_pinned_buffers to allow for the
>      * maxmimum possible number of occupied ranges of the smallest possible
>      * size of one.
>      */
>     size = max_pinned_buffers + 2;

I didn't understand this explanation for why it's + 2.

>     /*
>      * Skip the initial ramp-up phase if the caller says we're going to be
>      * reading the whole relation.  This way we start out doing full-sized
>      * reads.
>      */
>     if (flags & PGSR_FLAG_FULL)
>         pgsr->distance = Min(MAX_BUFFERS_PER_TRANSFER, pgsr->max_pinned_buffers);
>     else
>         pgsr->distance = 1;

Should this be "Max(MAX_BUFFERS_PER_TRANSFER, 
pgsr->max_pinned_buffers)"? max_pinned_buffers cannot be smaller than 
MAX_BUFFERS_PER_TRANSFER though, given how it's initialized earlier. So 
perhaps just 'pgsr->distance = pgsr->max_pinned_buffers' ?

-- 
Heikki Linnakangas
Neon (https://neon.tech)



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Possibility to disable `ALTER SYSTEM`
Next
From: Alexander Korotkov
Date:
Subject: Re: Read data from Postgres table pages