Re: AIO v2.3 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: AIO v2.3
Date
Msg-id 4lzvwcwcqms4jdz3lbuvqgo5lypgsqqzs4iggcnw4kdw7ifj4s@z2ewby2sgcbh
In response to Re: AIO v2.3  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Hi,

On 2025-02-11 11:48:38 +1300, Thomas Munro wrote:
> Would the API be better like this?:  When you want to create a batch
> of I/Os submitted together, you wrap the work in pgaio_begin_batch()
> and pgaio_submit_batch(), eg the loop in read_stream_lookahead().

One annoying detail is that an API like this would afaict need resowner
support or something along those lines (e.g. xact callbacks plus code in each
aux process' sigsetjmp() block). Otherwise I don't know how we would ensure
that the "batch-is-in-progress" flag/counter would get reset.
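
To illustrate the concern, here is a toy, self-contained C model, not PostgreSQL code. Every `toy_` name is hypothetical, `longjmp()` stands in for an error thrown with ereport(ERROR), and the `reset_on_error` branch stands in for whatever cleanup mechanism (resowner, xact callback, sigsetjmp() block) would have to exist. It only shows that a global "batch in progress" flag stays set after an error unless some recovery path resets it.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are PostgreSQL symbols. */
static bool toy_batch_in_progress = false;
static jmp_buf toy_error_context;

static void
toy_begin_batch(void)
{
    /* In this model a nested batch is treated as a bug. */
    assert(!toy_batch_in_progress);
    toy_batch_in_progress = true;
}

static void
toy_submit_batch(void)
{
    assert(toy_batch_in_progress);
    toy_batch_in_progress = false;
}

/* Stand-in for an error: jump back to the recovery point. */
static void
toy_raise_error(void)
{
    longjmp(toy_error_context, 1);
}

/*
 * Start a batch that fails before it is submitted. Returns the value of
 * the "batch in progress" flag after error recovery has run.
 */
static bool
toy_run_failing_batch(bool reset_on_error)
{
    toy_batch_in_progress = false;

    if (setjmp(toy_error_context) == 0)
    {
        toy_begin_batch();
        toy_raise_error();      /* does not return */
        toy_submit_batch();     /* never reached */
    }
    else if (reset_on_error)
    {
        /* Models the cleanup hook that would have to be added. */
        toy_batch_in_progress = false;
    }

    return toy_batch_in_progress;
}
```

In this model, `toy_run_failing_batch(false)` leaves the flag set, so the next `toy_begin_batch()` would trip the assertion; `toy_run_failing_batch(true)` leaves it cleared.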

Alternatively we could make pgaio_begin_batch() basically start a critical
section, but that doesn't seem like a good idea, because too much that needs
to happen around buffered IO isn't compatible with critical sections.


Does anybody see a need for batches to be nested? I'm inclined to think that
that would be indicative of bugs and should therefore error/assert out.


One way we could avoid the need for a mechanism to reset-batch-in-progress
would be to make batch submission controlled by a flag on the IO. Something
like
    pgaio_io_set_flag(ioh, PGAIO_HF_BATCH_SUBMIT)

IFF PGAIO_HF_BATCH_SUBMIT is set, the IOs would need to be explicitly
submitted using something like the existing
    pgaio_submit_staged();
(although renaming it to something with batch in the name might be
appropriate).

That way there's no explicit "we are in a batch" state that needs to be reset
in case of errors.
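
A rough sketch of how the flag-based variant might behave, again as a toy, self-contained C model rather than the actual AIO code. The `Toy`/`toy_` names, the fixed-size staging array, and the choice that staging an unflagged IO also flushes anything staged earlier are all assumptions made for illustration; only the idea of a per-IO flag plus an explicit submit call comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_STAGED      8
#define TOY_HF_BATCH_SUBMIT 0x01    /* models PGAIO_HF_BATCH_SUBMIT */

/* Hypothetical stand-in for an IO handle. */
typedef struct ToyIoHandle
{
    unsigned    flags;
    bool        submitted;
} ToyIoHandle;

static ToyIoHandle *toy_staged[TOY_MAX_STAGED];
static size_t toy_num_staged = 0;

/* Models pgaio_io_set_flag(ioh, flag). */
static void
toy_io_set_flag(ToyIoHandle *ioh, unsigned flag)
{
    ioh->flags |= flag;
}

/* Models pgaio_submit_staged(): submit everything staged so far. */
static size_t
toy_submit_staged(void)
{
    size_t      nsubmitted = toy_num_staged;

    for (size_t i = 0; i < nsubmitted; i++)
        toy_staged[i]->submitted = true;
    toy_num_staged = 0;

    return nsubmitted;
}

/*
 * Stage an IO. An IO without the batch flag is submitted right away
 * (together with anything staged earlier, by assumption); an IO with the
 * flag waits for an explicit toy_submit_staged() call.
 */
static void
toy_io_stage(ToyIoHandle *ioh)
{
    if (toy_num_staged == TOY_MAX_STAGED)
        toy_submit_staged();    /* staging array full, flush first */

    toy_staged[toy_num_staged++] = ioh;

    if (!(ioh->flags & TOY_HF_BATCH_SUBMIT))
        toy_submit_staged();
}
```

The only state in this model is the staging array itself. If an error interrupts the caller between two `toy_io_stage()` calls, there is no separate "in a batch" mode left behind; the already-staged IOs simply go out with the next submission.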

Greetings,

Andres Freund