Thread: Re: AIO v2.4

Re: AIO v2.4

From
Andres Freund
Date:
Hi,

Attached is v2.4 of the AIO patchset.

Changes:

- Introduce "batchmode", while not in batchmode, IOs get submitted immediately.

  Thomas didn't like how this worked previously, and while this was a
  surprisingly large amount of work, I agree that it looks better now.

  I vaccilated a bunch on the naming. For now it's

  extern void pgaio_enter_batchmode(void);
  extern void pgaio_exit_batchmode(void);

  I did adjust the README and wrote a reasonably long comment above enter:

https://github.com/anarazel/postgres/blob/a324870186ddff9a31b10472b790eb4e744c40b3/src/backend/storage/aio/aio.c#L931-L960


- Batchmode needs to be exited in case of errors, for that

  - a new pgaio_after_error() call has been added to all the relevant places

  - xact.c calls to aio have been (re-)added to check that there are no
    in-progress batches / unsubmitted IOs at the end of a transaction.

    Before that I had just removed at-eoxact "callbacks" :)

    This checking has holes though:
    https://postgr.es/m/upkkyhyuv6ultnejrutqcu657atw22kluh4lt2oidzxxtjqux3%40a4hdzamh4wzo

    Because this only means that we will not detect all buggy code, rather
    than misbehaving for correct code, I think this may be ok for now.


- Renamed aio_init.h to aio_subsys.h

  The newly added pgaio_after_error() calls would have required including
  aio.h in a good bit more places that won't themselves issue AIO. That seemed
  wrong.  There already was a aio_init.h to avoid needing to include aio.h in
  places like ipci.c, but it seemed wrong to put pgaio_after_error() in
  aio_init.h.  So I renamed it to aio_subsys.h - not sure that's the best
  name, but I can live with it.


- Now that Thomas submitted the necessary read_stream.c improvements, the
  prior big TODO about one StartReadBuffers() call needing to start many IOs
  has been addressed.

  Thomas' thread: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com

  For now I've also included Thomas patches in my queue, but they should get
  pushed independently.  Review comments specific to those patches probably
  are better put on the other thread.

  Thomas' patches also fix several issues that were addressed in my WIP
  adjustments to read_stream.c.  There are a few left, but it does look
  better.

  The included commits are 0003-0008.


- I rewrote the tests into a tap test. That was exceedingly painful. Partially
  due to tap infrastructure bugs on windows that would sometimes cause
  inscrutable failures, see
  https://www.postgresql.org/message-id/wmovm6xcbwh7twdtymxuboaoarbvwj2haasd3sikzlb3dkgz76%40n45rzycluzft

  I just pushed that fix earlier today.


- Added docs for new GUCs, moved them to a more appropriate section

  See also https://postgr.es/m/x3tlw2jk5gm3r3mv47hwrshffyw7halpczkfbk3peksxds7bvc%40lguk43z3bsyq


- If IO workers fail to reopen the file for an IO, the IO is now marked as
  failed. Previously we'd just hang.

  To test this I added an injection point that triggers the failure. I don't
  know how else this could be tested.


- Added liburing dependency build documentation


- Added check hook to ensure io_max_concurrency = isn't set to 0 (-1 is for
  auto-config)


- Fixed that with io_method == sync we'd issue fadvise calls when not
  appropriate, that was a consequence of my hacky read_stream.c changes.


- Renamed some the aio<->bufmgr.c interface functions. Don't think they're
  quite perfect, but they're in later patches, so I don't want to focus too
  much on them rn.


- Comment improvements etc.


- Got rid of an unused wait event and renamed other wait events to make more
  sense.


- Previously the injection points were added as part of the test patch, I now
  moved them into the commits adding the code being tested. Was too annoying
  to edit otherwise.


Todo:

- there's a decent amount of FIXMEs in later commits related to ereport(LOG)s
  needing relpath() while in a critical section.  I did propose a solution to
  that yesterday:

  https://postgr.es/m/h3a7ftrxypgxbw6ukcrrkspjon5dlninedwb5udkrase3rgqvn%403cokde6btlrl

- A few more corner case tests for the interaction of multiple backends trying
  to do IO on overlapping buffers would be good.

- Our temp table test coverage is atrociously bad


Questions:

- The test module requires StartBufferIO() to be visible outside of bufmgr.c -
  I think that's ok, would be good to know if others agree.


I'm planning to push the first two commits soon, I think they're ok on their
own, even if nothing else were to go in.


Greetings,

Andres Freund

Attachment

Re: AIO v2.4

From
Andres Freund
Date:
Hi,

On 2025-02-19 14:10:44 -0500, Andres Freund wrote:
> I'm planning to push the first two commits soon, I think they're ok on their
> own, even if nothing else were to go in.

I did that for the lwlock patch.

But I think I might not do the same for the "Ensure a resowner exists for all
paths that may perform AIO" patch. The paths for which we are missing
resowners are concerned WAL writes - but it'll be a while before we get
AIO WAL writes.

It'd be fairly harmless to do this change before, but I found the justifying
code comments hard to rephrase. E.g.:

--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -361,8 +361,15 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
     BaseInit();
 
     bootstrap_signals();
+
+    /* need a resowner for IO during BootStrapXLOG() */
+    CreateAuxProcessResourceOwner();
+
     BootStrapXLOG(bootstrap_data_checksum_version);
 
+    ReleaseAuxProcessResources(true);
+    CurrentResourceOwner = NULL;
+
     /*
      * To ensure that src/common/link-canary.c is linked into the backend, we
      * must call it from somewhere.  Here is as good as anywhere.

Given that there's no use of resowners inside BootStrapXLOG() today and not
for the next months it seems confusing?


Greetings,

Andres Freund