Re: AIO v2.5 - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: AIO v2.5 |
Date | |
Msg-id | cdxulvhcemfcvluc6cttva5f5gthya6jcxrj64yhnmy3rmufe2@7kbk657ixbuu Whole thread Raw |
In response to | Re: AIO v2.5 (Noah Misch <noah@leadboat.com>) |
Responses |
Re: AIO v2.5
|
List | pgsql-hackers |
Hi, On 2025-03-22 19:09:55 -0700, Noah Misch wrote: > On Thu, Mar 20, 2025 at 09:58:37PM -0400, Andres Freund wrote: > > Attached v2.11 > > > Subject: [PATCH v2.11 05/27] aio: Add io_method=io_uring > > Apart from some isolated cosmetic points, this is ready to commit: > > > + ereport(ERROR, > > + errcode(err), > > + errmsg("io_uring_queue_init failed: %m"), > > + hint != NULL ? errhint("%s", hint) : 0); > > https://www.postgresql.org/docs/current/error-style-guide.html gives the example: > > BAD: open() failed: %m > BETTER: could not open file %s: %m > > Hence, this errmsg should change, perhaps to: > "could not setup io_uring queues: %m". You're right. I didn't intentionally "violate" the policy, but I do have to admit, I'm not a huge fan of that aspect, it just obfuscates what actually failed, forcing one to look at the code or strace to figure out what precisely failed. (Changed) > > + pgaio_debug_io(DEBUG3, ioh, > > + "wait_one io_gen: %llu, ref_gen: %llu, cycle %d", > > + (long long unsigned) ref_generation, > > + (long long unsigned) ioh->generation, > > In the message string, io_gen appears before ref_gen. In the subsequent args, > the order is swapped relative to the message string. Oops, you're right. > > --- a/src/backend/utils/activity/wait_event_names.txt > > +++ b/src/backend/utils/activity/wait_event_names.txt > > @@ -192,6 +192,8 @@ ABI_compatibility: > > > > Section: ClassName - WaitEventIO > > > > +AIO_IO_URING_SUBMIT "Waiting for IO submission via io_uring." > > +AIO_IO_URING_COMPLETION "Waiting for IO completion via io_uring." > > AIO_IO_COMPLETION "Waiting for IO completion." > > I'm wondering if there's an opportunity to enrich the last two wait event > names and/or descriptions. The current descriptions suggest to me more > similarity than is actually there. Inputs to the decision: > > - AIO_IO_COMPLETION waits for an IO in PGAIO_HS_DEFINED, PGAIO_HS_STAGED, or > PGAIO_HS_COMPLETED_IO to reach PGAIO_HS_COMPLETED_SHARED. The three > starting states are the states where some other backend owns the next > action, so the current backend can only wait to be signaled. > > - AIO_IO_URING_COMPLETION waits for the kernel to do enough so we can move > from PGAIO_HS_SUBMITTED to PGAIO_HS_COMPLETED_IO. > > Possible names and descriptions, based on PgAioHandleState enum names and > comments: > > AIO_IO_URING_COMPLETED_IO "Waiting for IO result via io_uring." > AIO_COMPLETED_SHARED "Waiting for IO shared completion callback." > > If "shared completion callback" is too internals-focused, perhaps this: > > AIO_IO_URING_COMPLETED_IO "Waiting for IO result via io_uring." > AIO_COMPLETED_SHARED "Waiting for IO completion to update shared memory." Hm, right now AIO_IO_COMPLETION also covers the actual "raw" execution of the IO with io_method=worker/sync. For that AIO_COMPLETED_SHARED would be inappropriate. We could use a different wait event if wait for an IO via CV in PGAIO_HS_SUBMITTED, with a small refactoring of pgaio_io_wait(). But I'm not sure that would get you that far - we don't broadcast the CV when transitioning from PGAIO_HS_SUBMITTED -> PGAIO_HS_COMPLETED_IO, so the wait event would stay the same, now wrong, wait event until the shared callback completes. Obviously waking everyone up just so they can use a differen wait event doesn't make sense. A more minimal change would be to narrow AIO_IO_URING_COMPLETION to "execution" or something like that, to hint at a separation between the raw IO being completed and the IO, including the callbacks completing. > > --- a/doc/src/sgml/config.sgml > > +++ b/doc/src/sgml/config.sgml > > @@ -2710,6 +2710,12 @@ include_dir 'conf.d' > > <literal>worker</literal> (execute asynchronous I/O using worker processes) > > </para> > > </listitem> > > + <listitem> > > + <para> > > + <literal>io_uring</literal> (execute asynchronous I/O using > > + io_uring, if available) > > I feel the "if available" doesn't quite fit, since we'll fail if unavailable. > Maybe just "(execute asynchronous I/O using Linux io_uring)" with "Linux" > there to reduce surprise on other platforms. You're right, the if available can be misunderstood. But not mentioning that it's an optional dependency seems odd too. What about something like <para> <literal>io_uring</literal> (execute asynchronous I/O using io_uring, requires postgres to have been built with <link linkend="configure-option-with-liburing"><option>--with-liburing</option></link> / <link linkend="configure-with-liburing-meson"><option>-Dliburing</option></link>) </para> Should the docs for --with-liburing/-Dliburing mention it's linux only? We don't seem to do that for things like systemd (linux), selinux (linux) and only kinda for bonjour (macos). Greetings, Andres Freund
pgsql-hackers by date: