Re: AIO v2.5 - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: AIO v2.5 |
Date | |
Msg-id | djy6kd673fj4ked5jb2itksixceoog2evrpgk5xglaflkglmaw@pmk3oyeui2cj |
In response to | Re: AIO v2.5 (Noah Misch <noah@leadboat.com>) |
Responses | Re: AIO v2.5 |
List | pgsql-hackers |
Hi,

On 2025-03-23 08:55:29 -0700, Noah Misch wrote:
> On Sun, Mar 23, 2025 at 11:11:53AM -0400, Andres Freund wrote:
> Unrelated to the above, another question about io_uring:
>
> commit da722699 wrote:
> > +/*
> > + * Need to submit staged but not yet submitted IOs using the fd, otherwise
> > + * the IO would end up targeting something bogus.
> > + */
> > +void
> > +pgaio_closing_fd(int fd)
>
> An IO in PGAIO_HS_STAGED clearly blocks closing the IO's FD, and an IO in
> PGAIO_HS_COMPLETED_IO clearly doesn't block that close. For io_method=worker,
> closing in PGAIO_HS_SUBMITTED is okay. For io_method=io_uring, is there a
> reference about it being okay to close during PGAIO_HS_SUBMITTED? I looked
> awhile for an authoritative view on that, but I didn't find one. If we can
> rely on io_uring_submit() returning only after the kernel has given the
> io_uring its own reference to all applicable file descriptors, I expect it's
> okay to close the process's FD. If the io_uring acquires its reference later
> than that, I expect we shouldn't close before that later time.

I'm fairly sure io_uring has its own reference for the file descriptor by the
time io_uring_enter() returns [1]. What io_uring does *not* reliably tolerate
is the issuing process *exiting* before the IO completes, even if there are
other processes attached to the same io_uring instance.

AIO v1 had a posix_aio backend, which, on several platforms, did *not*
tolerate the FD being closed before the IO completes. Because of that
IoMethodOps had a closing_fd callback, which posix_aio used to wait for the
IO's completion [2].

I've added a test case exercising this path for all io methods. But I can't
think of a way that would catch io_uring not actually holding a reference to
the fd with a high likelihood - the IO will almost always complete quickly
enough to not be able to catch that. But it still seems better than not at all
testing the path - it does catch at least the problem of pgaio_closing_fd()
not doing anything.

Greetings,

Andres Freund

[1] See
https://github.com/torvalds/linux/blob/586de92313fcab8ed84ac5f78f4d2aae2db92c59/io_uring/io_uring.c#L1728
called from
https://github.com/torvalds/linux/blob/586de92313fcab8ed84ac5f78f4d2aae2db92c59/io_uring/io_uring.c#L2204
called from
https://github.com/torvalds/linux/blob/586de92313fcab8ed84ac5f78f4d2aae2db92c59/io_uring/io_uring.c#L3372
in the io_uring_enter() syscall

[2] https://github.com/anarazel/postgres/blob/a08cd717b5af4e51afb25ec86623973158a72ab9/src/backend/storage/aio/aio_posix.c#L738
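
To make the rule from the quoted comment concrete, here is a minimal toy model in C of "submit staged IOs that use the fd before the fd is closed". It is only a sketch: the Toy* types and toy_* functions are hypothetical names invented for illustration, "submission" is reduced to a state change, and nothing here is PostgreSQL's actual pgaio_closing_fd() or the io_uring API. It also assumes that all staged IOs are submitted together once any of them references the fd being closed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified handle states. The names echo the PGAIO_HS_*
 * states discussed in the thread, but this is not PostgreSQL's enum. */
typedef enum
{
	TOY_HS_IDLE,
	TOY_HS_STAGED,			/* prepared, not yet handed to the IO method */
	TOY_HS_SUBMITTED,		/* handed to the IO method (worker, io_uring, ...) */
	TOY_HS_COMPLETED_IO		/* the IO itself has finished */
} ToyHandleState;

typedef struct
{
	int			fd;
	ToyHandleState state;
} ToyIoHandle;

#define TOY_MAX_IOS 8

typedef struct
{
	ToyIoHandle ios[TOY_MAX_IOS];
	size_t		nios;
} ToyIoQueue;

/* Stage a new IO against fd; it is not submitted yet. */
ToyIoHandle *
toy_stage_io(ToyIoQueue *q, int fd)
{
	ToyIoHandle *ioh;

	assert(q->nios < TOY_MAX_IOS);
	ioh = &q->ios[q->nios++];
	ioh->fd = fd;
	ioh->state = TOY_HS_STAGED;
	return ioh;
}

/* "Submit" every staged IO; returns how many were submitted. */
int
toy_submit_staged(ToyIoQueue *q)
{
	int			nsubmitted = 0;

	for (size_t i = 0; i < q->nios; i++)
	{
		if (q->ios[i].state == TOY_HS_STAGED)
		{
			q->ios[i].state = TOY_HS_SUBMITTED;
			nsubmitted++;
		}
	}
	return nsubmitted;
}

/*
 * To be called before close(fd). If any staged IO still references fd,
 * submit the staged IOs first, so that no IO is later submitted against a
 * closed (or reused) descriptor number. Returns the number of IOs submitted.
 */
int
toy_closing_fd(ToyIoQueue *q, int fd)
{
	for (size_t i = 0; i < q->nios; i++)
	{
		if (q->ios[i].state == TOY_HS_STAGED && q->ios[i].fd == fd)
			return toy_submit_staged(q);
	}
	return 0;
}
```

A caller would invoke toy_closing_fd(&queue, fd) immediately before close(fd). Whether close() is then safe while an IO is still in the submitted state depends on the IO method, which is the question discussed above for io_uring.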