Re: AIO v2.0 - Mailing list pgsql-hackers

From Jakub Wartak
Subject Re: AIO v2.0
Msg-id CAKZiRmzE14tg44k8J-Yc051Mmdd-ZjBn8dFh=RYiTseXzjSRHA@mail.gmail.com
In response to Re: AIO v2.0  (Andres Freund <andres@anarazel.de>)
Responses Re: AIO v2.0
List pgsql-hackers
On Mon, Jan 6, 2025 at 5:28 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2024-12-19 17:29:12 -0500, Andres Freund wrote:
> > > Not about patch itself, but questions about related stack functionality:
> > > ----------------------------------------------------------------------------------------------------
> > >
> > >
> > > 7. Is pg_stat_aios still on the table or not ? (AIO 2021 had it). Any hints
> > > on how to inspect real I/O calls requested to review if the code is issuing
> > > sensible calls: there's no strace for uring, or do you stick to DEBUG3 or
> > > perhaps using some bpftrace / xfsslower is the best way to go ?
> >
> > I think we still want something like it, but I don't think it needs to be in
> > the initial commits.
>
> After I got this question from Thomas as well, I started hacking one up.
>
> What information would you like to see?
>
> Here's what I currently have:
..
> ├─[ RECORD 2 ]───┼────────────────────────────────────────────────┤
> │ pid            │ 358212                                         │
> │ io_id          │ 2051                                           │
> │ io_generation  │ 4199                                           │
> │ state          │ IN_FLIGHT                                      │
> │ operation      │ read                                           │
> │ offset         │ 511967232                                      │
> │ length         │ 262144                                         │
> │ subject        │ smgr                                           │
> │ iovec_data_len │ 32                                             │
> │ raw_result     │ (null)                                         │
> │ result         │ UNKNOWN                                        │
> │ error_desc     │ (null)                                         │
> │ subject_desc   │ blocks 1373216..1373247 in file "base/5/16388" │
> │ flag_sync      │ f                                              │
> │ flag_localmem  │ f                                              │
> │ flag_buffered  │ t                                              │

Cool! That's more than enough for me going forward, thanks!

> I didn't think that pg_stat_* was quite the right namespace, given that it
> shows not stats, but the currently ongoing IOs.  I am going with pg_aios for
> now, but I don't particularly like that.

If you are looking for other proposals:
* pg_aios_progress? (to follow the pattern of pg_stat_progress_copy / pg_stat_progress_vacuum?)
* pg_debug_aios?
* pg_debug_io?

> I think we'll want a pg_stat_aio as well, tracking things like:
>
> - how often the queue to IO workers was full
> - how many times we submitted IO to the kernel (<= #ios with io_uring)
> - how many times we asked the kernel for events (<= #ios with io_uring)
> - how many times we had to wait for in-flight IOs before issuing more IOs

If I could dream of one thing, it would be the 99.9th percentile of I/O
response times in milliseconds for different classes of I/O traffic
(read/write/flush). But it sounds like that would be very similar to
pg_stat_io and would potentially have to be broken down per tablespace
and per I/O traffic (subject) type too. AFAIU pg_stat_io does not have
the right structure to hold that.
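
To make the percentile idea a bit more concrete, here is a minimal sketch
of how tail latencies could be tracked cheaply per I/O class with
power-of-two buckets. Everything in it (IoLatHist, io_lat_record,
io_lat_percentile_usec, the bucket count) is a hypothetical name made up
for illustration; none of it comes from the AIO patch set or pg_stat_io,
and a real version would also need shared memory and per-tablespace keys.

```c
#include <assert.h>
#include <stdint.h>

#define IO_LAT_NBUCKETS 32      /* assumption: log2 buckets, in microseconds */

typedef enum { IO_LAT_READ, IO_LAT_WRITE, IO_LAT_FLUSH, IO_LAT_NOPS } IoLatOp;

typedef struct
{
    uint64_t count[IO_LAT_NOPS][IO_LAT_NBUCKETS]; /* samples per bucket */
    uint64_t total[IO_LAT_NOPS];                  /* samples per I/O class */
} IoLatHist;

/*
 * Bucket b holds latencies in [2^b, 2^(b+1)) usec; bucket 0 also holds 0,
 * and the last bucket is open-ended.
 */
static int
io_lat_bucket(uint64_t usec)
{
    int b = 0;

    while (usec > 1 && b < IO_LAT_NBUCKETS - 1)
    {
        usec >>= 1;
        b++;
    }
    return b;
}

/* Record one completed I/O of the given class. */
static void
io_lat_record(IoLatHist *h, IoLatOp op, uint64_t usec)
{
    assert((int) op >= 0 && op < IO_LAT_NOPS);
    h->count[op][io_lat_bucket(usec)]++;
    h->total[op]++;
}

/*
 * Return an upper bound (in usec) for the pct-th percentile of one class,
 * i.e. the top of the first bucket whose cumulative count reaches the
 * requested rank. Returns 0 if nothing was recorded.
 */
static uint64_t
io_lat_percentile_usec(const IoLatHist *h, IoLatOp op, double pct)
{
    double   want;
    uint64_t rank;
    uint64_t seen = 0;

    assert(pct > 0.0 && pct <= 100.0);
    if (h->total[op] == 0)
        return 0;

    /* rank = ceil(pct / 100 * total) */
    want = pct / 100.0 * (double) h->total[op];
    rank = (uint64_t) want;
    if ((double) rank < want)
        rank++;

    for (int b = 0; b < IO_LAT_NBUCKETS - 1; b++)
    {
        seen += h->count[op][b];
        if (seen >= rank)
            return (UINT64_C(1) << (b + 1)) - 1;
    }
    return UINT64_MAX;          /* landed in the open-ended last bucket */
}
```

The result is only accurate to within a factor of two, which is the price
for a fixed-size structure that costs one increment per I/O; dividing the
returned value by 1000 gives milliseconds.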

BTW: before I even try to compile AIO v2.2* and respond to the
previous review, what would you be most interested in hearing about,
so that it adds some value? Any workload-specific measurements? Just
general feedback or functionality gaps? Integrity/data testing with
stuff like dm-dust, dm-flakey, dm-delay to exercise the error handling
routines? Some kind of AIO <-> standby/recovery interactions?

* BTW, "Date: 2025-01-01 04:03:33": I saw what you did there! So
let's officially recognize 2025 as the year of AIO in PG, as it
was the first message :D

-J.


