Re: AIO v2.0 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: AIO v2.0
Msg-id 6vjl6jeaqvyhfbpgwziypwmhem2rwla4o5pgpuxwtg3o3o3jb5@evyzorb5meth
In response to Re: AIO v2.0  (Andres Freund <andres@anarazel.de>)
Responses Re: AIO v2.0
List pgsql-hackers
Hi,

On 2024-12-19 17:29:12 -0500, Andres Freund wrote:
> > Not about patch itself, but questions about related stack functionality:
> > ----------------------------------------------------------------------------------------------------
> >
> >
> > 7. Is pg_stat_aios still on the table or not ? (AIO 2021 had it). Any hints
> > on how to inspect real I/O calls requested to review if the code is issuing
> > sensible calls: there's no strace for uring, or do you stick to DEBUG3 or
> > perhaps using some bpftrace / xfsslower is the best way to go ?
> 
> I think we still want something like it, but I don't think it needs to be in
> the initial commits.

After I got this question from Thomas as well, I started hacking one up.

What information would you like to see?

Here's what I currently have:
┌─[ RECORD 1 ]───┬────────────────────────────────────────────────┐
│ pid            │ 358212                                         │
│ io_id          │ 2050                                           │
│ io_generation  │ 4209                                           │
│ state          │ COMPLETED_SHARED                               │
│ operation      │ read                                           │
│ offset         │ 509083648                                      │
│ length         │ 262144                                         │
│ subject        │ smgr                                           │
│ iovec_data_len │ 32                                             │
│ raw_result     │ 262144                                         │
│ result         │ OK                                             │
│ error_desc     │ (null)                                         │
│ subject_desc   │ blocks 1372864..1372895 in file "base/5/16388" │
│ flag_sync      │ f                                              │
│ flag_localmem  │ f                                              │
│ flag_buffered  │ t                                              │
├─[ RECORD 2 ]───┼────────────────────────────────────────────────┤
│ pid            │ 358212                                         │
│ io_id          │ 2051                                           │
│ io_generation  │ 4199                                           │
│ state          │ IN_FLIGHT                                      │
│ operation      │ read                                           │
│ offset         │ 511967232                                      │
│ length         │ 262144                                         │
│ subject        │ smgr                                           │
│ iovec_data_len │ 32                                             │
│ raw_result     │ (null)                                         │
│ result         │ UNKNOWN                                        │
│ error_desc     │ (null)                                         │
│ subject_desc   │ blocks 1373216..1373247 in file "base/5/16388" │
│ flag_sync      │ f                                              │
│ flag_localmem  │ f                                              │
│ flag_buffered  │ t                                              │
└────────────────┴────────────────────────────────────────────────┘
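
As a reading aid for the two records above, here is a minimal sketch of how
offset, length and subject_desc appear to relate. It assumes the default 8 kB
block size and 1 GB segment files (131072 blocks per segment); neither value is
stated in this mail, and the helper name describe_io is hypothetical. Under
those assumptions the block range in subject_desc is relation-wide, while
offset is relative to the segment file, which would place both IOs in
segment 10.

```python
BLCKSZ = 8192          # assumed default PostgreSQL block size, in bytes
RELSEG_SIZE = 131072   # assumed default number of blocks per 1 GB segment file


def describe_io(first_block: int, nblocks: int) -> dict:
    """Map a relation-wide block range to segment number, byte offset and length."""
    segno = first_block // RELSEG_SIZE                 # which segment file
    offset = (first_block % RELSEG_SIZE) * BLCKSZ      # byte offset inside that file
    return {
        "segno": segno,
        "offset": offset,
        "length": nblocks * BLCKSZ,
        "last_block": first_block + nblocks - 1,
    }


# Block ranges taken from subject_desc in RECORD 1 and RECORD 2.
record1 = describe_io(1372864, 32)
record2 = describe_io(1373216, 32)
print(record1)
print(record2)
```

With these assumptions the computed offsets are 509083648 and 511967232 and the
length is 262144, matching the offset and length columns shown. The 32 blocks
per IO also equal iovec_data_len in both records.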


I didn't think that pg_stat_* was quite the right namespace, given that it
shows not stats, but the currently ongoing IOs.  I am going with pg_aios for
now, but I don't particularly like that.


I think we'll want a pg_stat_aio as well, tracking things like:

- how often the queue to IO workers was full
- how many times we submitted IO to the kernel (<= #ios with io_uring)
- how many times we asked the kernel for events (<= #ios with io_uring)
- how many times we had to wait for in-flight IOs before issuing more IOs
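
To illustrate why the submission and event counts can stay below the number of
IOs when submissions are batched, here is a toy model. It is not code from the
patch: the counter names, the batching policy and the "wait for everything"
behaviour are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AioStats:
    """Hypothetical counters mirroring the list above; names are illustrative."""
    ios_started: int = 0
    worker_queue_full: int = 0    # queue to IO workers was full (unused in this model)
    kernel_submissions: int = 0   # submit calls made to the kernel
    kernel_event_polls: int = 0   # times the kernel was asked for events
    waits_for_inflight: int = 0   # waited for in-flight IOs before issuing more


def run_batched(stats: AioStats, n_ios: int, batch_size: int, max_inflight: int) -> None:
    """Toy model: stage IOs, submit each full batch with one call, and drain
    all in-flight IOs with one event poll whenever the limit is reached."""
    inflight = 0
    staged = 0
    for _ in range(n_ios):
        if inflight + staged == max_inflight:
            if staged:
                # Flush whatever is staged before waiting.
                stats.kernel_submissions += 1
                inflight += staged
                staged = 0
            stats.waits_for_inflight += 1
            stats.kernel_event_polls += 1
            inflight = 0  # assume a single poll reaps every completion
        staged += 1
        stats.ios_started += 1
        if staged == batch_size:
            stats.kernel_submissions += 1
            inflight += staged
            staged = 0
    if staged:
        # Submit the final, partially filled batch.
        stats.kernel_submissions += 1


stats = AioStats()
run_batched(stats, n_ios=64, batch_size=8, max_inflight=32)
print(stats)
```

In this run 64 IOs need 8 submissions, 1 event poll and 1 wait for in-flight
IOs. With batch_size=1 the submission count would equal the number of IOs,
which is the upper bound noted in the list.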

Greetings,

Andres Freund


