Re: postmaster uses more CPU in 18 beta1 with io_method=io_uring - Mailing list pgsql-hackers

From Robert Treat
Subject Re: postmaster uses more CPU in 18 beta1 with io_method=io_uring
Msg-id CABV9wwOMYXWrCUR4Ga9-X4pJbfwUNfW7iXqWMjZ8rfYGMpabCg@mail.gmail.com
In response to Re: postmaster uses more CPU in 18 beta1 with io_method=io_uring  (Jakub Wartak <jakub.wartak@enterprisedb.com>)
List pgsql-hackers
On Tue, Aug 26, 2025 at 9:32 AM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:
> On Tue, Jul 8, 2025 at 5:22 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2025-06-30 12:27:10 -0400, Andres Freund wrote:
> > > On 2025-06-05 14:32:10 -0400, Andres Freund wrote:
> > > > On 2025-06-05 12:47:52 -0400, Tom Lane wrote:
> > > > > Andres Freund <andres@anarazel.de> writes:
> > > > > > I think this is a big enough pitfall that it's, obviously assuming the patch
> > > > > > has a sensible complexity, worth fixing this in 18. RMT, anyone, what do you
> > > > > > think?
> > > > >
> > > > > Let's see the patch ... but yeah, I'd rather not ship 18 like this.
> > > >
> > > > I've attached a first draft.
> > > >
> > > > I can't make heads or tails of the ordering in configure.ac, so the function
> > > > test is probably in the wrong place.
> > >
> > > Any comments on that patch?  I'd hoped for some review comments... Unless I
> > > hear otherwise, I'll just do a bit more polish and push.
> >
> > After addressing most of Greg's and Jim's feedback, I pushed this. I chose not
> > to increase the log level as Jim suggested, but if we end up deciding that
> > that's the way to go, we can easily change that...
> >
>
> Hi Andres,
>
> I'm with Jim, as I've just hit it, not on exit() but on fork(), so:
>
> 1. Could we s/DEBUG1/INFO/ that debug message level? (for those two:
> "cannot use combined memory mapping for io_uring", and maybe add
> "potential slow new connections" there too along the way?)
> 2. Maybe we could add some wording to the docs about io_method saying
> that it might cause such trouble?
>
> Just wasted an hour wondering why $stuff is slow, given:
>     max_connections = '20000' # yes, yay..
>     io_method = 'io_uring'
>
> I was getting slow fork()/clone() performance when there were
> lots of io_uring fds/instances in the main postmaster:
>     $ /usr/pgsql19/bin/pgbench -f select1.sql -c 1000 -j 1 -t 1 -P 1
>     [..]
>     progress: 39.7 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed
>     progress: 40.6 s, 1039.9 tps, lat 407.696 ms stddev 291.856, 0 failed
>     [..]
>     initial connection time = 39632.164 ms
>     tps = 1015.608893 (without initial connection time)
>
> So yes, ~40s just to connect to the database, and I was using an old
> branch from before June (it did not have f54af9f2679d5987b46),
> so more or less simulating <= 6.5 as you say. I was limited to 20-30
> fork()s/sec according to bpftrace. That goes away with the default
> io_method (~800 fork()s/sec). With max_connections = 2k, I got 5s
> initial connection times. It looked like it was caused by io_uring, as
> with io_uring fork() was slow somewhere in vma_interval_tree_insert_after
> <- copy_process <- kernel_clone <- __do_sys_clone <- do_syscall_64
> (?). I've tested it on 6.14.17 and also on LTS 6.1.x (the
> difference is that it takes 65s instead of 40s...). Then I searched
> and hit this thread, but 6.1 is the LTS kernel, so plenty of people
> are going to hit those regressions with io_uring io_method, won't
> they?
>
> I can try to prepare a patch, please just let me know.
>

Did anything ever happen with this? I do think it would be helpful to
make some of these potholes more visible and discoverable to users. I
suspect we're going to see people running pre-built packages with
io_uring support on older kernels that they are still hanging on to
because pg_upgrade was the easiest path. If they knew about the issue,
they could either update the kernel or upgrade via logical replication
to get the new functionality.
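
As a rough illustration of the kind of check that could make this
discoverable, here is a minimal sketch in Python that compares the
running kernel release against the 6.5 cutoff mentioned upthread. The
function names are hypothetical, the cutoff is an assumption taken from
this thread, and nothing like this ships with PostgreSQL. It also says
nothing about whether the installed liburing supports the combined
mapping, which would need to be checked separately.

```python
import platform
import re

# Assumption from this thread: kernels older than 6.5 cannot use the
# combined io_uring memory mapping, so fork() in the postmaster gets slow.
MIN_KERNEL = (6, 5)


def parse_kernel_release(release):
    """Return (major, minor) from a release string such as '6.1.0-18-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_may_have_slow_fork(release):
    """True if the kernel predates MIN_KERNEL (hypothetical helper)."""
    return parse_kernel_release(release) < MIN_KERNEL


if __name__ == "__main__":
    # platform.release() reports the running kernel on Linux.
    release = platform.release()
    if kernel_may_have_slow_fork(release):
        print(f"kernel {release}: io_method=io_uring may slow down new connections")
    else:
        print(f"kernel {release}: at or above the assumed 6.5 cutoff")
```

Something along these lines could sit in a packaging or pre-upgrade
check, so that the problem shows up before max_connections is raised.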

Robert Treat
https://xzilla.net


