Re: [HACKERS] kqueue - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: [HACKERS] kqueue
Msg-id CA+hUKGLDfs-tcEYdOG6+7cFkGnuWNmVTJxri0MB=CE93aQNP_Q@mail.gmail.com
In response to Re: [HACKERS] kqueue  (Rui DeSousa <rui@crazybean.net>)
Responses Re: [HACKERS] kqueue  (Thomas Munro <thomas.munro@gmail.com>)
Re: [HACKERS] kqueue  (Mark Wong <mark@2ndQuadrant.com>)
List pgsql-hackers
On Thu, Jan 23, 2020 at 9:38 AM Rui DeSousa <rui@crazybean.net> wrote:
> On Jan 22, 2020, at 2:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It's certainly possible that to see any benefit you need stress
>> levels above what I can manage on the small box I've got these
>> OSes on.  Still, it'd be nice if a performance patch could show
>> some improved performance, before we take any portability risks
>> for it.

You might need more than one CPU socket, or at least lots more cores
so that you can create enough contention.  That was needed to see the
regression caused by commit ac1d794 on Linux[1].

> Here are two charts comparing a patched and an unpatched system.
> These systems are very large and have just shy of a thousand
> connections each, with averages of 20 to 30 active queries running
> concurrently, at times including hundreds if not thousands of queries
> hitting the database in rapid succession.  The effect is that the
> unpatched system generates a lot of system load just handling idle
> connections, whereas the patched version is not impacted by idle
> sessions or sessions that have already received data.

Thanks.  I can reproduce something like this on an Azure 72-vCPU
system, using pgbench -S -c800 -j32.  The point of those settings is
to have many backends that all alternate between work and sleep.  That
creates a stream of poll() syscalls, and system time goes through the
roof (all CPUs are pegged, and roughly half of that is system time).
Profiling the kernel with dtrace, I see that the most common stack (by
a long way) is in a poll-related lock, similar to a profile Rui sent me
off-list from his production system.  Patched, there is very little
system time and the TPS number goes from 539k to 781k (about 45%
higher).
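
To make the difference in wait style concrete, here is a minimal sketch
in Python.  It is not PostgreSQL code and not the patch itself; the
names wait_with_poll and RegisteredWaiter are made up for this
illustration.  The first function hands the full descriptor set to the
kernel on every wait, as poll() requires.  The class registers the
descriptor once and then reuses the kernel-side interest set, which is
the model kqueue (and epoll) provide.  Python's selectors.DefaultSelector
typically picks kqueue on BSD/macOS and epoll on Linux, so the sketch
should run on either.

```python
import select
import selectors
import socket


def wait_with_poll(sock, timeout_ms):
    """Hand the full interest set to the kernel on every wait."""
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    # poll(2) receives the whole descriptor list on each call.
    return [fd for fd, _events in poller.poll(timeout_ms)]


class RegisteredWaiter:
    """Register once; later waits reuse the kernel-side interest set."""

    def __init__(self, sock):
        # Typically kqueue on BSD/macOS and epoll on Linux.
        self.selector = selectors.DefaultSelector()
        self.selector.register(sock, selectors.EVENT_READ)

    def wait(self, timeout_s):
        # No descriptor list is passed here; the kernel already has it.
        return [key.fd for key, _events in self.selector.select(timeout_s)]

    def close(self):
        self.selector.close()


if __name__ == "__main__":
    reader, writer = socket.socketpair()
    waiter = RegisteredWaiter(reader)
    writer.send(b"x")
    # Both styles report the same ready descriptor.
    print(wait_with_poll(reader, 1000), waiter.wait(1.0))
    waiter.close()
    reader.close()
    writer.close()
```

Both styles report the same readiness; what differs is the work done
per wait, which matters most when hundreds of mostly idle backends
sleep and wake continuously.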

[1]
https://www.postgresql.org/message-id/flat/CAB-SwXZh44_2ybvS5Z67p_CDz%3DXFn4hNAD%3DCnMEF%2BQqkXwFrGg%40mail.gmail.com


