Yes, I ran the query after a couple of minutes. Those are the
steady-state numbers.
Also 'top' shows:
top - 22:44:26 up 12 days, 23:14, 5 users, load average: 20.99, 21.35, 19.27
Tasks: 859 total, 26 running, 539 sleeping, 0 stopped, 0 zombie
%Cpu(s): 34.3 us, 1.6 sy, 0.0 ni, 64.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 24742353+total, 33723356 free, 73160656 used, 14053952+buff/cache
KiB Swap: 0 total, 0 free, 0 used. 17132937+avail Mem

  PID USER     PR NI    VIRT   RES   SHR S %CPU %MEM    TIME+ COMMAND
30070 postgres 20  0 41.232g 28608 18192 R 53.8  0.0 17:35.74 postgres
30087 postgres 20  0 41.233g 28408 18180 S 53.8  0.0 17:35.69 postgres
30055 postgres 20  0 41.233g 28492 18120 R 53.5  0.0 17:41.51 postgres
Note that the postgres processes are each running at only about 53% CPU
while the system is 64% idle. The 1.6% system time seems indicative of the
spinlocks blocking the major processing.
Do you know what resource the LockManager might be blocking on/protecting
with these LWLocks?
Also, I didn't understand your comment about a 'futex profile'. Could you
point me in the right direction here?
Thanks.
---Paul
Paul Friedman
CTO
677 Harrison St | San Francisco, CA 94107
M: (650) 270-7676
E-mail: paul.friedman@streetlightdata.com
-----Original Message-----
From: Andres Freund <andres@anarazel.de>
Sent: Monday, April 12, 2021 3:22 PM
To: Paul Friedman <paul.friedman@streetlightdata.com>
Cc: pgsql-performance@lists.postgresql.org
Subject: Re: LWLocks by LockManager slowing large DB
Hi,
On 2021-04-12 15:15:05 -0700, Paul Friedman wrote:
> Thanks again for any advice you have.
I think we'd need the perf profiles to be able to dig into this further.
It's odd that there is a meaningful amount of LockManager contention in
your case - assuming the stats you collected weren't just the first few
milliseconds of starting those 60 queries, there shouldn't be any
additional "heavyweight locks" taken given the duration of your queries.
The futex profile hopefully will tell us where that is coming from...
Greetings,
Andres Freund
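
To make the "steady state vs. first few milliseconds" point above concrete,
here is a minimal sketch of how repeated wait-event samples could be tallied.
It is an illustration only, not the method used in this thread. The query
string reads the real pg_stat_activity columns wait_event_type and
wait_event, but it is not executed here; the helper names
(tally_wait_events, lock_manager_share) and the sample rows are hypothetical.
The wait event name 'LockManager' is assumed to be the PostgreSQL 13+
spelling (older releases report 'lock_manager').

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Query one might run every second or so while the workload is in steady
# state. Shown for context only; this sketch never connects to a database.
SAMPLE_SQL = """
SELECT wait_event_type, wait_event
FROM pg_stat_activity
WHERE backend_type = 'client backend' AND state = 'active'
"""

# One row per active backend: (wait_event_type, wait_event).
# Both are None (SQL NULL) when the backend is not waiting on anything.
Row = Tuple[Optional[str], Optional[str]]


def tally_wait_events(samples: Iterable[Iterable[Row]]) -> Counter:
    """Count (wait_event_type, wait_event) pairs across all samples."""
    counts: Counter = Counter()
    for sample in samples:
        for event_type, event in sample:
            if event_type is None:
                # Not waiting: the backend is most likely running on a CPU.
                counts[("none", "not waiting")] += 1
            else:
                counts[(event_type, event)] += 1
    return counts


def lock_manager_share(counts: Counter) -> float:
    """Fraction of sampled backends that were waiting on LWLock/LockManager."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return counts[("LWLock", "LockManager")] / total


if __name__ == "__main__":
    # Hypothetical samples, two polls of two backends each.
    samples = [
        [("LWLock", "LockManager"), (None, None)],
        [("LWLock", "LockManager"), ("IO", "DataFileRead")],
    ]
    counts = tally_wait_events(samples)
    for key, n in counts.most_common():
        print(key, n)
    print("LockManager share:", lock_manager_share(counts))
```

Taking many samples over a minute or more, rather than one snapshot right
after the queries start, is what would separate startup lock acquisition
from sustained contention.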