RE: LWLocks by LockManager slowing large DB - Mailing list pgsql-performance

From Paul Friedman
Subject RE: LWLocks by LockManager slowing large DB
Date
Msg-id fe22f61a175c68f2a52ae0f63eab57d5@mail.gmail.com
Whole thread Raw
In response to Re: LWLocks by LockManager slowing large DB  (Andres Freund <andres@anarazel.de>)
Responses Re: LWLocks by LockManager slowing large DB
List pgsql-performance

Thanks for the quick reply!

 

These queries take ~1hr and are the only thing running on the system (all 60 are launched at the same time and the tables/files are fully-primed into memory so iowaits are basically zero).

 

Yes, that’s the same query I’ve been running to analyze the locks and this is the problem:

 

SELECT state, backend_type, wait_event_type, wait_event, count(*) FROM pg_stat_activity GROUP BY 1, 2, 3, 4 ORDER BY count(*) DESC;

 

State      backend_type         wait_event_type wait_event      count

active     client backend       LWLock          LockManager     28

active     client backend                                       21

          autovacuum launcher   Activity        AutoVacuumMain  1

          logical replication launcher     Activity     LogicalLauncherMain   1

          checkpointer         Activity        CheckpointerMain 1

idle      client backend       Client          ClientRead 1

          background writer     Activity        BgWriterMain     1

          walwriter             Activity        WalWriterMain     1

 

Thanks again for any advice you have.

 

---Paul

 

Paul Friedman

CTO

 

 

677 Harrison St  |  San Francisco, CA 94107

M: (650) 270-7676

E-mail: paul.friedman@streetlightdata.com

 

-----Original Message-----
From: Andres Freund <andres@anarazel.de>
Sent: Monday, April 12, 2021 2:58 PM
To: Paul Friedman <paul.friedman@streetlightdata.com>
Cc: pgsql-performance@lists.postgresql.org
Subject: Re: LWLocks by LockManager slowing large DB

 

Hi,

 

On 2021-04-12 12:37:42 -0700, Paul Friedman wrote:

> Boiling the complex queries down to their simplest form, we test

> running 60 of this query simultaneously:

 

How long does one execution of these queries take (on average)? The likely bottlenecks are very different between running 60 concurrent queries that each complete in 0.1ms and ones that take > 1s.

 

 

Could you show the results for a query like SELECT state, backend_type, wait_event_type, wait_event, count(*) FROM pg_stat_activity GROUP BY 1, 2, 3, 4 ORDER BY count(*) DESC; ?

 

Without knowing the proportion of LockManager wait events compared to the rest it's hard to know what to make of it.

 

 

> Does anyone have any advice on how to alleviate LockManager’s LWlock issue?

 

It'd be useful to get a perf profile for this. Both for cpu time and for what ends up blocking on kernel-level locks. E.g. something like

 

# cpu time

perf record --call-graph dwarf -F 500 -a sleep 5 perf report --no-children --sort comm,symbol

 

To send it to the list you can use something like perf report --no-children --sort comm,symbol|head -n 500 > somefile

 

# kernel level blocking on locks

perf record --call-graph dwarf -e syscalls:sys_enter_futex -a sleep 3 perf report --no-children --sort comm,symbol

 

Greetings,

 

Andres Freund

pgsql-performance by date:

Previous
From: Andres Freund
Date:
Subject: Re: LWLocks by LockManager slowing large DB
Next
From: Andres Freund
Date:
Subject: Re: LWLocks by LockManager slowing large DB