Thread: Lock contention high

Hi,
Lock contention observed high in PostgreSQL v13.3.
The source code was compiled with GNU C (GCC v11.x).
PostgreSQL version: 13.3
Operating system: RHEL 8.3
Kernel name: 4.18.0-305.10.2.el8_4.x86_64
RAM size: 512GB
SSD: 1TB
The environment used IBM bare metal, and the test benchmark environment was HammerDB v4.2.
Test case: TPC-C

Perf data for 24 VU (TPC-C)
------------------------------
18.99% postgres postgres [.] LWLockAcquire
7.09% postgres postgres [.] _bt_compare
8.66% postgres postgres [.] LWLockRelease
2.28% postgres postgres [.] GetSnapshotData
2.25% postgres postgres [.] hash_search_with_hash_value
2.11% postgres postgres [.] XLogInsertRecord
1.98% postgres postgres [.] PinBuffer

1. Is there a way to tune the lock contention?
2. Are there any recommendations to tune/reduce the lock contention via postgresql.conf?
postgresql.conf used on bare metal
========================
shared_buffers = 128GB (1/4 of RAM size)
effective_cache_size = 392GB (75% of RAM size)
huge_pages = on
temp_buffers = 4000MB
work_mem = 4000MB
maintenance_work_mem = 512MB
autovacuum_work_mem = -1
max_stack_depth = 7MB
dynamic_shared_memory_type = posix
max_files_per_process = 4000
effective_io_concurrency = 32
wal_level = minimal
synchronous_commit = off
wal_buffers = 512MB
checkpoint_timeout = 1h
checkpoint_completion_target = 1
checkpoint_warning = 0
log_min_messages = error
log_min_error_statement = error
log_timezone = 'GB'
autovacuum = off
datestyle = 'iso, dmy'
timezone = 'GB'
lc_messages = 'en_GB.UTF-8'
lc_monetary = 'en_GB.UTF-8'
lc_numeric = 'en_GB.UTF-8'
lc_time = 'en_GB.UTF-8'
default_text_search_config = 'pg_catalog.english'
max_locks_per_transaction = 64
max_pred_locks_per_transaction = 64
How many sockets are on motherboard?
What is CPU model and interconnect type (UPI?)?
Can you share output of "lscpu"?
If you have more than one NUMA node, it may be worth running PostgreSQL on a single NUMA node via taskset. That will eliminate access to remote memory and speed up processing.
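For example, a sketch of that pinning (the PGDATA path below is an assumption; adjust it to your installation):

```shell
# Pin the running postmaster to NUMA node 0's cores (0-63 here).
# Backends forked afterwards inherit this affinity; already-running
# backends keep their old mask. The PGDATA path is an assumption.
PGDATA=/var/lib/pgsql/13/data
taskset -pc 0-63 "$(head -1 "$PGDATA/postmaster.pid")"

# Alternatively, bind both CPU and memory at startup with numactl:
numactl --cpunodebind=0 --membind=0 pg_ctl -D "$PGDATA" start
```

The numactl variant additionally forces local memory allocation, which is what removes the remote-memory accesses.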
Thanks,
Michael.
> 1. Is there a way to tune the lock contention?

Lock contention is usually an application issue. Application processes are stepping on each other's toes. I have never seen a situation where the database itself would be slow with managing locks. Postgres uses an in-memory queue manager which is, generally speaking, very fast. Applications usually do stupid things. I've seen a GUI doing "SELECT FOR UPDATE". And then the operator decided to have lunch. I'll leave the rest to your imagination.
-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
On Tue, Oct 12, 2021 at 01:05:12PM +0530, Ashkil Dighin wrote:
> Hi,
> Lock contention observed high in PostgreSQL v13.3
> ...
> 1. Is there a way to tune the lock contention?
> 2. Are there any recommendations to tune/reduce the lock contention via postgresql.conf?

I think you'd want to find *which* LW locks are being waited on, to see if it's something that can be easily tuned. You can check pg_stat_activity, or maybe create a cronjob to record its content for later analysis.

-- Justin
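A minimal sketch of that kind of sampling (column names per the PostgreSQL 13 catalogs):

```sql
-- Snapshot which lightweight locks backends are currently waiting on.
-- Run this repeatedly (e.g. from a cron job) during the benchmark.
SELECT wait_event_type,
       wait_event,
       count(*) AS backends
FROM   pg_stat_activity
WHERE  wait_event IS NOT NULL
GROUP  BY 1, 2
ORDER  BY backends DESC;
```

Rows with wait_event_type = 'LWLock' tell you which specific LWLock (e.g. buffer_content, WALInsert) is contended.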
> 1. Is there a way to tune the lock contention?
> 2. Are there any recommendations to tune/reduce the lock contention via postgresql.conf?
>
> I think you'd want to find *which* LW locks are being waited on, to see if it's something that can be easily tuned. You can check pg_stat_activity, or maybe create a cronjob to record its content for later analysis.

Hello,

Also turn on log_lock_waits so you can evaluate the actual SQL causing the problems in the PG log files. Thinking ahead, you may want to consider whether using advisory locks from the application side of things might help to manage locks in a more pessimistic way. Also, join with the pg_locks table to find out the specific resources that are in contention.
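A sketch of that pg_locks join (pg_blocking_pids() has been available since 9.6; this covers heavyweight locks, not LWLocks):

```sql
-- In postgresql.conf: log_lock_waits = on
-- (logs any wait longer than deadlock_timeout to the server log)

-- Who is waiting on a lock, what they are running, and who blocks them:
SELECT w.pid,
       w.query,
       pg_blocking_pids(w.pid)  AS blocked_by,
       l.locktype,
       l.relation::regclass     AS relation
FROM   pg_stat_activity w
JOIN   pg_locks l ON l.pid = w.pid AND NOT l.granted;
```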
Regards,
Michael Vitale
On Tue, Oct 12, 2021 at 12:45 AM Ashkil Dighin <ashkildighin76@gmail.com> wrote:
> Lock contention observed high in PostgreSQL v13.3
> The source code compiled with GNU C (GCC v11.x)
> PostgreSQL version: 13.3
> Operating system: RHEL 8.3
> Kernel name: 4.18.0-305.10.2.el8_4.x86_64
> RAM size: 512GB
> SSD: 1TB
> The environment used IBM bare metal and test benchmark environment HammerDB v4.2
> Test case: TPC-C

You didn't say how many TPC-C warehouses you used. In my experience, people sometimes run TPC-C with relatively few, which will tend to result in extreme contention on certain B-Tree leaf pages. (My experiences are with BenchmarkSQL, but I can't imagine HammerDB is too much different.)

Assuming that's the case here, for you, then it's not clear that you have a real problem. You're really not supposed to run the benchmark in that way, per the TPC-C spec, which strictly limits the number of transactions per minute per warehouse -- for better or worse, valid results generally require that you use lots of warehouses to get a very large database (think terabytes). If you run the benchmark with 100 warehouses or less, on a big server, then the contention you'll see will be out of all proportion to what you're ever likely to see in the real world.

--
Peter Geoghegan
Your settings are interesting; I'm curious what the goal is for this particular HammerDB exercise. A few comments inline.

On 10/12/21 00:35, Ashkil Dighin wrote:
> postgresql.conf used on bare metal
> ========================
> maintenance_work_mem = 512MB

Only a half GB of memory for autovacuum? (It will have a mandatory run as soon as you hit 200 million XIDs; seems like you'd want the full max 1GB for it.)

> synchronous_commit = off
> checkpoint_timeout = 1h
> checkpoint_completion_target = 1
> checkpoint_warning = 0

Curious about this; it seems you're just looking to understand how much throughput you can get with a config that would not be used on a real system.

> autovacuum = off

I assume you understand that autovacuum will still run when you hit 200 million XIDs. This setting seems incongruent with the previous settings, because it seemed like you were going for throughput, which generally requires autovacuum to be more aggressive rather than less aggressive. Assuming the benchmark runs for a properly sufficient length of time, this setting will slow things down because of accumulating bloat.

Just a few opinions, I might be wrong, hope the feedback is helpful. :)

-Jeremy

--
http://about.me/jeremy_schneider
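For a throughput-oriented run with autovacuum left on, a hedged postgresql.conf sketch along those lines (the values are illustrative starting points, not recommendations for this particular box):

```
autovacuum = on
autovacuum_max_workers = 8              # default is 3
autovacuum_work_mem = 1GB               # autovacuum's effective cap
autovacuum_vacuum_cost_delay = 2ms      # keep workers moving
autovacuum_vacuum_scale_factor = 0.05   # vacuum tables sooner
autovacuum_naptime = 10s                # check for work more often
```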
On Wed, Oct 13, 2021 at 6:54 PM Jeremy Schneider <schneider@ardentperf.com> wrote:
> only a half GB memory for autovac? (it will have a mandatory run as soon
> as you hit 200 mil XIDs, seems like you'd want the full max 1GB for it)

While anti-wraparound vacuums will become a problem for TPC-C (unless you tune for it), it's not too sensitive to maintenance_work_mem. You just don't end up accumulating too many TIDs to delete from indexes in practice, even though the overhead from VACUUM is a concern. The new autovacuum instrumentation in Postgres 14 makes this far clearer.

--
Peter Geoghegan
Captured the concurrent sessions with netstat and pg_stat_activity. Is that procedure the right way to capture concurrent sessions in PostgreSQL?

SELECT * FROM pg_stat_activity;
On Tuesday, October 12, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2021-10-12 at 13:05 +0530, Ashkil Dighin wrote:
> Perf data for 24vu(TPC-C)
> --------------------------------
>
> 18.99% postgres postgres [.] LWLockAcquire
> 7.09% postgres postgres [.] _bt_compare
> 8.66% postgres postgres [.] LWLockRelease
> 2.28% postgres postgres [.] GetSnapshotData
> 2.25% postgres postgres [.] hash_search_with_hash_value
> 2.11% postgres postgres [.] XLogInsertRecord
> 1.98% postgres postgres [.] PinBuffer
>
> 1.Is there a way to tune the lock contention ?
How many concurrent sessions are you running?
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
On Thu, 2021-10-14 at 11:33 +0530, Ashkil Dighin wrote:
> Captured the concurrent sessions with netstat and pg_stat_activity. Is that procedure the right way to capture concurrent sessions in PostgreSQL?
>
> SELECT * FROM pg_stat_activity;
> [some two dozen sessions]

That doesn't look like you would get into trouble just from the sheer number of sessions, so it must be something else.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
NUMA node0 CPU(s): 0-63,128-191
NUMA node1 CPU(s): 64-127,192-255
Thread(s) per core: 2
Core(s) per socket: 64
Socket(s): 2
NUMA node(s): 2
Core pinning (taskset) perf lock contention results for 24 and 32 VU:
0-63
24: 18.03% postgres postgres [.] LWLockAcquire
32: 7.02% postgres postgres [.] LWLockAcquire
64-127
24: 17.96% postgres postgres [.] LWLockAcquire
32: 7.04% postgres postgres [.] LWLockAcquire
0-63,128-191(Node0)
24: 18.4% postgres postgres [.] LWLockAcquire
32: 7.07% postgres postgres [.] LWLockAcquire
64-127,192-255(Node1)
24: 18.3% postgres postgres [.] LWLockAcquire
32: 7.06% postgres postgres [.] LWLockAcquire
I do not understand the question about interconnect type, and there are restrictions on sharing the lscpu output.
On Tuesday, October 12, 2021, Mikhail Zhilin <zhilin.ms@phystech.edu> wrote:
Hi,
How many sockets are on motherboard?
What is CPU model and interconnect type (UPI?)?
Can you share output of "lscpu"?
If you have more than 1 NUMA node it may be worth to run PostgreSQL in single NUMA node via taskset. It will eliminate access to remote memory and speed up processing.
Thanks,
Michael.
On Thursday, October 14, 2021, Paul Friedman <paul.friedman@streetlightdata.com> wrote:
Are you using PostGIS? If so, there is an issue with TOAST table locking that has these symptoms.

---
Paul
Can you bind postgres in single NUMA node, for instance:
$ taskset -pc 0-63 <POSTMASTER_PID>
Then run your benchmark and compare the results in terms of benchmark metrics and the presence of LWLock(Acquire|Release) in perf top.
BR,
Michael.
Hi
B-tree indexes are used in the Postgres environment.
Checked different warehouse counts (100, 800, 1600, 2400 and 3200) with 64 virtual users.
Across the different warehouse counts the lock contention is about the same, i.e. approx 17%, and iostat usage is 30-40%.
pg_Count_ware=100
-----------------
17.76% postgres postgres [.] LWLockAcquire
4.88% postgres postgres [.] _bt_compare
3.10% postgres postgres [.] LWLockRelease
pg_Count_ware=800 (previously I used warehouse 800)
--------------------------------------------
17.91% postgres postgres [.] LWLockAcquire
5.76% postgres postgres [.] _bt_compare
3.06% postgres postgres [.] LWLockRelease
pg_Count_ware_1600
-----------------
17.80% postgres postgres [.] LWLockAcquire
5.88% postgres postgres [.] _bt_compare
2.70% postgres postgres [.] LWLockRelease
pg_Count_ware_2400
------------------
17.77% postgres postgres [.] LWLockAcquire
6.01% postgres postgres [.] _bt_compare
2.71% postgres postgres [.] LWLockRelease
pg_Count_ware_3200
------------------
17.46% postgres postgres [.] LWLockAcquire
6.32% postgres postgres [.] _bt_compare
2.86% postgres postgres [.] hash_search_with_hash_value
1. Tried different lock management values in postgresql.conf, but it did not help to reduce the lock contention:
deadlock_timeout = 5s
max_locks_per_transaction = 64
max_pred_locks_per_transaction = 64
max_pred_locks_per_relation = -2
max_pred_locks_per_page = 2
2. The intention is to check PostgreSQL scalability and performance/throughput (TPC-C/TPC-H) with HammerDB and pgbench on this server configuration, with tuned settings (postgresql.conf) to reduce lock contention:
CPUs: 256
Threads per core: 2
Cores per socket: 64
Sockets: 2
NUMA node0: 0-63,128-191
NUMA node1: 64-127,192-255
RAM size: 512GB
SSD: 1TB
Ref link: https://www.hammerdb.com/blog/uncategorized/hammerdb-best-practice-for-postgresql-performance-and-scalability/
Hi,

On 2021-10-12 13:05:12 +0530, Ashkil Dighin wrote:
> PostgreSQL version: 13.3

You could try postgres 14 - that did improve scalability in some areas.

> Perf data for 24vu(TPC-C)
> --------------------------------
>
> 18.99% postgres postgres [.] LWLockAcquire
> 7.09% postgres postgres [.] _bt_compare
> 8.66% postgres postgres [.] LWLockRelease
> 2.28% postgres postgres [.] GetSnapshotData
> 2.25% postgres postgres [.] hash_search_with_hash_value
> 2.11% postgres postgres [.] XLogInsertRecord
> 1.98% postgres postgres [.] PinBuffer

To be more useful you'd need to create a profile with 'caller' information using 'perf record --call-graph dwarf', and then check what the important callers are.

> Postgres.conf used in Baremetal
> ========================
> shared_buffers = 128GB (1/4 of RAM size)
> effective_cache_size = 392GB (75% of RAM size)

If your hot data set is actually larger than s_b, I'd recommend trying a larger s_b. It's plausible that a good chunk of lock contention is from that.

Greetings,

Andres Freund
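A sketch of that kind of profiling run (the sampling rate and duration are arbitrary choices; flag availability varies by perf version):

```shell
# Sample all CPUs for 60s with DWARF-based call graphs (needs perf
# installed and sufficient privileges).
perf record --call-graph dwarf -F 99 -a -- sleep 60

# Show cost per symbol together with its callers:
perf report --no-children

# Narrow the report to a single symbol of interest:
perf report --no-children --symbol-filter=LWLockAcquire
```

With caller information you can see whether LWLockAcquire is dominated by, say, buffer mapping, WAL insertion, or buffer content locks, which is what determines the tuning response.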
Hi,

On 2021-10-25 18:38:40 -0600, Michael Lewis wrote:
> On Mon, Oct 25, 2021, 5:36 PM Andres Freund <andres@anarazel.de> wrote:
> > If your hot data set is actually larger than s_b, I'd recommend trying a
> > larger s_b. It's plausible that a good chunk of lock contention is from that.
>
> How much larger might you go?

I've seen s_b in the ~700GB range being a considerable speedup over lower values quite a few years ago. I don't see a clear cut upper boundary. The one thing this can regress measurably is the speed of dropping / truncating tables.

> Any write ups on lock contention as it relates to shared buffers?

I don't have a concrete thing to point you to, but searching for NUM_BUFFER_PARTITIONS might point you to some discussions.

> How impactful might huge pages (off, transparent or on) be to the use of
> shared buffers and the related locking mechanism?

Using huge pages can *hugely* help performance-wise. Not directly by relieving postgres-side contention however (it does reduce cache usage somewhat, but it's mainly really just the frequency of TLB misses that makes the difference).

Greetings,

Andres Freund
Hi,
Yes, the lock contention reduced with PostgreSQL v14.
LWLockAcquire reduced from 18% to 10%:
10.49% postgres postgres [.] LWLockAcquire
5.09% postgres postgres [.] _bt_compare

Can the lock contention be reduced to 0-3%?
pg_stat_activity showed LWLock waits on "BufferCounter" and "WalInsert".
Hi,

On October 27, 2021 2:44:56 PM PDT, Ashkil Dighin <ashkildighin76@gmail.com> wrote:
> Hi,
> Yes, lock contention reduced with PostgreSQL v14.
> Lock acquire reduced 18% to 10%
> 10.49% postgres postgres [.] LWLockAcquire
> 5.09% postgres postgres [.] _bt_compare
>
> Can lock contention be reduced to 0-3%?

Probably not, or at least not easily. Because of the atomic instructions the locking also includes some other costs (e.g. cache misses, serializing store buffers, ...). There's a good bit we can do to increase the cache efficiency around buffer headers, but it won't get us quite that low I'd guess.

> pg_stat_activity showed LWLock as "BufferCounter" and "WalInsert"

Without knowing what proportion they have to each other and to non-waiting backends, that unfortunately doesn't help that much.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Ashkil Dighin <ashkildighin76@gmail.com> writes:
> I suspect lock contention and performance issues with __int128. And I would
> like to check the performance by forcibly disabling int128 (maxalign 16 bytes)
> and enabling it like long long (maxalign 8 bytes).
> Is it possible to disable int128 in PostgreSQL?

Sure, you can build without it --- easiest way would be to modify pg_config.h after the configure step. But the idea that it has something to do with lock contention seems like nonsense.

regards, tom lane
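A sketch of that pg_config.h edit (the macro names below are from PostgreSQL 13's generated header; verify them against your own tree before relying on this):

```shell
# Sketch: disable 128-bit integer support before building PostgreSQL.
cd postgresql-13.3
./configure

# Comment out the int128 probe results in the generated header.
sed -i -e 's/^#define HAVE__INT128 1$/\/* #undef HAVE__INT128 *\//' \
       -e 's/^#define PG_INT128_TYPE .*$/\/* #undef PG_INT128_TYPE *\//' \
       src/include/pg_config.h

make -j"$(nproc)"
```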
Hi,
I suspect lock contention and performance issues with __int128. And I would like to check the performance by forcibly disabling int128 (maxalign 16 bytes) and enabling it like long long (maxalign 8 bytes).
Is it possible to disable int128 in PostgreSQL?
On Wednesday, November 17, 2021, arjun shetty <arjunshetty955@gmail.com> wrote:
Hi Askhil,

PostgreSQL utilizes lightweight locks (LWLocks) to synchronize and control access to the buffer content. A process acquires an LWLock in shared mode to read from the buffer and in exclusive mode to write to the buffer. While holding an exclusive lock, a process prevents other processes from acquiring a shared or exclusive lock, whereas a shared lock can be acquired concurrently by other processes. The issue starts when many processes acquire an exclusive lock on buffer content. As a result, LWLockAcquire is seen as the top hot function in profiling.

Here you need to understand whether LWLockAcquire reflects lock contention or CPU time spent inside the method/function (the top function in profiling). It can be analysed by logging LWLock status with parameters like ex-acquire-count (exclusive mode), sh-acquire-count, block-count and spin-delay-count:

total lock acquisition requests = ex-acquire-count + sh-acquire-count
time lock contention % = block-count / total lock acquisition requests

The time lock contention figure may show that most of the CPU time is spent inside the function rather than spinning/waiting for the lock.
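A worked example of that formula with hypothetical counts (not measured values):

```shell
# Hypothetical LWLock counters: 45000 exclusive acquisitions,
# 30000 shared acquisitions, 1500 of them blocked.
ex=45000; sh=30000; blocks=1500
total=$((ex + sh))
awk -v b="$blocks" -v t="$total" \
    'BEGIN { printf "lock contention: %.2f%%\n", 100 * b / t }'
```

Here 1500 blocked requests out of 75000 total acquisitions gives 2% time lock contention, i.e. most of LWLockAcquire's profile share would be CPU time inside the function, not waiting.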