Thread: Lock contention high

Hi,
Lock contention observed high in PostgreSQL v13.3.
The source code was compiled with GNU C (GCC v11.x).
PostgreSQL version: 13.3
Operating system: RHEL 8.3
Kernel name: 4.18.0-305.10.2.el8_4.x86_64
RAM size: 512GB
SSD: 1TB
The environment used IBM bare metal, and the test benchmark environment was HammerDB v4.2.
Test case: TPC-C

Perf data for 24 VU (TPC-C)
------------------------------
18.99% postgres postgres [.] LWLockAcquire
7.09% postgres postgres [.] _bt_compare
8.66% postgres postgres [.] LWLockRelease
2.28% postgres postgres [.] GetSnapshotData
2.25% postgres postgres [.] hash_search_with_hash_value
2.11% postgres postgres [.] XLogInsertRecord
1.98% postgres postgres [.] PinBuffer

1. Is there a way to tune the lock contention?
2. Are there any recommendations to tune/reduce the lock contention via postgresql.conf?
postgresql.conf used on bare metal
========================
shared_buffers = 128GB (1/4 of RAM size)
effective_cache_size = 392GB (75% of RAM size)
huge_pages = on
temp_buffers = 4000MB
work_mem = 4000MB
maintenance_work_mem = 512MB
autovacuum_work_mem = -1
max_stack_depth = 7MB
dynamic_shared_memory_type = posix
max_files_per_process = 4000
effective_io_concurrency = 32
wal_level = minimal
synchronous_commit = off
wal_buffers = 512MB
checkpoint_timeout = 1h
checkpoint_completion_target = 1
checkpoint_warning = 0
log_min_messages = error
log_min_error_statement = error
log_timezone = 'GB'
autovacuum = off
datestyle = 'iso, dmy'
timezone = 'GB'
lc_messages = 'en_GB.UTF-8'
lc_monetary = 'en_GB.UTF-8'
lc_numeric = 'en_GB.UTF-8'
lc_time = 'en_GB.UTF-8'
default_text_search_config = 'pg_catalog.english'
max_locks_per_transaction = 64
max_pred_locks_per_transaction = 64
How many sockets are on motherboard?
What is CPU model and interconnect type (UPI?)?
Can you share output of "lscpu"?
If you have more than one NUMA node, it may be worth running PostgreSQL on a single NUMA node via taskset. That will eliminate access to remote memory and speed up processing.
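For example, a sketch of that pinning (the PGDATA path below is an assumption; adjust it to your installation):

```shell
# Pin the running postmaster to NUMA node 0's cores (0-63 here).
# Backends forked afterwards inherit this affinity; already-running
# backends keep their old mask. The PGDATA path is an assumption.
PGDATA=/var/lib/pgsql/13/data
taskset -pc 0-63 "$(head -1 "$PGDATA/postmaster.pid")"

# Alternatively, bind both CPU and memory at startup with numactl:
numactl --cpunodebind=0 --membind=0 pg_ctl -D "$PGDATA" start
```

The numactl variant additionally forces local memory allocation, which is what removes the remote-memory accesses.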
Thanks,
Michael.
> 1. Is there a way to tune the lock contention?

Lock contention is usually an application issue. Application processes are stepping on each other's toes. I have never seen a situation where the database itself would be slow with managing locks. Postgres uses an in-memory queue manager which is, generally speaking, very fast. Applications usually do stupid things. I've seen a GUI doing "SELECT FOR UPDATE". And then the operator decided to have lunch. I'll leave the rest to your imagination.
-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
On Tue, Oct 12, 2021 at 01:05:12PM +0530, Ashkil Dighin wrote:
> Hi,
> Lock contention observed high in PostgreSQL v13.3
> ...
> 1. Is there a way to tune the lock contention?
> 2. Are there any recommendations to tune/reduce the lock contention via postgresql.conf?

I think you'd want to find *which* LW locks are being waited on, to see if it's something that can be easily tuned. You can check pg_stat_activity, or maybe create a cronjob to record its content for later analysis.

-- Justin
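A minimal sketch of that kind of sampling (column names per the PostgreSQL 13 catalogs):

```sql
-- Snapshot which lightweight locks backends are currently waiting on.
-- Run this repeatedly (e.g. from a cron job) during the benchmark.
SELECT wait_event_type,
       wait_event,
       count(*) AS backends
FROM   pg_stat_activity
WHERE  wait_event IS NOT NULL
GROUP  BY 1, 2
ORDER  BY backends DESC;
```

Rows with wait_event_type = 'LWLock' tell you which specific LWLock (e.g. buffer_content, WALInsert) is contended.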
> 1. Is there a way to tune the lock contention?
> 2. Are there any recommendations to tune/reduce the lock contention via postgresql.conf?
>
> I think you'd want to find *which* LW locks are being waited on, to see if it's something that can be easily tuned. You can check pg_stat_activity, or maybe create a cronjob to record its content for later analysis.

Hello,

Also turn on log_lock_waits so you can evaluate the actual SQL causing the problems in the PG log files. Thinking ahead, you may want to consider whether using advisory locks from the application side of things might help to manage locks in a more pessimistic way. Also, join with the pg_locks table to find out the specific resources that are in contention.
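A sketch of that pg_locks join (pg_blocking_pids() has been available since 9.6; this covers heavyweight locks, not LWLocks):

```sql
-- In postgresql.conf: log_lock_waits = on
-- (logs any wait longer than deadlock_timeout to the server log)

-- Who is waiting on a lock, what they are running, and who blocks them:
SELECT w.pid,
       w.query,
       pg_blocking_pids(w.pid)  AS blocked_by,
       l.locktype,
       l.relation::regclass     AS relation
FROM   pg_stat_activity w
JOIN   pg_locks l ON l.pid = w.pid AND NOT l.granted;
```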
Regards,
Michael Vitale
On Tue, Oct 12, 2021 at 12:45 AM Ashkil Dighin <ashkildighin76@gmail.com> wrote:
> Lock contention observed high in PostgreSQL v13.3
> The source code compiled with GNU C (GCC v11.x)
> PostgreSQL version: 13.3
> Operating system: RHEL 8.3
> Kernel name: 4.18.0-305.10.2.el8_4.x86_64
> RAM size: 512GB
> SSD: 1TB
> The environment used IBM bare metal and test benchmark environment HammerDB v4.2
> Test case: TPC-C

You didn't say how many TPC-C warehouses you used. In my experience, people sometimes run TPC-C with relatively few, which will tend to result in extreme contention on certain B-Tree leaf pages. (My experiences are with BenchmarkSQL, but I can't imagine HammerDB is too much different.)

Assuming that's the case here, for you, then it's not clear that you have a real problem. You're really not supposed to run the benchmark in that way, per the TPC-C spec, which strictly limits the number of transactions per minute per warehouse -- for better or worse, valid results generally require that you use lots of warehouses to get a very large database (think terabytes). If you run the benchmark with 100 warehouses or less, on a big server, then the contention you'll see will be out of all proportion to what you're ever likely to see in the real world.

--
Peter Geoghegan
Your settings are interesting; I'm curious what the goal is for this particular HammerDB exercise. A few comments inline.

On 10/12/21 00:35, Ashkil Dighin wrote:
> postgresql.conf used on bare metal
> ========================
> maintenance_work_mem = 512MB

Only a half GB of memory for autovacuum? (It will have a mandatory run as soon as you hit 200 million XIDs; seems like you'd want the full max 1GB for it.)

> synchronous_commit = off
> checkpoint_timeout = 1h
> checkpoint_completion_target = 1
> checkpoint_warning = 0

Curious about this; it seems you're just looking to understand how much throughput you can get with a config that would not be used on a real system.

> autovacuum = off

I assume you understand that autovacuum will still run when you hit 200 million XIDs. This setting seems incongruent with the previous settings, because it seemed like you were going for throughput, which generally requires autovacuum to be more aggressive rather than less aggressive. Assuming the benchmark runs for a properly sufficient length of time, this setting will slow things down because of accumulating bloat.

Just a few opinions, I might be wrong, hope the feedback is helpful. :)

-Jeremy

--
http://about.me/jeremy_schneider
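For a throughput-oriented run with autovacuum left on, a hedged postgresql.conf sketch along those lines (the values are illustrative starting points, not recommendations for this particular box):

```
autovacuum = on
autovacuum_max_workers = 8              # default is 3
autovacuum_work_mem = 1GB               # autovacuum's effective cap
autovacuum_vacuum_cost_delay = 2ms      # keep workers moving
autovacuum_vacuum_scale_factor = 0.05   # vacuum tables sooner
autovacuum_naptime = 10s                # check for work more often
```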
On Wed, Oct 13, 2021 at 6:54 PM Jeremy Schneider <schneider@ardentperf.com> wrote:
> only a half GB memory for autovac? (it will have a mandatory run as soon
> as you hit 200 mil XIDs, seems like you'd want the full max 1GB for it)

While anti-wraparound vacuums will become a problem for TPC-C (unless you tune for it), it's not too sensitive to maintenance_work_mem. You just don't end up accumulating too many TIDs to delete from indexes in practice, even though the overhead from VACUUM is a concern. The new autovacuum instrumentation in Postgres 14 makes this far clearer.

--
Peter Geoghegan
Captured the concurrent sessions with netstat and pg_stat_activity. Is that procedure the right way to capture concurrent sessions in PostgreSQL?

SELECT * FROM pg_stat_activity;
On Tuesday, October 12, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2021-10-12 at 13:05 +0530, Ashkil Dighin wrote:
> Perf data for 24vu(TPC-C)
> --------------------------------
>
> 18.99% postgres postgres [.] LWLockAcquire
> 7.09% postgres postgres [.] _bt_compare
> 8.66% postgres postgres [.] LWLockRelease
> 2.28% postgres postgres [.] GetSnapshotData
> 2.25% postgres postgres [.] hash_search_with_hash_value
> 2.11% postgres postgres [.] XLogInsertRecord
> 1.98% postgres postgres [.] PinBuffer
>
> 1.Is there a way to tune the lock contention ?
How many concurrent sessions are you running?
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
On Thu, 2021-10-14 at 11:33 +0530, Ashkil Dighin wrote:
> Captured the concurrent sessions with netstat and pg_stat_activity. Is that procedure the right way to capture concurrent sessions in PostgreSQL?
>
> SELECT * FROM pg_stat_activity;
> [some two dozen sessions]

That doesn't look like you would get into trouble just from the sheer number of sessions, so it must be something else.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
NUMA node0 CPU(s): 0-63,128-191
NUMA node1 CPU(s): 64-127,192-255
Thread(s) per core: 2
Core(s) per socket: 64
Socket(s): 2
NUMA node(s): 2
Core pinning (taskset) perf lock contention results for 24 and 32 VU:
0-63
24: 18.03% postgres postgres [.] LWLockAcquire
32: 7.02% postgres postgres [.] LWLockAcquire
64-127
24: 17.96% postgres postgres [.] LWLockAcquire
32: 7.04% postgres postgres [.] LWLockAcquire
0-63,128-191(Node0)
24: 18.4% postgres postgres [.] LWLockAcquire
32: 7.07% postgres postgres [.] LWLockAcquire
64-127,192-255(Node1)
24: 18.3% postgres postgres [.] LWLockAcquire
32: 7.06% postgres postgres [.] LWLockAcquire
I do not understand the question about interconnect type, and there are restrictions on sharing the lscpu output.
On Tuesday, October 12, 2021, Mikhail Zhilin <zhilin.ms@phystech.edu> wrote:
Hi,
How many sockets are on motherboard?
What is CPU model and interconnect type (UPI?)?
Can you share output of "lscpu"?
If you have more than 1 NUMA node it may be worth to run PostgreSQL in single NUMA node via taskset. It will eliminate access to remote memory and speed up processing.
Thanks,
Michael.
On Thursday, October 14, 2021, Paul Friedman <paul.friedman@streetlightdata.com> wrote:
Are you using PostGIS? If so, there is an issue with TOAST table locking that has these symptoms.

---
Paul
Can you bind postgres in single NUMA node, for instance:
$ taskset -pc 0-63 <POSTMASTER_PID>
Then run your benchmark and compare the results in terms of benchmark metrics and the presence of LWLock(Acquire|Release) in perf top.
BR,
Michael.
Hi
B-tree indexes are used in the Postgres environment.
Checked different warehouse counts (100, 800, 1600, 2400 and 3200) with 64 virtual users.
Across the different warehouse counts the lock contention is about the same, i.e. approx 17%, and iostat usage is 30-40%.
pg_Count_ware=100
-----------------
17.76% postgres postgres [.] LWLockAcquire
4.88% postgres postgres [.] _bt_compare
3.10% postgres postgres [.] LWLockRelease
pg_Count_ware=800 (previously I used warehouse 800)
--------------------------------------------
17.91% postgres postgres [.] LWLockAcquire
5.76% postgres postgres [.] _bt_compare
3.06% postgres postgres [.] LWLockRelease
pg_Count_ware_1600
-----------------
17.80% postgres postgres [.] LWLockAcquire
5.88% postgres postgres [.] _bt_compare
2.70% postgres postgres [.] LWLockRelease
pg_Count_ware_2400
------------------
17.77% postgres postgres [.] LWLockAcquire
6.01% postgres postgres [.] _bt_compare
2.71% postgres postgres [.] LWLockRelease
pg_Count_ware_3200
------------------
17.46% postgres postgres [.] LWLockAcquire
6.32% postgres postgres [.] _bt_compare
2.86% postgres postgres [.] hash_search_with_hash_value
1. Tried different lock management values in postgresql.conf, but it did not help to reduce the lock contention:
deadlock_timeout = 5s
max_locks_per_transaction = 64
max_pred_locks_per_transaction = 64
max_pred_locks_per_relation = -2
max_pred_locks_per_page = 2
2. The intention is to check PostgreSQL scalability and performance/throughput (TPC-C/TPC-H) with HammerDB and pgbench on this server configuration, with tuned settings (postgresql.conf) to reduce lock contention:
CPUs: 256
Threads per core: 2
Cores per socket: 64
Sockets: 2
NUMA node0: 0-63,128-191
NUMA node1: 64-127,192-255
RAM size: 512GB
SSD: 1TB
Ref link: https://www.hammerdb.com/blog/uncategorized/hammerdb-best-practice-for-postgresql-performance-and-scalability/
Hi,

On 2021-10-12 13:05:12 +0530, Ashkil Dighin wrote:
> PostgreSQL version: 13.3

You could try postgres 14 - that did improve scalability in some areas.

> Perf data for 24vu(TPC-C)
> --------------------------------
>
> 18.99% postgres postgres [.] LWLockAcquire
> 7.09% postgres postgres [.] _bt_compare
> 8.66% postgres postgres [.] LWLockRelease
> 2.28% postgres postgres [.] GetSnapshotData
> 2.25% postgres postgres [.] hash_search_with_hash_value
> 2.11% postgres postgres [.] XLogInsertRecord
> 1.98% postgres postgres [.] PinBuffer

To be more useful you'd need to create a profile with 'caller' information using 'perf record --call-graph dwarf', and then check what the important callers are.

> Postgres.conf used in Baremetal
> ========================
> shared_buffers = 128GB (1/4 of RAM size)
> effective_cache_size = 392GB (75% of RAM size)

If your hot data set is actually larger than s_b, I'd recommend trying a larger s_b. It's plausible that a good chunk of lock contention is from that.

Greetings,

Andres Freund
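A sketch of that kind of profiling run (the sampling rate and duration are arbitrary choices; flag availability varies by perf version):

```shell
# Sample all CPUs for 60s with DWARF-based call graphs (needs perf
# installed and sufficient privileges).
perf record --call-graph dwarf -F 99 -a -- sleep 60

# Show cost per symbol together with its callers:
perf report --no-children

# Narrow the report to a single symbol of interest:
perf report --no-children --symbol-filter=LWLockAcquire
```

With caller information you can see whether LWLockAcquire is dominated by, say, buffer mapping, WAL insertion, or buffer content locks, which is what determines the tuning response.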
Hi,

On 2021-10-25 18:38:40 -0600, Michael Lewis wrote:
> On Mon, Oct 25, 2021, 5:36 PM Andres Freund <andres@anarazel.de> wrote:
> > If your hot data set is actually larger than s_b, I'd recommend trying a
> > larger s_b. It's plausible that a good chunk of lock contention is from that.
>
> How much larger might you go?

I've seen s_b in the ~700GB range being a considerable speedup over lower values quite a few years ago. I don't see a clear cut upper boundary. The one thing this can regress measurably is the speed of dropping / truncating tables.

> Any write ups on lock contention as it relates to shared buffers?

I don't have a concrete thing to point you to, but searching for NUM_BUFFER_PARTITIONS might point you to some discussions.

> How impactful might huge pages (off, transparent or on) be to the use of
> shared buffers and the related locking mechanism?

Using huge pages can *hugely* help performance-wise. Not directly by relieving postgres-side contention however (it does reduce cache usage somewhat, but it's mainly really just the frequency of TLB misses that makes the difference).

Greetings,

Andres Freund
Hi,
Yes, the lock contention reduced with PostgreSQL v14.
LWLockAcquire reduced from 18% to 10%:
10.49% postgres postgres [.] LWLockAcquire
5.09% postgres postgres [.] _bt_compare

Can the lock contention be reduced to 0-3%?
pg_stat_activity showed LWLock waits on "BufferCounter" and "WalInsert".
Hi,

On October 27, 2021 2:44:56 PM PDT, Ashkil Dighin <ashkildighin76@gmail.com> wrote:
> Hi,
> Yes, lock contention reduced with PostgreSQL v14.
> Lock acquire reduced 18% to 10%
> 10.49% postgres postgres [.] LWLockAcquire
> 5.09% postgres postgres [.] _bt_compare
>
> Can lock contention be reduced to 0-3%?

Probably not, or at least not easily. Because of the atomic instructions the locking also includes some other costs (e.g. cache misses, serializing store buffers, ...). There's a good bit we can do to increase the cache efficiency around buffer headers, but it won't get us quite that low I'd guess.

> pg_stat_activity showed LWLock as "BufferCounter" and "WalInsert"

Without knowing what proportion they have to each other and to non-waiting backends, that unfortunately doesn't help that much.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Ashkil Dighin <ashkildighin76@gmail.com> writes:
> I suspect lock contention and performance issues with __int128. And I would
> like to check the performance by forcibly disabling int128 (maxalign 16 bytes)
> and enabling it like long long (maxalign 8 bytes).
> Is it possible to disable int128 in PostgreSQL?

Sure, you can build without it --- easiest way would be to modify pg_config.h after the configure step. But the idea that it has something to do with lock contention seems like nonsense.

regards, tom lane
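A sketch of that pg_config.h edit (the macro names below are from PostgreSQL 13's generated header; verify them against your own tree before relying on this):

```shell
# Sketch: disable 128-bit integer support before building PostgreSQL.
cd postgresql-13.3
./configure

# Comment out the int128 probe results in the generated header.
sed -i -e 's/^#define HAVE__INT128 1$/\/* #undef HAVE__INT128 *\//' \
       -e 's/^#define PG_INT128_TYPE .*$/\/* #undef PG_INT128_TYPE *\//' \
       src/include/pg_config.h

make -j"$(nproc)"
```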
Hi,
I suspect lock contention and performance issues with __int128. And I would like to check the performance by forcibly disabling int128 (maxalign 16 bytes) and enabling it like long long (maxalign 8 bytes).
Is it possible to disable int128 in PostgreSQL?
On Wednesday, November 17, 2021, arjun shetty <arjunshetty955@gmail.com> wrote:
Hi Askhil,

PostgreSQL utilizes lightweight locks (LWLocks) to synchronize and control access to the buffer content. A process acquires an LWLock in shared mode to read from the buffer and in exclusive mode to write to the buffer. While holding an exclusive lock, a process prevents other processes from acquiring a shared or exclusive lock, whereas a shared lock can be acquired concurrently by other processes. The issue starts when many processes acquire an exclusive lock on buffer content. As a result, LWLockAcquire is seen as the top hot function in profiling.

Here you need to understand whether LWLockAcquire reflects lock contention or CPU time spent inside the method/function (the top function in profiling). It can be analysed by logging LWLock status with parameters like ex-acquire-count (exclusive mode), sh-acquire-count, block-count and spin-delay-count:

total lock acquisition requests = ex-acquire-count + sh-acquire-count
time lock contention % = block-count / total lock acquisition requests

The time lock contention figure may show that most of the CPU time is spent inside the function rather than spinning/waiting for the lock.
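A worked example of that formula with hypothetical counts (not measured values):

```shell
# Hypothetical LWLock counters: 45000 exclusive acquisitions,
# 30000 shared acquisitions, 1500 of them blocked.
ex=45000; sh=30000; blocks=1500
total=$((ex + sh))
awk -v b="$blocks" -v t="$total" \
    'BEGIN { printf "lock contention: %.2f%%\n", 100 * b / t }'
```

Here 1500 blocked requests out of 75000 total acquisitions gives 2% time lock contention, i.e. most of LWLockAcquire's profile share would be CPU time inside the function, not waiting.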