Re: heavily contended lwlocks with long wait queues scale badly - Mailing list pgsql-hackers

From Andres Freund
Subject Re: heavily contended lwlocks with long wait queues scale badly
Msg-id 20221101174123.5scswkltxnjozknk@awork3.anarazel.de
In response to Re: heavily contended lwlocks with long wait queues scale badly  ("Jonathan S. Katz" <jkatz@postgresql.org>)
Responses Re: heavily contended lwlocks with long wait queues scale badly
List pgsql-hackers
Hi,

On 2022-11-01 11:19:02 -0400, Jonathan S. Katz wrote:
> This is the type of fix that would make headlines in a major release
> announcement (10x TPS improvement w/4096 connections?!). That is also part
> of the tradeoff of backpatching this, is that we may lose some of the higher
> visibility marketing opportunities to discuss this (though I'm sure there
> will be plenty of blog posts, etc.)

(read the next paragraph with the caveat that results below prove it somewhat
wrong)

I don't think the fix is as big a deal as the above makes it sound - you need
to do somewhat extreme things to hit the problem. Yes, it drastically improves
the scalability of e.g. doing SELECT txid_current() across as many sessions as
possible - but that's not something you normally do (it was a good candidate
to show the problem because it's a single lock but doesn't trigger WAL flushes
at commit).

You can probably hit the problem with many concurrent single-tx INSERTs, but
you'd need to have synchronous_commit=off or fsync=off (or a very expensive
server class SSD with battery backup) and the effect is likely smaller.
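
A workload of this shape could be driven roughly as sketched below. This is an illustration only, not the setup that was actually run: the script file names, the table name `contention_test` and the helpers `pgbench_cmd` and `session_env` are hypothetical. Only the single-statement transactions and the synchronous_commit setting come from this mail.

```python
import os

# Hypothetical single-statement pgbench scripts; the table name is made up.
SCRIPTS = {
    "txid.sql": "SELECT txid_current();\n",
    "insert.sql": "INSERT INTO contention_test DEFAULT VALUES;\n",
}


def pgbench_cmd(script, clients, seconds=21, progress=5):
    """Build (but do not run) a pgbench command line for a custom script."""
    return [
        "pgbench",
        "-n",                # skip vacuuming the standard pgbench tables
        "-f", script,        # custom script: one statement per transaction
        "-c", str(clients),  # number of concurrent client sessions
        "-j", str(min(clients, os.cpu_count() or 1)),  # worker threads
        "-T", str(seconds),  # run duration in seconds
        "-P", str(progress), # progress report interval in seconds
    ]


def session_env(synchronous_commit):
    """Environment that sets synchronous_commit per session via PGOPTIONS."""
    env = dict(os.environ)
    env["PGOPTIONS"] = f"-c synchronous_commit={synchronous_commit}"
    return env
```

The command list and environment can be handed to `subprocess.run(cmd, env=env)` once the scripts in `SCRIPTS` have been written to disk and the table exists.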


> Andres: when you suggested backpatching, were you thinking of the Nov 2022
> release or the Feb 2023 release?

I wasn't thinking that concretely. Even if we decide to backpatch, I'd be very
hesitant to do it in a few days.


<goes and runs test while in meeting>


I tested with browser etc running, so this is plenty noisy. I used the best of
the two pgbench -T21 -P5 tps, after ignoring the first two periods (they're
too noisy). I used an ok-ish NVMe SSD, rather than the expensive one that
has "free" fsync.
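
One way to reduce pgbench progress output to a single figure along these lines is sketched below. It assumes the usual `progress: <t> s, <n> tps, ...` line format printed with -P, and it assumes that "best of two" means the higher of the two per-run averages over the remaining periods; the helper names `steady_tps` and `best_of` are made up for illustration.

```python
import re

# Matches pgbench -P progress lines of the form
#   progress: <seconds> s, <tps> tps, lat ...
PROGRESS_RE = re.compile(r"^progress: ([\d.]+) s, ([\d.]+) tps")


def steady_tps(output, skip=2):
    """Mean tps over the progress periods, ignoring the first `skip` periods."""
    samples = [
        float(m.group(2))
        for line in output.splitlines()
        if (m := PROGRESS_RE.match(line.strip()))
    ]
    kept = samples[skip:]
    if not kept:
        raise ValueError("not enough progress periods after skipping warm-up")
    return sum(kept) / len(kept)


def best_of(outputs, skip=2):
    """Highest steady-state tps among several pgbench runs (assumed meaning)."""
    return max(steady_tps(o, skip) for o in outputs)
```

With -T21 -P5 each run prints four progress lines, so dropping the first two leaves two periods per run to average.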

synchronous_commit=on:

clients   master         fix
16          6196            6202
64         25716           25545
256        90131           90240
1024      128556          151487
2048       59417          157050
4096       32252          178823


synchronous_commit=off:

clients   master         fix
16        409828          409016
64        454257          455804
256       304175          452160
1024      135081          334979
2048       66124          291582
4096       27019          245701


Hm. That's a bigger effect than I anticipated. I guess sc=off isn't actually
required, due to the level of concurrency making group commit very
effective.
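
To put a number on the effect, the fix/master ratio per client count can be computed directly from the two tables above. The tps values are copied from this mail; the dictionary and function names are only for illustration.

```python
# tps figures copied from the tables above: clients -> (master, fix)
SC_ON = {
    16: (6196, 6202),
    64: (25716, 25545),
    256: (90131, 90240),
    1024: (128556, 151487),
    2048: (59417, 157050),
    4096: (32252, 178823),
}
SC_OFF = {
    16: (409828, 409016),
    64: (454257, 455804),
    256: (304175, 452160),
    1024: (135081, 334979),
    2048: (66124, 291582),
    4096: (27019, 245701),
}


def speedups(results):
    """Map each client count to the fix/master throughput ratio."""
    return {clients: fix / master for clients, (master, fix) in results.items()}


for name, results in (("on", SC_ON), ("off", SC_OFF)):
    for clients, ratio in speedups(results).items():
        print(f"synchronous_commit={name} clients={clients:5d} fix/master={ratio:.2f}x")
```

At 4096 clients this works out to roughly 5.5x with synchronous_commit=on and roughly 9.1x with synchronous_commit=off, while at 16 and 64 clients the two builds are within noise of each other.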

This is without an index, serial column or anything. But a quick comparison
for just 4096 clients shows there is still a big difference if I create a
serial primary key:
master: 26172
fix: 155813


Greetings,

Andres Freund


