Hi,
On 2022-11-01 11:19:02 -0400, Jonathan S. Katz wrote:
> This is the type of fix that would make headlines in a major release
> announcement (10x TPS improvement w/4096 connections?!). That is also part
> of the tradeoff of backpatching this, is that we may lose some of the higher
> visibility marketing opportunities to discuss this (though I'm sure there
> will be plenty of blog posts, etc.)
(read the next paragraph with the caveat that results below prove it somewhat
wrong)
I don't think the fix is as big a deal as the above makes it sound - you need
to do somewhat extreme things to hit the problem. Yes, it drastically improves
the scalability of e.g. doing SELECT txid_current() across as many sessions as
possible - but that's not something you normally do (it was a good candidate
to show the problem because it's a single lock but doesn't trigger WAL flushes
at commit).
You can probably hit the problem with many concurrent single-tx INSERTs, but
you'd need to have synchronous_commit=off or fsync=off (or a very expensive
server-class SSD with battery backup), and the effect is likely smaller.
> Andres: when you suggested backpatching, were you thinking of the Nov 2022
> release or the Feb 2023 release?
I wasn't thinking that concretely. Even if we decide to backpatch, I'd be very
hesitant to do it in a few days.
<goes and runs test while in meeting>
I tested with browser etc running, so this is plenty noisy. I used the best of
the two pgbench -T21 -P5 tps, after ignoring the first two periods (they're
too noisy). I used an ok-ish NVMe SSD, rather than the expensive one that
has "free" fsync.
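
The selection rule described above could be sketched roughly as follows. This is only an illustration, not the script that was actually used: it assumes the pgbench -P progress lines have the usual "progress: <time> s, <tps> tps, ..." shape, and it reads "best of the two" as the higher of the two periods that remain once the first two are dropped. The names period_tps and best_after_warmup are made up for this sketch.

```python
import re

# Assumed shape of a pgbench -P progress line:
#   "progress: 5.0 s, 1234.5 tps, lat 0.8 ms stddev 0.1"
# Only the tps field is captured.
PROGRESS_RE = re.compile(r"^progress: [\d.]+ s, ([\d.]+) tps")


def period_tps(pgbench_output):
    """Return the tps value of every progress line, in order."""
    values = []
    for line in pgbench_output.splitlines():
        match = PROGRESS_RE.match(line.strip())
        if match:
            values.append(float(match.group(1)))
    return values


def best_after_warmup(pgbench_output, skip=2):
    """Drop the first `skip` periods and return the best remaining tps."""
    remaining = period_tps(pgbench_output)[skip:]
    if not remaining:
        raise ValueError("no progress periods left after skipping warmup")
    return max(remaining)
```

Taking the maximum rather than the mean is a deliberate choice on a noisy desktop: background activity can only slow a period down, so the best period is the least disturbed one.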
synchronous_commit=on:
clients   master      fix
16          6196     6202
64         25716    25545
256        90131    90240
1024      128556   151487
2048       59417   157050
4096       32252   178823
synchronous_commit=off:
clients   master      fix
16        409828   409016
64        454257   455804
256       304175   452160
1024      135081   334979
2048       66124   291582
4096       27019   245701
Hm. That's a bigger effect than I anticipated. I guess synchronous_commit=off
isn't actually required, because the level of concurrency makes group commit
very effective.
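
To make the size of the effect easier to read off, the fix/master ratio can be computed per client count. The sketch below only redoes the arithmetic on the tps numbers from the two tables above; the names SYNC_COMMIT_ON, SYNC_COMMIT_OFF and speedups are made up for illustration.

```python
# (clients, master tps, fix tps), copied from the tables above.
SYNC_COMMIT_ON = [
    (16, 6196, 6202),
    (64, 25716, 25545),
    (256, 90131, 90240),
    (1024, 128556, 151487),
    (2048, 59417, 157050),
    (4096, 32252, 178823),
]
SYNC_COMMIT_OFF = [
    (16, 409828, 409016),
    (64, 454257, 455804),
    (256, 304175, 452160),
    (1024, 135081, 334979),
    (2048, 66124, 291582),
    (4096, 27019, 245701),
]


def speedups(rows):
    """Map each client count to the fix/master tps ratio."""
    return {clients: fix / master for clients, master, fix in rows}


for label, rows in (("on", SYNC_COMMIT_ON), ("off", SYNC_COMMIT_OFF)):
    print(f"synchronous_commit={label}:")
    for clients, ratio in speedups(rows).items():
        print(f"  {clients:>5} clients: {ratio:.2f}x")
```

At 4096 clients this works out to roughly 5.5x with synchronous_commit=on and roughly 9.1x with synchronous_commit=off, while the ratios at 16 and 64 clients stay close to 1.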
This is without an index, serial column or anything. But a quick comparison
for just 4096 clients shows that there is still a big difference if I create a
serial primary key:
master: 26172
fix: 155813
Greetings,
Andres Freund