Re: heavily contended lwlocks with long wait queues scale badly - Mailing list pgsql-hackers

From Jonathan S. Katz
Subject Re: heavily contended lwlocks with long wait queues scale badly
Msg-id bb410d95-91d2-4c84-986c-7009d5477dd0@postgresql.org
In response to Re: heavily contended lwlocks with long wait queues scale badly  (Michael Paquier <michael@paquier.xyz>)
Responses Re: heavily contended lwlocks with long wait queues scale badly  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On 1/16/24 1:11 AM, Michael Paquier wrote:
> On Thu, Jan 11, 2024 at 09:47:33AM -0500, Jonathan S. Katz wrote:
>> I have similar data sources to Nathan/Michael and I'm trying to avoid piling
>> on, but one case that's interesting occurred after a major version upgrade
>> from PG10 to PG14 on a database supporting a very active/highly concurrent
>> workload. On inspection, it seems like backpatching would help this
>> particular case.
>>
>> With 10/11 EOL, I do wonder if we'll see more of these reports on upgrade to
>> < PG16.
>>
>> (I was in favor of backpatching prior; opinion is unchanged).
> 
> Hearing nothing, I have prepared a set of patches for v12~v15,
> checking all the lwlock paths for all the branches.  In the end, the
> set of changes looks rather sane to me regarding the queue handling.
> 
> I have also run some numbers on all the branches, and the test case
> posted upthread falls off dramatically after 512 concurrent
> connections at the top of all the stable branches :(
> 
> For example on REL_12_STABLE with and without the patch attached:
> num  v12           v12+patch
> 1    29717.151665  29096.707588
> 2    63257.709301  61889.476318
> 4    127921.873393 124575.901330
> 8    231400.571662 230562.725174
> 16   343911.185351 312432.897015
> 32   291748.985280 281011.787701
> 64   268998.728648 269975.605115
> 128  297332.597018 286449.176950
> 256  243902.817657 240559.122309
> 512  190069.602270 194510.718508
> 768  58915.650225  165714.707198
> 1024 39920.950552  149433.836901
> 2048 16922.391688  108164.301054
> 4096 6229.063321   69032.338708
> 
> I'd like to apply that, just let me know if you have any comments
> and/or objections.

Wow. All I can say is that my opinion remains unchanged on going forward 
with backpatching.
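
To put the quoted REL_12_STABLE numbers in perspective, here is a rough
sketch that computes the ratio of patched to unpatched results at each
connection count. The figures are copied from the table above; the
message does not state their unit, so the script treats them only as
"higher is better" values, and the names `results` and `speedup` are
made up for illustration.

```python
# Figures copied from the quoted REL_12_STABLE table.
# Key: number of concurrent connections ("num" column).
# Value: (v12, v12+patch). The unit is not stated in the message.
results = {
    1: (29717.151665, 29096.707588),
    2: (63257.709301, 61889.476318),
    4: (127921.873393, 124575.901330),
    8: (231400.571662, 230562.725174),
    16: (343911.185351, 312432.897015),
    32: (291748.985280, 281011.787701),
    64: (268998.728648, 269975.605115),
    128: (297332.597018, 286449.176950),
    256: (243902.817657, 240559.122309),
    512: (190069.602270, 194510.718508),
    768: (58915.650225, 165714.707198),
    1024: (39920.950552, 149433.836901),
    2048: (16922.391688, 108164.301054),
    4096: (6229.063321, 69032.338708),
}


def speedup(connections):
    """Return the patched result divided by the unpatched result."""
    unpatched, patched = results[connections]
    return patched / unpatched


if __name__ == "__main__":
    print(f"{'num':>5}  ratio")
    for connections in sorted(results):
        print(f"{connections:>5}  {speedup(connections):5.2f}x")
```

Up to 512 connections the ratio stays close to 1 (with a dip of about
9% at 16), while past 512 it climbs from roughly 2.8x at 768 to roughly
11x at 4096.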

Looking at the code, I understand the argument against backpatching, given
that we modify the struct, but this does seem low-risk/high-reward and
should help PostgreSQL run better on these higher-throughput workloads.

Thanks,

Jonathan
