Re: Double partition lock in bufmgr - Mailing list pgsql-hackers

From Yura Sokolov
Subject Re: Double partition lock in bufmgr
Date
Msg-id eceb1625108f07fedb729f1caaa8d5936b90ee6f.camel@postgrespro.ru
In response to Double partition lock in bufmgr  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List pgsql-hackers
On Fri, 18/12/2020 at 15:20 +0300, Konstantin Knizhnik wrote:
> Hi hackers,
> 
> I am investigating an incident with one of our customers: performance of
> the system has dropped dramatically.
> Stack traces of all backends can be found here: 
> http://www.garret.ru/diag_20201217_102056.stacks_59644
> (this file is 6Mb so I have not attached it to this mail).
> 
> What I see in these stack traces is that 642 backends are blocked in
> LWLockAcquire, mostly while obtaining a shared buffer lock:
> 
> #0  0x00007f0e7fe7a087 in semop () from /lib64/libc.so.6
> #1  0x0000000000682fb1 in PGSemaphoreLock
> (sema=sema@entry=0x7f0e1c1f63a0) at pg_sema.c:387
> #2  0x00000000006ed60b in LWLockAcquire (lock=lock@entry=0x7e8b6176d800,
> mode=mode@entry=LW_SHARED) at lwlock.c:1338
> #3  0x00000000006c88a7 in BufferAlloc (foundPtr=0x7ffcc3c8de9b "\001",
> strategy=0x0, blockNum=997, forkNum=MAIN_FORKNUM, relpersistence=112
> 'p', smgr=0x2fb2df8) at bufmgr.c:1177
> #4  ReadBuffer_common (smgr=0x2fb2df8, relpersistence=<optimized out>,
> relkind=<optimized out>, forkNum=forkNum@entry=MAIN_FORKNUM,
> blockNum=blockNum@entry=997, mode=RBM_NORMAL, strategy=0x0,
> hit=hit@entry=0x7ffcc3c8df97 "") at bufmgr.c:894
> #5  0x00000000006c928b in ReadBufferExtended (reln=0x32c7ed0,
> forkNum=forkNum@entry=MAIN_FORKNUM, blockNum=997,
> mode=mode@entry=RBM_NORMAL, strategy=strategy@entry=0x0) at bufmgr.c:753
> #6  0x00000000006c93ab in ReadBuffer (blockNum=<optimized out>,
> reln=<optimized out>) at bufmgr.c:685
> ...
> 
> Only 11 of these 642 locks are unique.
> Moreover, 358 backends are waiting for one lock and 183 for another.
> 
> There are two backends (pids 291121 and 285927) which are trying to
> obtain an exclusive lock while already holding another exclusive lock.
> And they block all other backends.
> 
> This is the only place in bufmgr (and in postgres) where a process tries
> to lock two partitions at once:
> 
>          /*
>           * To change the association of a valid buffer, we'll need to have
>           * exclusive lock on both the old and new mapping partitions.
>           */
>          if (oldFlags & BM_TAG_VALID)
>          {
>              ...
>              /*
>               * Must lock the lower-numbered partition first to avoid
>               * deadlocks.
>               */
>              if (oldPartitionLock < newPartitionLock)
>              {
>                  LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
>                  LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
>              }
>              else if (oldPartitionLock > newPartitionLock)
>              {
>                  LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
>                  LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
>              }
> 
> These two backends are blocked in the second lock request.
> I read all the comments in bufmgr.c and the README file but didn't find
> an explanation of why we need to lock both partitions.
> Why is it not possible to first free the old buffer (as is done in
> InvalidateBuffer) and then repeat the attempt to allocate the buffer?
> 
> Yes, it may require more effort than just "grabbing" the buffer.
> But in this case there is no need to hold two locks.
> 
> I wonder if somebody has faced similar symptoms in the past, and
> whether this problem with holding the locks of two partitions in bufmgr
> has already been discussed?
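
For reference, the ordering rule in the quoted snippet ("lock the lower-numbered partition first") can be sketched outside PostgreSQL with plain pthread mutexes. This is only an illustration under stated assumptions: `partition_locks`, `lock_pair`, `unlock_pair` and the value of `NUM_PARTITIONS` are hypothetical names chosen for this example, not bufmgr code, and pthread mutexes stand in for LWLocks.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_PARTITIONS 4        /* hypothetical value, for illustration only */

/* One mutex per partition; stands in for the partition LWLocks. */
pthread_mutex_t partition_locks[NUM_PARTITIONS] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};

/*
 * Acquire the locks of two partitions. The lower-numbered one is always
 * taken first, so two threads can never each hold one lock while waiting
 * for the other. Returns the number of locks taken (1 if both indexes
 * refer to the same partition).
 */
int
lock_pair(int old_part, int new_part)
{
    int first = old_part < new_part ? old_part : new_part;
    int second = old_part < new_part ? new_part : old_part;

    assert(first >= 0 && second < NUM_PARTITIONS);

    pthread_mutex_lock(&partition_locks[first]);
    if (first == second)
        return 1;
    /* A holder of `first` may wait here for a long time under contention. */
    pthread_mutex_lock(&partition_locks[second]);
    return 2;
}

/* Release whatever lock_pair() acquired for the same pair of indexes. */
void
unlock_pair(int old_part, int new_part)
{
    pthread_mutex_unlock(&partition_locks[old_part]);
    if (new_part != old_part)
        pthread_mutex_unlock(&partition_locks[new_part]);
}
```

The ordering prevents deadlock, but a thread waiting on the second mutex still holds the first one, which is the situation described for the two backends above.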

Looks like there is no real need for this double lock. And the change to
consecutive lock acquisition really provides a scalability gain:
https://bit.ly/3AytNoN
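
A minimal sketch of what "consecutive" acquisition means here, assuming a toy model in which each partition is just a mutex plus a counter of mappings. It is not the patch behind the link: `ToyPartition`, `toy_partitions`, `remap_consecutive` and `TOY_NUM_PARTITIONS` are hypothetical names, and the sketch ignores the window between the two steps, which real code would have to handle (for example by repeating the allocation attempt, as suggested in the quoted mail).

```c
#include <pthread.h>

#define TOY_NUM_PARTITIONS 4    /* hypothetical value, for illustration only */

typedef struct ToyPartition
{
    pthread_mutex_t lock;       /* stands in for the partition LWLock */
    int             nentries;   /* number of mappings stored in this partition */
} ToyPartition;

/* Partition 0 starts with one mapping; the others are empty. */
ToyPartition toy_partitions[TOY_NUM_PARTITIONS] = {
    {PTHREAD_MUTEX_INITIALIZER, 1}, {PTHREAD_MUTEX_INITIALIZER, 0},
    {PTHREAD_MUTEX_INITIALIZER, 0}, {PTHREAD_MUTEX_INITIALIZER, 0}
};

/*
 * Move one mapping from old_part to new_part, taking one lock at a time.
 * Returns the largest number of partition locks held simultaneously,
 * which is 1 by construction.
 */
int
remap_consecutive(int old_part, int new_part)
{
    int held = 0;
    int max_held = 0;

    /* Step 1: remove the old mapping under the old partition's lock only. */
    pthread_mutex_lock(&toy_partitions[old_part].lock);
    held++;
    if (held > max_held)
        max_held = held;
    toy_partitions[old_part].nentries--;
    pthread_mutex_unlock(&toy_partitions[old_part].lock);
    held--;

    /* Step 2: insert the new mapping under the new partition's lock only. */
    pthread_mutex_lock(&toy_partitions[new_part].lock);
    held++;
    if (held > max_held)
        max_held = held;
    toy_partitions[new_part].nentries++;
    pthread_mutex_unlock(&toy_partitions[new_part].lock);
    held--;

    return max_held;
}
```

Because no thread ever waits for one partition while holding another, no lock ordering is needed in this variant, at the cost of the mapping being briefly absent from both partitions.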

regards
Sokolov Yura
y.sokolov@postgrespro.ru
funny.falcon@gmail.com



