Double partition lock in bufmgr - Mailing list pgsql-hackers

From: Konstantin Knizhnik
Subject: Double partition lock in bufmgr
Msg-id: f4f2af4b-246b-4409-0b8d-6b39da064175@postgrespro.ru
List: pgsql-hackers

Hi hackers,

I am investigating an incident at one of our customers: the performance
of the system dropped dramatically.
Stack traces of all backends can be found here:
http://www.garret.ru/diag_20201217_102056.stacks_59644
(the file is 6 MB, so I have not attached it to this mail).

What I see in these stack traces is that 642 backends are blocked in
LWLockAcquire, mostly while obtaining a shared lock on a buffer mapping
partition:

#0  0x00007f0e7fe7a087 in semop () from /lib64/libc.so.6
#1  0x0000000000682fb1 in PGSemaphoreLock 
(sema=sema@entry=0x7f0e1c1f63a0) at pg_sema.c:387
#2  0x00000000006ed60b in LWLockAcquire (lock=lock@entry=0x7e8b6176d800, 
mode=mode@entry=LW_SHARED) at lwlock.c:1338
#3  0x00000000006c88a7 in BufferAlloc (foundPtr=0x7ffcc3c8de9b "\001", 
strategy=0x0, blockNum=997, forkNum=MAIN_FORKNUM, relpersistence=112 
'p', smgr=0x2fb2df8) at bufmgr.c:1177
#4  ReadBuffer_common (smgr=0x2fb2df8, relpersistence=<optimized out>, 
relkind=<optimized out>, forkNum=forkNum@entry=MAIN_FORKNUM, 
blockNum=blockNum@entry=997, mode=RBM_NORMAL, strategy=0x0, 
hit=hit@entry=0x7ffcc3c8df97 "") at bufmgr.c:894
#5  0x00000000006c928b in ReadBufferExtended (reln=0x32c7ed0, 
forkNum=forkNum@entry=MAIN_FORKNUM, blockNum=997, 
mode=mode@entry=RBM_NORMAL, strategy=strategy@entry=0x0) at bufmgr.c:753
#6  0x00000000006c93ab in ReadBuffer (blockNum=<optimized out>, 
reln=<optimized out>) at bufmgr.c:685
...

Only 11 of the locks these 642 backends are waiting for are distinct.
Moreover, 358 backends are waiting for one lock and 183 for another.

There are two backends (pids 291121 and 285927) which are trying to
obtain an exclusive lock while already holding another exclusive lock.
They block all the other backends.

This is the only place in bufmgr (and in Postgres) where a process tries
to hold the locks of two buffer mapping partitions at once:

         /*
          * To change the association of a valid buffer, we'll need to have
          * exclusive lock on both the old and new mapping partitions.
          */
         if (oldFlags & BM_TAG_VALID)
         {
             ...
             /*
              * Must lock the lower-numbered partition first to avoid
              * deadlocks.
              */
             if (oldPartitionLock < newPartitionLock)
             {
                 LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
                 LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
             }
             else if (oldPartitionLock > newPartitionLock)
             {
                 LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
                 LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
             }
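
As an illustration only (this is not PostgreSQL code), the ordering rule
in the excerpt can be sketched with plain pthread mutexes. The helper
names lower_of, higher_of, lock_pair_ordered and unlock_pair are
hypothetical. The point of the sketch is that the second acquisition can
block while the first lock is still held, which is the state the two
backends above appear to be in.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not part of PostgreSQL. */
static pthread_mutex_t *lower_of(pthread_mutex_t *a, pthread_mutex_t *b)
{
    return ((uintptr_t) a < (uintptr_t) b) ? a : b;
}

static pthread_mutex_t *higher_of(pthread_mutex_t *a, pthread_mutex_t *b)
{
    return ((uintptr_t) a < (uintptr_t) b) ? b : a;
}

/*
 * Lock both mutexes, lower address first, so that every thread takes
 * any given pair in the same order. Returns the number of distinct
 * locks taken (1 if both arguments are the same mutex).
 */
static int lock_pair_ordered(pthread_mutex_t *a, pthread_mutex_t *b)
{
    assert(a != NULL && b != NULL);
    if (a == b)
    {
        pthread_mutex_lock(a);          /* same partition: one lock only */
        return 1;
    }
    pthread_mutex_lock(lower_of(a, b));
    /* This call may block while the first lock is still held. */
    pthread_mutex_lock(higher_of(a, b));
    return 2;
}

static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    if (a != b)
        pthread_mutex_unlock(b);
}
```

The ordering prevents deadlock between two such callers, but it does not
prevent a long wait on the second lock from stalling everyone queued
behind the first one.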

Both backends are blocked on the second LWLockAcquire call of this code
path.
I read all the comments in bufmgr.c and the README file but didn't find
an explanation of why we need to lock both partitions.
Why is it not possible to first free the old buffer (as is done in
InvalidateBuffer) and then retry the attempt to allocate the buffer?

Yes, it may require more effort than just "grabbing" the buffer.
But in this case there is no need to hold two locks.
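
A minimal sketch of that idea, under heavy assumptions: a toy mapping
table with four partitions, page tags modelled as small integers, and
pthread mutexes standing in for LWLocks. All names here (part_lock, map,
retag_buffer, held, max_held) are hypothetical, and the sketch ignores
buffer header state, pins and dirty-page write-back, which the real
BufferAlloc has to handle.

```c
#include <assert.h>
#include <pthread.h>

#define NPART 4     /* toy number of mapping partitions */
#define NKEYS 64    /* toy tag space: page tags are small ints here */

static pthread_mutex_t part_lock[NPART] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};
static int map[NKEYS];                      /* map[tag] = buffer id, -1 if unmapped */
static _Thread_local int held, max_held;    /* per-thread instrumentation only */

static void map_init(void)
{
    for (int i = 0; i < NKEYS; i++)
        map[i] = -1;
}

static void part_acquire(int tag)
{
    assert(tag >= 0 && tag < NKEYS);
    pthread_mutex_lock(&part_lock[tag % NPART]);
    if (++held > max_held)
        max_held = held;
}

static void part_release(int tag)
{
    held--;
    pthread_mutex_unlock(&part_lock[tag % NPART]);
}

/*
 * Move buffer `buf` from old_tag to new_tag, holding one partition lock
 * at a time. Returns `buf` on success, or the id of the buffer that is
 * already mapped to new_tag, in which case the caller would treat `buf`
 * as a free buffer.
 */
static int retag_buffer(int buf, int old_tag, int new_tag)
{
    int winner;

    /* Step 1: drop the old mapping under the old partition's lock. */
    part_acquire(old_tag);
    if (map[old_tag] == buf)
        map[old_tag] = -1;
    part_release(old_tag);

    /* Step 2: insert the new mapping under the new partition's lock. */
    part_acquire(new_tag);
    if (map[new_tag] == -1)
        map[new_tag] = buf;
    winner = map[new_tag];
    part_release(new_tag);

    return winner;
}
```

Between the two steps the buffer is mapped under neither tag, so another
backend may insert new_tag first. That is the extra case such an
approach would have to handle, and it is what the return value models.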

I wonder whether anybody has seen similar symptoms in the past, and
whether this problem of holding the locks of two partitions in bufmgr
has already been discussed?

P.S.
The customer is using Postgres 9.6, but I have checked that the same
code fragment is present in master.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



