LWLock contention: I think I understand the problem - Mailing list pgsql-hackers

After some further experimentation, I believe I understand the reason for
the reports we've had of 7.2 producing heavy context-swap activity where
7.1 didn't.  Here is an extract from tracing lwlock activity for one
backend in a pgbench run:

2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(0): awakened
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockRelease(0): excl 1 shared 0 head 0x422c27d4
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockRelease(0): release waiter
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(300): excl 0 shared 0 head (nil)
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockRelease(300): excl 0 shared 1 head (nil)
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(0): excl 1 shared 0 head 0x422c2bfc
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(0): waiting
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(0): awakened
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockRelease(0): excl 1 shared 0 head 0x422c27d4
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockRelease(0): release waiter
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(232): excl 0 shared 0 head (nil)
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockRelease(232): excl 0 shared 1 head (nil)
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(300): excl 0 shared 0 head (nil)
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockRelease(300): excl 0 shared 1 head (nil)
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(0): excl 1 shared 0 head 0x422c2bfc
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(0): waiting
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(0): awakened
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockRelease(0): excl 1 shared 0 head 0x422c27d4
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockRelease(0): release waiter
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(232): excl 0 shared 0 head (nil)
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockRelease(232): excl 0 shared 1 head (nil)
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(300): excl 0 shared 0 head (nil)
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockRelease(300): excl 0 shared 1 head (nil)
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(0): excl 1 shared 0 head 0x422c2bfc
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(0): waiting
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockAcquire(0): awakened
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockRelease(0): excl 1 shared 0 head 0x422c27d4
2001-12-29 13:30:30 [31442]  DEBUG:  LWLockRelease(0): release waiter

LWLock 0 is the BufMgrLock, while the locks with numbers like 232 and
300 are context locks for individual buffers.  At the beginning of this
trace we see the process awoken after having been granted the
BufMgrLock.  It does a small amount of processing (probably a ReadBuffer
operation) and releases the BufMgrLock.  At that point, someone else is
already waiting for BufMgrLock, and the line about "release waiter"
means that ownership of BufMgrLock has been transferred to that other
someone.  Next, the context lock 300 is acquired and released (there's no
contention for it).  Next we need to get the BufMgrLock again (probably
to do a ReleaseBuffer).  Since we've already granted the BufMgrLock to
someone else, we are forced to block here.  When control comes back,
we do the ReleaseBuffer and then release the BufMgrLock --- again,
immediately granting it to someone else.  That guarantees that our next
attempt to acquire BufMgrLock will cause us to block.  The cycle repeats
for every attempt to lock BufMgrLock.

In essence, what we're seeing here is a "tag team" behavior: someone is
always waiting on the BufMgrLock, and so each LWLockRelease(BufMgrLock)
transfers lock ownership to someone else; then the next
LWLockAcquire(BufMgrLock) in the same process is guaranteed to block;
and that means we have a new waiter on BufMgrLock, so that the cycle
repeats.  Net result: a process context swap for *every* entry to the
buffer manager.
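
To make the tag-team pattern concrete, here is a toy single-CPU model of it.
This is a sketch, not PostgreSQL code: the function name simulate_handoff and
its bookkeeping are hypothetical. It assumes two processes, that process 0 was
preempted once while holding the lock (which is what puts the first waiter on
the queue), and that no other preemption occurs.

```c
#include <assert.h>

/*
 * Toy model (hypothetical, not lwlock.c): two processes share one CPU and
 * one lock, and each wants `entries` lock/unlock cycles.  On release,
 * ownership passes directly to the sleeping waiter, if there is one.
 * Returns the number of acquire attempts that had to block.
 */
int simulate_handoff(int entries)
{
    int remaining[2] = { entries, entries };
    int owner = 0;      /* seed: process 0 holds the lock ...            */
    int waiter = -1;    /* nobody is asleep on the lock yet              */
    int cur = 1;        /* ... and was preempted, so process 1 runs next */
    int blocked = 0;

    assert(entries > 0);

    while (remaining[0] > 0 || remaining[1] > 0) {
        if (owner != cur) {
            if (owner == -1) {
                owner = cur;            /* lock is free: take it */
            } else {
                /* Held by the other process: go to sleep, switch to it. */
                waiter = cur;
                blocked++;
                cur = owner;
                continue;
            }
        }

        /* cur owns the lock: do the bufmgr work, then release. */
        remaining[cur]--;
        if (waiter != -1) {
            owner = waiter;             /* direct handoff to the waiter */
            waiter = -1;
        } else {
            owner = -1;                 /* nobody waiting: lock is free */
        }

        /* A process that has finished yields the CPU to the other one. */
        if (remaining[cur] == 0)
            cur = 1 - cur;
    }
    return blocked;
}
```

Under these assumptions, simulate_handoff(1000) reports 1999 blocked
acquisitions out of 2000 attempts: every entry except the very first one
costs a context swap.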

In previous versions, since BufMgrLock was only a spinlock, releasing it
did not cause ownership of the lock to be immediately transferred to
someone else.  Therefore, the releaser would be able to re-acquire the
lock if he wanted to do another bufmgr operation before his time quantum
expired.  This made for many fewer context swaps.
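
For contrast, a minimal test-and-set spinlock is sketched below using C11
atomics. It is not the 7.1 spinlock code, and the SpinLock type and spin_*
names are hypothetical. The point it illustrates is that release only clears a
flag: nothing records a next owner, so the releaser can take the lock again
right away.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal spinlock; stands in for the idea, not for s_lock.c. */
typedef struct {
    atomic_flag locked;
} SpinLock;

void spin_init(SpinLock *s)
{
    atomic_flag_clear(&s->locked);
}

/* Returns true if the lock was free and is now held by the caller. */
bool spin_try_acquire(SpinLock *s)
{
    return !atomic_flag_test_and_set_explicit(&s->locked,
                                              memory_order_acquire);
}

void spin_acquire(SpinLock *s)
{
    while (!spin_try_acquire(s))
        ;   /* busy-wait; real code would back off or sleep briefly */
}

/* Release just clears the flag: whoever tries next, wins. */
void spin_release(SpinLock *s)
{
    atomic_flag_clear_explicit(&s->locked, memory_order_release);
}
```

A process that calls spin_release() and then spin_acquire() again within the
same time quantum gets the lock back without blocking, even if others are
spinning on it.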

It would seem, therefore, that lwlock.c's behavior of immediately
granting the lock to released waiters is not such a good idea after all.
Perhaps we should release waiters but NOT grant them the lock; when they
get to run, they have to loop back, try to get the lock, and possibly go
back to sleep if they fail.  This apparent waste of cycles is actually
beneficial because it saves context swaps overall.
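
The wake-but-don't-grant idea could look roughly like the sketch below. It is
only an illustration under stated assumptions: it uses POSIX threads as a
stand-in for backends and their semaphores, handles exclusive mode only, and
the RetryLock type and retry_lock_* names are hypothetical rather than
anything in lwlock.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical exclusive-only lock; not the lwlock.c data structure. */
typedef struct {
    pthread_mutex_t mutex;   /* protects the fields below */
    pthread_cond_t  wakeup;  /* waiters sleep here */
    bool            held;    /* true while someone owns the lock */
    int             sleeps;  /* times any acquirer had to go to sleep */
} RetryLock;

void retry_lock_init(RetryLock *lock)
{
    pthread_mutex_init(&lock->mutex, NULL);
    pthread_cond_init(&lock->wakeup, NULL);
    lock->held = false;
    lock->sleeps = 0;
}

void retry_lock_acquire(RetryLock *lock)
{
    pthread_mutex_lock(&lock->mutex);
    /*
     * Being woken is only a hint that the lock was free a moment ago.
     * Loop back, re-test, and go back to sleep if someone else got it.
     */
    while (lock->held) {
        lock->sleeps++;
        pthread_cond_wait(&lock->wakeup, &lock->mutex);
    }
    lock->held = true;
    pthread_mutex_unlock(&lock->mutex);
}

void retry_lock_release(RetryLock *lock)
{
    pthread_mutex_lock(&lock->mutex);
    assert(lock->held);                  /* releasing an unheld lock is a bug */
    lock->held = false;                  /* lock is now free, NOT handed over */
    pthread_cond_signal(&lock->wakeup);  /* wake one waiter so it can retry */
    pthread_mutex_unlock(&lock->mutex);
}
```

With this shape, a releaser that comes straight back for the lock finds held
== false and proceeds without sleeping; the woken waiter pays for one extra
test and possibly one extra sleep.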

Comments?
        regards, tom lane

