Re: [pgsql-hackers] Daily digest v1.9430 (16 messages) - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: [pgsql-hackers] Daily digest v1.9430 (16 messages)
Msg-id f67928030908301009j470f1961me34aedc351cb4269@mail.gmail.com
List pgsql-hackers
---------- Forwarded message ----------
From: Greg Stark <gsstark@mit.edu>
To: Simon Riggs <simon@2ndquadrant.com>
Date: Sun, 30 Aug 2009 00:28:14 +0100
Subject: Re: LWLock Queue Jumping
On Fri, Aug 28, 2009 at 8:07 PM, Simon Riggs<simon@2ndquadrant.com> wrote:
> WALInsertLock is heavily contended and likely always will be even if we
> apply some of the planned fixes.

I've lost any earlier messages, could you resend the raw data on which
this is based?

> Some callers of WALInsertLock are more important than others
>
> * Writing new Clog or Multixact pages (serialized by ClogControlLock)
> * For Hot Standby, writing SnapshotData (serialized by ProcArrayLock)
>
> In these cases it seems like we can skip straight to the front of the
> WALInsertLock queue without problem.

How does re-ordering reduce the contention?


If you hold one contended lock while waiting in a FIFO queue for another contended lock, you have just made the first lock that much more contended.  Jumping the queue on WALInsertLock probably does not reduce contention on WALInsertLock itself, but it does prevent that contention from introducing derivative contention on the other locks which are already held while waiting (ProcArrayLock and ClogControlLock).
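
To make the arithmetic concrete, here is a minimal sketch of the argument as a toy model. It is not PostgreSQL code: the InnerQueue struct, both function names, and the arbitrary time units are made up for illustration, and it assumes the inner lock is free apart from the waiters already queued. The inner lock stands in for WALInsertLock and the outer lock for something like ClogControlLock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the queue on the inner lock (think WALInsertLock). */
typedef struct {
    const long *ahead_hold; /* hold times of waiters already queued ahead of us */
    size_t      n_ahead;    /* number of such waiters */
    long        own_hold;   /* time we ourselves need the inner lock */
} InnerQueue;

/*
 * Time the caller keeps the OUTER lock held while it waits for, and then
 * uses, the inner lock.  With FIFO ordering it sits behind everyone already
 * queued; with queue jumping it goes first.
 */
long outer_hold_time(const InnerQueue *q, int jump_queue)
{
    long wait = 0;

    assert(q != NULL);
    if (!jump_queue) {
        for (size_t i = 0; i < q->n_ahead; i++)
            wait += q->ahead_hold[i];
    }
    return wait + q->own_hold;
}

/*
 * Total time the INNER lock is busy serving everyone.  The queue order does
 * not appear here, which is the point: reordering does not shrink this.
 */
long inner_busy_time(const InnerQueue *q)
{
    long total;

    assert(q != NULL);
    total = q->own_hold;
    for (size_t i = 0; i < q->n_ahead; i++)
        total += q->ahead_hold[i];
    return total;
}
```

With three waiters ahead that each hold the inner lock for 5 units and a caller that needs it for 2, the outer lock is held for 17 units under FIFO and 2 units with queue jumping, while the inner lock is busy for 17 units either way.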
 

We reorder shared lockers
ahead of exclusive lockers because they can all hold the lock at the
same time so we can reduce the amount of time the lock is held.

We don't explicitly reorder shared lockers ahead of exclusive lockers.  "The reordering" works for either case.  A holder of a shared lock who drops the lock but then grabs it again before the awakened exclusive waiter has had a chance to grab it is "reordered", but so is the exclusive holder who drops a lock and then grabs it again before any of the awakened shared waiters have had a chance to grab it.  The primary point is not to reorder the locks, but to avoid excessive context switches.
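
The "release, then grab it again before the awakened waiter runs" case can be sketched as a toy state machine. This is only an illustration under simplifying assumptions, not the LWLock implementation: ToyLock and the toy_* functions are hypothetical names, only exclusive mode is modeled, and "woken" just records which waiter has been signalled but not yet scheduled.

```c
#include <stdbool.h>

/* Hypothetical toy lock: exclusive mode only, at most one woken waiter. */
typedef struct {
    bool held;   /* is the lock currently held? */
    int  owner;  /* id of the current holder, -1 if free */
    int  woken;  /* waiter that was woken but has not run yet, -1 if none */
} ToyLock;

/* Anyone who finds the lock free gets it, regardless of who is waiting. */
bool toy_try_acquire(ToyLock *l, int who)
{
    if (l->held)
        return false;
    l->held = true;
    l->owner = who;
    return true;
}

/* Release frees the lock and only wakes the next waiter; no hand-off. */
void toy_release(ToyLock *l, int next_waiter)
{
    l->held = false;
    l->owner = -1;
    l->woken = next_waiter;
}

/* The woken waiter finally gets scheduled and retries the acquire. */
bool toy_woken_retry(ToyLock *l)
{
    int who = l->woken;

    l->woken = -1;
    return who >= 0 && toy_try_acquire(l, who);
}
```

If process 1 releases, waking process 2, and then calls toy_try_acquire again before process 2 has run, process 1 gets the lock back and toy_woken_retry fails for process 2. Nothing chose to reorder them; the releasing process simply never had to be switched out.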

Reordering some exclusive lockers ahead of other exclusive lockers
won't reduce the amount of time the lock is held at all. Are you
saying the reason to do it is to reduce time spent waiting on this
lock while holding other critical locks? Do we have tools to measure
how much time is spent waiting on one lock while holding another lock,
so we can see if there's a problem and whether this helps?

I don't know of any formal tools for that.  I just add elog statements at strategic places and then mine the logfile.  You have to be careful that the time spent doing the logging doesn't distort the timings too much, but I usually haven't found that to be a problem.  I've toyed with changes to LWLOCK_STATS to help, but you have to focus on a few specific locks or else the overhead is generally too high, and if you are interested in only a handful of locks, adding elog statements seems easier.  But I was looking in other areas of the code, not the specific area under discussion here.
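
As a rough sketch of the log-and-mine approach, the helpers below compute a wait time from two timestamps, format one log line, and pull the number back out of such a line. Everything here is an assumption made for illustration: the function names, the "LOCKWAIT" line format, and the field names are invented, and this is standalone C rather than backend code.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Microseconds from start to end (end is assumed not earlier than start). */
long long elapsed_us(struct timespec start, struct timespec end)
{
    long long ns = (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
                 + (long long)(end.tv_nsec - start.tv_nsec);

    return ns / 1000LL;
}

/*
 * Format one record saying how long we waited for "wanted" while already
 * holding "held".  Returns what snprintf returns.
 */
int format_wait_line(char *buf, size_t len,
                     const char *held, const char *wanted, long long us)
{
    return snprintf(buf, len, "LOCKWAIT held=%s wanted=%s wait_us=%lld",
                    held, wanted, us);
}

/*
 * Mining side: extract wait_us from a line written by format_wait_line.
 * Returns 1 on success, 0 if the line carries no such field.
 */
int parse_wait_us(const char *line, long long *us)
{
    const char *p = strstr(line, "wait_us=");

    if (p == NULL)
        return 0;
    return sscanf(p, "wait_us=%lld", us) == 1;
}
```

In a backend one would presumably take the two timestamps immediately before and after the acquire call of interest and emit the formatted string through elog at LOG level. Restricting this to a handful of locks keeps the logging overhead from distorting the timings being measured.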

Jeff

