Thread: LWLock Queue Jumping

LWLock Queue Jumping

From
Simon Riggs
Date:
WALInsertLock is heavily contended and likely always will be even if we
apply some of the planned fixes.

Some callers of WALInsertLock are more important than others:

* Writing new Clog or Multixact pages (serialized by ClogControlLock)
* For Hot Standby, writing SnapshotData (serialized by ProcArrayLock)

In these cases it seems like we can skip straight to the front of the
WALInsertLock queue without problem.

Most other callers cannot be safely reordered; possibly no others can.

We already re-order the lock queues when we hold shared locks, so we
know in principle it is OK to do so. This is an extension of that
thought.
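In LWLock terms, queue jumping just means inserting a priority waiter at the head of the lock's wait queue instead of at the tail. A minimal sketch of the idea in C (hypothetical names and a plain linked list; the real wait queue in lwlock.c is threaded through shared PGPROC entries, not heap-allocated nodes):

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of a wait queue with "queue jumping": ordinary waiters are
 * appended at the tail (FIFO), priority waiters are pushed at the head
 * so they are granted the lock first at the next release. */
typedef struct Waiter
{
    int pid;
    struct Waiter *next;
} Waiter;

typedef struct
{
    Waiter *head;
    Waiter *tail;
} WaitQueue;

static void
enqueue(WaitQueue *q, Waiter *w, int jump)
{
    w->next = NULL;
    if (q->head == NULL)
    {
        /* Empty queue: position is irrelevant. */
        q->head = q->tail = w;
    }
    else if (jump)
    {
        /* Queue jump: insert ahead of all current waiters. */
        w->next = q->head;
        q->head = w;
    }
    else
    {
        /* Normal case: FIFO, append at the tail. */
        q->tail->next = w;
        q->tail = w;
    }
}

static Waiter *
dequeue(WaitQueue *q)
{
    Waiter *w = q->head;

    if (w)
    {
        q->head = w->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    return w;
}
```

A jumped waiter is woken first at the next release; everyone already queued simply waits one grant longer, which is why this is only safe for the rare, order-independent record types listed above.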

Implementing this would do much to remove my objection to performance
issues associated with simplifying the Hot Standby patch, as recently
suggested by Heikki.

Possible? If so, we can discuss implementation. No worries if not, but
just a side thought that may be fruitful.

-- Simon Riggs           www.2ndQuadrant.com



Re: LWLock Queue Jumping

From
Greg Stark
Date:
On Fri, Aug 28, 2009 at 8:07 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> WALInsertLock is heavily contended and likely always will be even if we
> apply some of the planned fixes.

I've lost any earlier messages, could you resend the raw data on which
this is based?

> Some callers of WALInsertLock are more important than others
>
> * Writing new Clog or Multixact pages (serialized by ClogControlLock)
> * For Hot Standby, writing SnapshotData (serialized by ProcArrayLock)
>
> In these cases it seems like we can skip straight to the front of the
> WALInsertLock queue without problem.

How does re-ordering reduce the contention? We reorder shared lockers
ahead of exclusive lockers because they can all hold the lock at the
same time so we can reduce the amount of time the lock is held.

Reordering some exclusive lockers ahead of other exclusive lockers
won't reduce the amount of time the lock is held at all. Are you
saying the reason to do it is to reduce time spent waiting on this
lock while holding other critical locks? Do we have tools to measure
how much time is spent waiting on one lock while holding another, so
we can see whether there's a problem and whether this helps?

-- 
greg
http://mit.edu/~gsstark/resume.pdf


Re: LWLock Queue Jumping

From
Heikki Linnakangas
Date:
Greg Stark wrote:
> On Fri, Aug 28, 2009 at 8:07 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> WALInsertLock is heavily contended and likely always will be even if we
>> apply some of the planned fixes.
> 
> I've lost any earlier messages, could you resend the raw data on which
> this is based?

I don't have any pointers right now, but WALInsertLock does often show
up as a bottleneck in write-intensive benchmarks.

>> Some callers of WALInsertLock are more important than others
>>
>> * Writing new Clog or Multixact pages (serialized by ClogControlLock)
>> * For Hot Standby, writing SnapshotData (serialized by ProcArrayLock)
>>
>> In these cases it seems like we can skip straight to the front of the
>> WALInsertLock queue without problem.
> 
> Reordering some exclusive lockers ahead of other exclusive lockers
> won't reduce the amount of time the lock is held at all. Are you
> saying the reason to do it is to reduce time spent waiting on this
> lock while holding other critical locks?

That's what I thought. I don't know about the clog/multixact issue; it
doesn't seem like it would be too bad, given how seldom new clog or
multixact pages are written.

The Hot Standby thing has been discussed, but no-one has actually posted
a patch which does the locking correctly, where the ProcArrayLock is
held while the SnapshotData WAL record is inserted. So there is no
evidence that it's actually a problem; we might be making a mountain out
of a molehill. It will have practically no effect on throughput, given
how seldom SnapshotData records are written (once per checkpoint), but
if it causes a significant bump to response times, that might be a problem.

This is a good idea to keep in mind, but right now it feels like a
solution in search of a problem.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com


Re: LWLock Queue Jumping

From
Stefan Kaltenbrunner
Date:
Heikki Linnakangas wrote:
> Greg Stark wrote:
>> On Fri, Aug 28, 2009 at 8:07 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> WALInsertLock is heavily contended and likely always will be even if we
>>> apply some of the planned fixes.
>> I've lost any earlier messages, could you resend the raw data on which
>> this is based?
> 
> I don't have any pointers right now, but WALInsertLock does often show
> up as a bottleneck in write-intensive benchmarks.

Yeah, I recently ran across that issue when testing concurrent COPY
performance:

http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html
discussed here:

http://archives.postgresql.org/pgsql-hackers/2009-06/msg01019.php


and (iirc) also here:

http://archives.postgresql.org/pgsql-hackers/2009-06/msg01133.php


However, the general issue is easily visible in almost any
write-intensive concurrent workload on a fast I/O subsystem (e.g.
pgbench, sysbench, ...).


Stefan


Re: LWLock Queue Jumping

From
Simon Riggs
Date:
On Sun, 2009-08-30 at 09:03 +0300, Heikki Linnakangas wrote:

> The Hot Standby thing has been discussed, but no-one has actually posted
> a patch which does the locking correctly, where the ProcArrayLock is
> held while the SnapshotData WAL record is inserted. So there is no
> evidence that it's actually a problem, we might be making a mountain out
> of a molehill. It will have practically no effect on throughput, given
> how seldom SnapshotData records are written (once per checkpoint), but
> if it causes a significant bump to response times, that might be a problem.
> 
> This is a good idea to keep in mind, but right now it feels like a
> solution in search of a problem.

The most important thing is to get HS committed, and to do that I think
it is important that I show I am willing to respond to review comments.
So I will implement it the way you propose and defer any further
discussion about lock contention. The idea here is a simple fix, and it
is easy enough to return to later if we need it.

-- Simon Riggs           www.2ndQuadrant.com