Thread: LWLock Queue Jumping

LWLock Queue Jumping

From: Jeff Janes
---------- Forwarded message ----------
From: Simon Riggs <simon@2ndQuadrant.com>
To: pgsql-hackers <pgsql-hackers@postgresql.org>
Date: Fri, 28 Aug 2009 20:07:32 +0100
Subject: LWLock Queue Jumping

WALInsertLock is heavily contended and likely always will be even if we
apply some of the planned fixes.

Some callers of WALInsertLock are more important than others:

* Writing new Clog or Multixact pages (serialized by ClogControlLock)
* For Hot Standby, writing SnapshotData (serialized by ProcArrayLock)

In these cases it seems like we can skip straight to the front of the
WALInsertLock queue without problem.

Most other items cannot be safely reordered; possibly no other items can.

We already re-order the lock queues when we hold shared locks, so we
know in principle it is OK to do so. This is an extension of that
thought.

Implementing this would do much to remove my objection to performance
issues associated with simplifying the Hot Standby patch, as recently
suggested by Heikki.

Possible? If so, we can discuss implementation. No worries if not, but
just a side thought that may be fruitful.

I'd previously implemented this just by copying and pasting and making some changes; perhaps not the most desirable way, but I thought adding another parameter to all existing invocations would be a bit excessive.  The attached patch converts the existing LWLockAcquire into LWLockAcquire_head, rather than adding a new function.  Sorry if that is not the optimal way to send this; I wanted to make it easy to see just the changes, even though the functions aren't technically the same thing anymore.
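(For readers without the attachment, the essence of such a head-jumping variant is small.  Below is a minimal standalone sketch, loosely modeled on the 8.4-era wait-queue fields in lwlock.c; the type and function names here are illustrative, not the attached patch itself.)

    /* Minimal model of the lwlock.c wait list: a singly-linked
     * queue of waiting backends with head and tail pointers. */
    #include <stddef.h>

    typedef struct Proc
    {
        struct Proc *lwWaitLink;    /* next waiter in the queue */
    } Proc;

    typedef struct Lock
    {
        Proc   *head;               /* first waiter; woken first */
        Proc   *tail;               /* last waiter */
    } Lock;

    /* Normal LWLockAcquire behaviour: sleepers join at the tail. */
    static void
    enqueue_tail(Lock *lock, Proc *proc)
    {
        proc->lwWaitLink = NULL;
        if (lock->head == NULL)
            lock->head = proc;
        else
            lock->tail->lwWaitLink = proc;
        lock->tail = proc;
    }

    /* Queue jumping: a priority caller links in at the head, so it
     * is the first waiter woken when the lock is released. */
    static void
    enqueue_head(Lock *lock, Proc *proc)
    {
        proc->lwWaitLink = lock->head;
        if (lock->head == NULL)
            lock->tail = proc;
        lock->head = proc;
    }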

I've tested it fairly thoroughly, in the context of using it in AdvanceXLInsertBuffer for acquiring the WALWriteLock.

Jeff

Attachment

Re: LWLock Queue Jumping

From: Simon Riggs
On Fri, 2009-08-28 at 14:44 -0700, Jeff Janes wrote:

> I'd previously implemented this just by copying and pasting and making
> some changes, perhaps not the most desirable way but I thought adding
> another parameter to all existing invocations would be a bit
> excessive.

That's the way I would implement it also, but would call it
LWLockAcquireWithPriority() so that its purpose is clear, rather than
referring to its implementation, which may change.

> I've tested it fairly thoroughly, 

Please send the tested patch, if this isn't it. What tests were made?

> in the context of using it in AdvanceXLInsertBuffer for acquiring the
> WALWriteLock.

Apologies if you'd already suggested that recently. I read a few of your
posts but not all of them. 

I don't think WALWriteLock from AdvanceXLInsertBuffer is an important
area, but I don't see any harm from it either.

-- Simon Riggs           www.2ndQuadrant.com



Re: LWLock Queue Jumping

From: Jeff Janes
On Sat, Aug 29, 2009 at 4:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

> On Fri, 2009-08-28 at 14:44 -0700, Jeff Janes wrote:
>
>> I'd previously implemented this just by copying and pasting and making
>> some changes, perhaps not the most desirable way but I thought adding
>> another parameter to all existing invocations would be a bit
>> excessive.
>
> That's the way I would implement it also, but would call it
> LWLockAcquireWithPriority() so that its purpose is clear, rather than
> referring to its implementation, which may change.


Yes, good idea.  But thinking about it as a patch to be applied, rather than a proof of concept, I think the best solution would be to add a third argument (boolean Priority) to LWLockAcquire, then hunt down all existing invocations and change them to pass false as the extra argument.  Copying 160 lines of code to change 4 of them in the copy is temporarily easier, but not a good solution for the long term.
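(A sketch of that change against the 8.4-era declaration in src/include/storage/lwlock.h; the third parameter is the proposed addition, with the name following the suggestion above:)

    /* Today (8.4 era):
     *     extern void LWLockAcquire(LWLockId lockid, LWLockMode mode);
     * Proposed: every existing call site gains an explicit false,
     *     LWLockAcquire(WALInsertLock, LW_EXCLUSIVE, false);
     * while a path that may jump the queue passes true,
     *     LWLockAcquire(WALWriteLock, LW_EXCLUSIVE, true);
     */
    extern void LWLockAcquire(LWLockId lockid, LWLockMode mode, bool priority);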


>> I've tested it fairly thoroughly,
>
> Please send the tested patch, if this isn't it. What tests were made?

I'd have a hard time coming up with the full original patch, as my changes for files other than lwlock.c were blown away by parallel efforts and an rsync to the repo.  The above was just an exploratory tool, not proposed as an actual patch to be applied to HEAD.  If we want to add a parameter to the existing LWLockAcquire, I'll work on coming up with a tested patch for that.  My testing was to run the normal regression test (which often failed to detect my early buggy implementations), then load testing with pgbench (which always found them, as far as I know, when -c > 1) and a custom Perl script I use.  Since WALWriteLock is heavily used and contended under pgbench -c 20, and lwlock.c is agnostic to the exact identity of the underlying lock, I think this test was pretty thorough for the implementation.  But not, of course, for starvation issues, which would have to be tested on a case-by-case basis whenever a specific LWLockAcquire invocation is changed to be high priority.
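(For concreteness, a run of the kind described would look something like this; the scale factor and transaction count are illustrative, not the exact values used:)

    pgbench -i -s 10 testdb        # initialize test tables
    pgbench -c 20 -t 5000 testdb   # 20 concurrent clients, WAL-heavy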

If you have ideas for other tests to do, or corner cases that are likely to be overlooked by my tests, I'll try to work tests for them in too.
 


>> in the context of using it in AdvanceXLInsertBuffer for acquiring the
>> WALWriteLock.
>
> Apologies if you'd already suggested that recently. I read a few of your
> posts but not all of them.
>
> I don't think WALWriteLock from AdvanceXLInsertBuffer is an important
> area, but I don't see any harm from it either.


I had not mentioned it before.  The change helped by roughly 50% when wal_buffers was undersized (kept at the default setting) but did not help significantly when wal_buffers was generously sized.  I didn't think we would be interested in introducing a new locking procedure just to optimize performance for a poorly configured server, but if we are to introduce this method for other reasons, I think it should be used for AdvanceXLInsertBuffer as well.
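(For anyone reproducing this: "the default setting" in the 8.4 era means wal_buffers = 64kB, i.e. eight 8kB pages.  A generously sized postgresql.conf entry would be something like the following; the exact value is illustrative:)

    wal_buffers = 1MB    # 8.4 default is 64kB (8 x 8kB pages)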
 
Cheers,

Jeff

Re: LWLock Queue Jumping

From: Jeff Janes
---------- Forwarded message ----------
From: Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>
To: Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>
Date: Sun, 30 Aug 2009 11:48:47 +0200
Subject: Re: LWLock Queue Jumping
Heikki Linnakangas wrote:
> Greg Stark wrote:
>> On Fri, Aug 28, 2009 at 8:07 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> WALInsertLock is heavily contended and likely always will be even if we
>>> apply some of the planned fixes.
>> I've lost any earlier messages, could you resend the raw data on which
>> this is based?
>
> I don't have any pointers right now, but WALInsertLock does often show
> up as a bottleneck in write-intensive benchmarks.

yeah, I recently ran across that issue when testing concurrent COPY performance:

http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html
discussed here:

http://archives.postgresql.org/pgsql-hackers/2009-06/msg01019.php


It looks like this is the bulk loading of data into unindexed tables.  How good is that as a target for optimization?  I can see several (quite difficult to code and maintain) ways to make bulk loading into unindexed tables faster, but they would not speed up the more general cases. 
 
and (iirc) also here:

http://archives.postgresql.org/pgsql-hackers/2009-06/msg01133.php


I played around a little with this: parallel bulk loads into an unindexed, very skinny table.  If I hacked XLogInsert so that it did nothing but take the WALInsertLock, release it, and then return a fake RecPtr, it scaled better but still not very well.  So giant leaps in throughput would need to involve calling XLogInsert less often (or at least taking the WALInsertLock less often).  You could nibble around the edges by tweaking what happens under the WALInsertLock, but I don't think that will get you big wins for this case.  But again, how important is this case?  Are bulk loads into skinny unindexed tables the best test-bed for improving XLogInsert?
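(The hack in question, roughly; reconstructed from memory against the 8.4-era signature, where XLogRecPtr is a {xlogid, xrecoff} struct.  A sketch, not the exact code that was run:)

    /* Gutted XLogInsert: take and release WALInsertLock, then hand
     * back a fabricated but valid-looking record pointer, skipping
     * all of the real record assembly and buffer copying. */
    XLogRecPtr
    XLogInsert(RmgrId rmid, uint8 info, XLogRecData *rdata)
    {
        XLogRecPtr  fakeRecPtr;

        LWLockAcquire(WALInsertLock, LW_EXCLUSIVE);
        LWLockRelease(WALInsertLock);

        fakeRecPtr.xlogid = 0;
        fakeRecPtr.xrecoff = SizeOfXLogLongPHD;  /* any plausible offset */
        return fakeRecPtr;
    }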

(Sorry, I think I forgot to change the subject on the previous message.  Digests are great if you only read, but to contribute I guess I have to switch to receiving each message.)

Jeff

Re: LWLock Queue Jumping

From: Stefan Kaltenbrunner
Jeff Janes wrote:
>     ---------- Forwarded message ----------
>     From: Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>
>     To: Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>
>     Date: Sun, 30 Aug 2009 11:48:47 +0200
>     Subject: Re: LWLock Queue Jumping
>     Heikki Linnakangas wrote:
> 
>         Greg Stark wrote:
> 
>             On Fri, Aug 28, 2009 at 8:07 PM, Simon
>             Riggs<simon@2ndquadrant.com <mailto:simon@2ndquadrant.com>>
>             wrote:
> 
>                 WALInsertLock is heavily contended and likely always
>                 will be even if we
>                 apply some of the planned fixes.
> 
>             I've lost any earlier messages, could you resend the raw
>             data on which
>             this is based?
> 
> 
>         I don't have any pointers right now, but WALInsertLock does
>         often show
>         up as a bottleneck in write-intensive benchmarks.
> 
> 
>     yeah, I recently ran across that issue when testing concurrent COPY
>     performance:
> 
>     http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html
>     discussed here:
> 
>     http://archives.postgresql.org/pgsql-hackers/2009-06/msg01019.php
> 
> 
> 
> It looks like this is the bulk loading of data into unindexed tables.  
> How good is that as a target for optimization?  I can see several (quite 
> difficult to code and maintain) ways to make bulk loading into unindexed 
> tables faster, but they would not speed up the more general cases. 

well, bulk loading into unindexed tables is quite a common workload - 
apart from dump/restore cycles (which we can now do in parallel) a lot 
of analytic workloads are that way.
Import tons of data from various sources every night/week/month; index, 
analyze & aggregate, drop again.


>  
> 
>     and (iirc) also here:
> 
>     http://archives.postgresql.org/pgsql-hackers/2009-06/msg01133.php
> 
> 
> 
> I played around a little with this: parallel bulk loads into an 
> unindexed, very skinny table.  If I hacked XLogInsert so that it did 
> nothing but take the WALInsertLock, release it, and then return a fake 
> RecPtr, it scaled better but still not very well.  So giant leaps in 
> throughput would need to involve calling XLogInsert less often (or at 
> least taking the WALInsertLock less often).  You could nibble around the 
> edges by tweaking what happens under the WALInsertLock, but I don't 
> think that will get you big wins for this case.  But again, how 
> important is this case?  Are bulk loads into skinny unindexed tables 
> the best test-bed for improving XLogInsert?

well, you can get similar-looking profiles from other workloads (say 
pgbench) as well. Pretty sure the archives have examples of those too.


Stefan


Re: LWLock Queue Jumping

From: Jeff Janes
On Sun, Aug 30, 2009 at 11:01 AM, Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
> Jeff Janes wrote:
>>    ---------- Forwarded message ----------
>>    From: Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>
>>    To: Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>
>>    Date: Sun, 30 Aug 2009 11:48:47 +0200
>>    Subject: Re: LWLock Queue Jumping
>>    Heikki Linnakangas wrote:
>>
>>>    I don't have any pointers right now, but WALInsertLock does
>>>    often show
>>>    up as a bottleneck in write-intensive benchmarks.
>>
>>    yeah, I recently ran across that issue when testing concurrent COPY
>>    performance:
>>
>>    http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html
>>    discussed here:
>>
>>    http://archives.postgresql.org/pgsql-hackers/2009-06/msg01019.php
>>
>> It looks like this is the bulk loading of data into unindexed tables.  How good is that as a target for optimization?  I can see several (quite difficult to code and maintain) ways to make bulk loading into unindexed tables faster, but they would not speed up the more general cases.
>
> well, bulk loading into unindexed tables is quite a common workload - apart from dump/restore cycles (which we can now do in parallel) a lot of analytic workloads are that way.
> Import tons of data from various sources every night/week/month; index, analyze & aggregate, drop again.

In those cases where you end by dropping the tables, we should be willing to bypass WAL altogether, right?  Is the problem that we can bypass WAL (by doing the COPY in the same transaction that created or truncated the table), or that we can COPY in parallel, but we can't do both simultaneously?
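(The bypass pattern I mean, for concreteness; with WAL archiving off, the COPY below can skip WAL because a crash would leave no visible table anyway.  Table and file names are illustrative:)

    BEGIN;
    CREATE TABLE import_raw (LIKE target_table);  -- created in this transaction
    COPY import_raw FROM '/tmp/chunk1.dat';       -- may skip WAL entirely
    COMMIT;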


Jeff

Re: LWLock Queue Jumping

From: Stefan Kaltenbrunner
Jeff Janes wrote:
> On Sun, Aug 30, 2009 at 11:01 AM, Stefan Kaltenbrunner 
> <stefan@kaltenbrunner.cc> wrote:
> 
>     Jeff Janes wrote:
> 
>            ---------- Forwarded message ----------
>            From: Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>
>            To: Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>
>            Date: Sun, 30 Aug 2009 11:48:47 +0200
>            Subject: Re: LWLock Queue Jumping
>            Heikki Linnakangas wrote:
> 
> 
>                I don't have any pointers right now, but WALInsertLock does
>                often show
>                up as a bottleneck in write-intensive benchmarks.
> 
> 
>            yeah, I recently ran across that issue when testing
>            concurrent COPY performance:
> 
>            http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html
>            discussed here:
> 
>            http://archives.postgresql.org/pgsql-hackers/2009-06/msg01019.php
> 
> 
>         It looks like this is the bulk loading of data into unindexed
>         tables.  How good is that as a target for optimization?  I can
>         see several (quite difficult to code and maintain) ways to make
>         bulk loading into unindexed tables faster, but they would not
>         speed up the more general cases.
> 
> 
>     well, bulk loading into unindexed tables is quite a common workload -
>     apart from dump/restore cycles (which we can now do in parallel) a
>     lot of analytic workloads are that way.
>     Import tons of data from various sources every night/week/month;
>     index, analyze & aggregate, drop again.
> 
> 
> In those cases where you end by dropping the tables, we should be 
> willing to bypass WAL altogether, right?  Is the problem that we can 
> bypass WAL (by doing the COPY in the same transaction that created or 
> truncated the table), or that we can COPY in parallel, but we can't do 
> both simultaneously?

well, yes, that is part of the problem - if you bulk load into one or a 
few tables concurrently, you can only sometimes make use of the WAL 
bypass optimization. This is especially interesting if you consider that 
COPY alone is more or less CPU-bottlenecked these days, so using 
multiple cores makes sense to get higher load rates.


Stefan