Re: Proposal of tunable fix for scalability of 8.4 - Mailing list pgsql-performance

From Simon Riggs
Subject Re: Proposal of tunable fix for scalability of 8.4
Msg-id 1237362518.3953.181.camel@ebony.2ndQuadrant
In response to Re: Proposal of tunable fix for scalability of 8.4  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On Sat, 2009-03-14 at 12:09 -0400, Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> > WALInsertLock is also quite high on Jignesh's list. That I've seen
> > become the bottleneck on other tests too.
>
> Yeah, that's been seen to be an issue before.  I had the germ of an idea
> about how to fix that:
>
>     ... with no lock, determine size of WAL record ...
>     obtain WALInsertLock
>     identify WAL start address of my record, advance insert pointer
>         past record end
>     *release* WALInsertLock
>     without lock, copy record into the space just reserved
>
> The idea here is to allow parallelization of the copying of data into
> the buffers.  The hold time on WALInsertLock would be very short.  Maybe
> it could even become a spinlock, though I'm not sure, because the
> "advance insert pointer" bit is more complicated than it looks (you have
> to allow for the extra overhead when crossing a WAL page boundary).
>
> Now the fly in the ointment is that there would need to be some way to
> ensure that we didn't write data out to disk until it was valid; in
> particular how do we implement a request to flush WAL up to a particular
> LSN value, when maybe some of the records before that haven't been fully
> transferred into the buffers yet?  The best idea I've thought of so far
> is shared/exclusive locks on the individual WAL buffer pages, with the
> rather unusual behavior that writers of the page would take shared lock
> and only the reader (he who has to dump to disk) would take exclusive
> lock.  But maybe there's a better way.  Currently I don't believe that
> dumping a WAL buffer (WALWriteLock) blocks insertion of new WAL data,
> and it would be nice to preserve that property.

Yeh, that's just what we'd discussed previously:
http://markmail.org/message/gectqy3yzvjs2hru#query:Reworking%20WAL%20locking+page:1+mid:gectqy3yzvjs2hru+state:results

Are you thinking of doing this for 8.4? :-)
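
For anyone skimming the thread, here is a rough toy model of the reserve-then-copy scheme quoted above. It is only a sketch under stated assumptions, not PostgreSQL code: the class `WalBuffer` and its members are hypothetical names, a Python `threading.Lock` stands in for WALInsertLock, and WAL page-boundary overhead is ignored entirely. To answer "how far can we safely flush?", it tracks unfinished reservations in a set, which is a simplification swapped in for the shared/exclusive page locks suggested in the quote (and it costs a second, short lock acquisition per insert).

```python
import threading


class WalBuffer:
    """Toy model of reserve-then-copy WAL insertion (hypothetical, not PostgreSQL)."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.insert_pos = 0                  # the "insert pointer"
        self.insert_lock = threading.Lock()  # stands in for WALInsertLock
        self.in_flight = set()               # start offsets reserved but not yet copied

    def insert(self, record):
        # The record size is determined before any lock is taken.
        n = len(record)
        # Short critical section: reserve space and advance the insert pointer.
        with self.insert_lock:
            start = self.insert_pos
            if start + n > len(self.buf):
                raise BufferError("toy buffer is full")
            self.insert_pos = start + n
            self.in_flight.add(start)
        # Lock released: copy into the reserved space alongside other inserters.
        self.buf[start:start + n] = record
        # Assumption of this sketch: mark the copy finished under the same lock.
        with self.insert_lock:
            self.in_flight.discard(start)
        return start

    def flushable_upto(self):
        # Everything before the oldest unfinished copy is valid and may be written.
        with self.insert_lock:
            return min(self.in_flight, default=self.insert_pos)


if __name__ == "__main__":
    wal = WalBuffer(64)
    workers = [
        threading.Thread(target=wal.insert, args=(bytes([i]) * 8,))
        for i in range(1, 5)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(wal.insert_pos, wal.flushable_upto())
```

The point of the sketch is only that the lock is held for pointer arithmetic, not for the data copy; a flusher would write no further than `flushable_upto()`.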

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

