Re: WALInsertLock contention - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: WALInsertLock contention
Msg-id BANLkTi=HpHuVi1sQbAVC8Bpv1rpo-BzAMA@mail.gmail.com
In response to Re: WALInsertLock contention  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: WALInsertLock contention
List pgsql-hackers
On Wed, Jun 8, 2011 at 10:21 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 8, 2011 at 6:49 PM, Jim Nasby <jim@nasby.net> wrote:
>>> If backend A needs to evict a buffer with a fake LSN, it can go find
>>> the WAL that needs to be serialized, do that, flush WAL, and then
>>> evict the buffer.
>>
>> Isn't the only time that you'd need to evict if you ran out of buffers?
>
> Sure, but that happens all the time.  See pg_stat_bgwriter.buffers_backend.
>
>> If the buffer was truly private, would that still be an issue?
>
> I'm not sure if you mean make the buffer private or make the WAL
> storage arena private, but I'm pretty well convinced that neither one
> can work.

You're probably right.  Still, I think there is enough hypothetical
upside to the private buffer case that it should be attempted just to
see what breaks.  The major tricky bit is dealing with the new
pin/unpin mechanics.  I'd like to give it the 'college try'.  (Being
typically vain and attention-seeking, this is right up my alley.) :-D

>> Perhaps the only way to make that work is multiple WAL streams, as was originally suggested...

If this were an easy way out, all high-performance file systems would
have multiple journals that you could write to concurrently (which
they don't, afaik).

> Maybe...  but I hope not.  I just found an academic paper on this
> subject, about which I will post shortly.

I'm thinking that as long as your transactions have to be rigidly
ordered, you have a fundamental bottleneck you can't really work
around.  One way to maybe get around this is to try to work out on
the fly whether transaction 'A' is functionally independent of
transaction 'B' -- maybe then you could write them concurrently to
pre-allocated space on the log, or to separate logs maintained for
that purpose.  Good luck with that...even if you could somehow get it
to work, you would still have degenerate cases (like, 99% of real
world cases) to contend with.

Another thing you could try is to keep separate logs for rigidly
ordered data (commits, xlog switch, etc.) and non-rigidly ordered data
(everything else).  On the non-rigidly ordered side, you can
pre-allocate log space and write to it.  This is more or less a third
potential route (#1 and #2 being the shared/private buffer approaches)
of leveraging the fact that some of the data does not have to be
rigidly ordered.  Ultimately, though, even that could only get you so
far: it incurs other costs, and even contention on the lock for
inserting the commit records could start to bottleneck you.
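
To make the split concrete, here is a toy sketch in Python.  It is not
PostgreSQL code, and every name in it (TwoLaneLog, write_unordered,
write_commit, worker) is made up for illustration.  It assumes that
reserving space in the pre-allocated area only needs a very short
critical section (bumping an offset), so the actual copy can happen
outside any lock, while commit-like records still go through a single
serializing lock.

```python
import threading


class TwoLaneLog:
    """Toy model: one strictly ordered lane, one pre-allocated unordered lane."""

    def __init__(self, unordered_capacity):
        self.ordered = []                     # commit-like records, serialized
        self.ordered_lock = threading.Lock()  # stand-in for the insert lock
        self.unordered = bytearray(unordered_capacity)  # pre-allocated space
        self.reserve_lock = threading.Lock()  # held only while bumping the offset
        self.next_free = 0

    def reserve(self, size):
        """Claim a private byte range in the unordered lane; return its start."""
        with self.reserve_lock:
            start = self.next_free
            if start + size > len(self.unordered):
                raise MemoryError("unordered lane is full")
            self.next_free = start + size
        return start

    def write_unordered(self, payload):
        """Reserve space, then copy with no lock held (ranges never overlap)."""
        start = self.reserve(len(payload))
        self.unordered[start:start + len(payload)] = payload
        return start

    def write_commit(self, xid):
        """Rigidly ordered record: every writer queues on the same lock."""
        with self.ordered_lock:
            self.ordered.append(xid)
            return len(self.ordered) - 1


def worker(log, xid, n_records):
    """Hypothetical backend: n_records unordered writes, then one commit."""
    for _ in range(n_records):
        log.write_unordered(bytes([xid]) * 8)
    log.write_commit(xid)
```

Even in this toy, every worker still ends up queuing on ordered_lock
for its commit record, which is the residual bottleneck described
above.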

If something useful squirts out of academia, I'd sure like to see it :-).

merlin