Re: lwlocks and starvation - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: lwlocks and starvation
Date
Msg-id 1101371321.4179.711.camel@localhost.localdomain
In response to Re: lwlocks and starvation  (Neil Conway <neilc@samurai.com>)
Responses Re: lwlocks and starvation
List pgsql-hackers
On Wed, 2004-11-24 at 12:52, Neil Conway wrote:
> Bruce Momjian wrote:
> > I thought the new readers will sit after the writer in the FIFO queue so
> > the writer will not starve.
> 
> AFAICS, that is not the case. See lwlock.c, circa line 264: in LW_SHARED 
> mode, we check if "exclusive" is zero; if so, we acquire the lock 
> (increment the shared lock count and do not block). And "exclusive" is 
> set non-zero only when we _acquire_ a lock in exclusive mode, not when 
> we add an exclusive waiter to the wait queue.

Wow...well spotted.

That could explain many recent performance results. 
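
To make the starvation scenario concrete, here is a toy Python model of the
acquire-side check Neil describes. It is only a sketch: ToyLWLock, its fields
and try_acquire are hypothetical names chosen for illustration, not the code
in lwlock.c.

```python
from collections import deque


class ToyLWLock:
    """Toy model of the acquire check described above (not lwlock.c)."""

    def __init__(self):
        self.exclusive = 0      # non-zero only while the lock is HELD exclusively
        self.shared = 0         # number of current shared holders
        self.waiters = deque()  # queued (name, mode) pairs, in arrival order

    def try_acquire(self, name, mode):
        if mode == "shared":
            # Only the held-exclusive flag is checked; an exclusive
            # waiter already sitting in the queue is ignored.
            if self.exclusive == 0:
                self.shared += 1
                return True
        elif self.exclusive == 0 and self.shared == 0:
            self.exclusive = 1
            return True
        self.waiters.append((name, mode))
        return False

    def release_shared(self):
        self.shared -= 1


lock = ToyLWLock()
lock.try_acquire("r1", "shared")
writer_got_lock = lock.try_acquire("w1", "exclusive")    # queued behind r1
late_reader_got_lock = lock.try_acquire("r2", "shared")  # jumps past w1
lock.release_shared()  # r1 leaves, but r2 still holds the lock
```

In this model the writer is still queued after r1 releases, because r2 was
admitted after the writer started waiting. A steady stream of readers would
keep it there indefinitely.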

On Wed, 2004-11-24 at 08:23, Neil Conway wrote: 
> LWLockRelease() currently does something like (simplifying a lot):
> 
>     acquire lwlock spinlock
>     decrement lock count
>     if lock is free
>       if first waiter in queue is waiting for exclusive lock,
>       awaken him; else, walk through the queue and awaken
>       all the shared waiters until we reach an exclusive waiter
>     end if
>     release lwlock spinlock
> 
> This has the nice property that locks are granted in FIFO order. Is it
> essential that we maintain that property? If not, we could instead walk
> through the wait queue and awaken *all* the shared waiters, and get a
> small improvement in throughput.
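
The two release policies in the quoted pseudocode can be compared with a
small Python sketch. The function name, the (name, mode) tuples and the
wake_all_shared flag are assumptions made for illustration. I have also
assumed that an exclusive waiter at the head of the queue is still woken
alone under the proposed variant, which the quoted text does not spell out.

```python
def waiters_to_wake(queue, wake_all_shared=False):
    """Return the names to wake once the lock is free.

    queue is a list of (name, mode) pairs in arrival order.
    wake_all_shared=False models the FIFO behaviour in the pseudocode;
    wake_all_shared=True models the proposed variant.
    """
    if not queue:
        return []
    if queue[0][1] == "exclusive":
        # An exclusive waiter at the head is woken on its own.
        return [queue[0][0]]
    woken = []
    for name, mode in queue:
        if mode == "exclusive":
            if not wake_all_shared:
                break      # FIFO: stop at the first exclusive waiter
            continue       # variant: skip it and keep scanning
        woken.append(name)
    return woken


queue = [("r1", "shared"), ("r2", "shared"),
         ("w1", "exclusive"), ("r3", "shared")]
fifo_wake = waiters_to_wake(queue)
all_shared_wake = waiters_to_wake(queue, wake_all_shared=True)
```

With this queue the FIFO policy wakes r1 and r2 only, while the variant also
wakes r3, which arrived after the exclusive waiter w1.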

I'd also been thinking about lock release order, since it could be
related to the context-switch (CS) storms observed earlier and the
apparent lock-step behaviour commented upon previously. FIFO is the
easiest order to predict theoretically, but others are possible. ISTM
that waking shared waiters in xid order would bring the most benefit and
minimise any data issues. Readers waiting behind an exclusive waiter,
where the reader has a lower xid, might reasonably be woken without a
problem, since they will never see the changes made by the exclusive
waiter anyway. To limit the complexity, that probably needs to happen
within a limited window of inspection beyond the exclusive waiter, say
4-8 places.
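
A rough Python sketch of that windowed, xid-ordered wake-up follows, purely to
pin down what I mean. The function name, the (name, mode, xid) tuples and the
default window of 4 are illustrative assumptions. xids are compared as plain
integers here, which ignores xid wraparound.

```python
def shared_waiters_to_wake(queue, window=4):
    """Pick waiters to wake from (name, mode, xid) tuples in arrival order.

    Shared waiters ahead of the first exclusive waiter are always chosen.
    Up to `window` places beyond that exclusive waiter are then inspected,
    and shared waiters with a lower xid than the exclusive waiter are
    chosen too. The result is returned in xid order.
    """
    if not queue:
        return []
    if queue[0][1] == "exclusive":
        # An exclusive waiter at the head is woken on its own.
        return [queue[0][0]]
    chosen = []
    for i, (name, mode, xid) in enumerate(queue):
        if mode == "exclusive":
            for later in queue[i + 1 : i + 1 + window]:
                # Plain integer comparison: wraparound is ignored here.
                if later[1] == "shared" and later[2] < xid:
                    chosen.append(later)
            break
        chosen.append((name, mode, xid))
    return [name for name, _, _ in sorted(chosen, key=lambda w: w[2])]


queue = [("r1", "shared", 105), ("w1", "exclusive", 110),
         ("r2", "shared", 108), ("r3", "shared", 112),
         ("r4", "shared", 101)]
wide_window = shared_waiters_to_wake(queue, window=4)
narrow_window = shared_waiters_to_wake(queue, window=1)
```

Here r3 is never woken early because its xid is higher than the exclusive
waiter's, and r4 is only reached when the window is wide enough.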

Exactly what we do from here is going to dramatically affect performance
in various situations, so I think trying a few different algorithms
should help our understanding.

IMHO a concern remains that oprofile is not good enough instrumentation
to spot this kind of issue. Instrumentation at the lwlock level *would*
have spotted this and other issues too, and would also help us compare
the various ways forward for (possibly) changing the current behaviour.
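
As an illustration of the kind of lwlock-level instrumentation meant here, a
minimal Python sketch of per-lock counters is below. LockStats, record and
block_ratio are hypothetical names, and "example_lock" is a made-up lock
name. None of this is an existing PostgreSQL facility.

```python
from collections import Counter


class LockStats:
    """Hypothetical per-lock counters, keyed by lock name and mode."""

    def __init__(self):
        self.counts = Counter()

    def record(self, lock_name, mode, blocked):
        # Count every acquisition attempt, and separately those that
        # had to wait in the queue before being granted.
        self.counts[(lock_name, mode, "acquire")] += 1
        if blocked:
            self.counts[(lock_name, mode, "block")] += 1

    def block_ratio(self, lock_name, mode):
        acquires = self.counts[(lock_name, mode, "acquire")]
        if acquires == 0:
            return 0.0
        return self.counts[(lock_name, mode, "block")] / acquires


stats = LockStats()
for _ in range(3):
    stats.record("example_lock", "shared", blocked=False)
stats.record("example_lock", "exclusive", blocked=True)
shared_ratio = stats.block_ratio("example_lock", "shared")
exclusive_ratio = stats.block_ratio("example_lock", "exclusive")
```

A lock whose exclusive block ratio stays high while its shared ratio stays
near zero would point at the kind of writer starvation discussed above.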

-- 
Best Regards, Simon Riggs


