Re: Spinning verses sleeping in s_lock - Mailing list pgsql-hackers

From mlw
Subject Re: Spinning verses sleeping in s_lock
Date
Msg-id 3C38B2B0.1148CDEE@mohawksoft.com
Whole thread Raw
In response to Spinning verses sleeping in s_lock  (mlw <markw@mohawksoft.com>)
Responses Re: Spinning verses sleeping in s_lock  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> 
> mlw <markw@mohawksoft.com> writes:
> > What is the advantage of spinning?
> 
> > On a uniprocessor box, there is "never" any advantage because you must
> > release the CPU in order to allow the process which owns the lock to
> > run long enough to release it.
> 
> Actually, that analysis is faulty.
> 
> Suppose that we didn't use a select(), yield(), or anything, but just
> spun until we got the lock.  On a uniprocessor this would imply spinning
> until the end of our current timeslice, whereupon the CPU would be given
> to someone else.  When control finally comes back to our process, more
> than likely the holder of the spinlock has been allowed to run and has
> finished his use of the spinlock, and we will be able to get it.  Thus,
> in this scenario we waste the rest of our timeslice, but we are able
> to proceed as soon as the scheduler next wishes to give us control.

On average, we would have to assume wasting half of a time slice. Yes, if we
are at the end of our time slice, spinning may be a winner, but on average, it
will be half our time slice.

This depends on the OS and scheduler and the duration of the time slice.

> 
> If "the rest of our timeslice" is not long, this might actually be
> better than an immediate delay via select() or usleep(), because those
> won't allow us to run for X number of milliseconds.  It's entirely
> possible that the CPU cycles we "save" by not spinning will end up being
> spent in the kernel's idle loop, unless the overall load is so high that
> the CPU never runs out of things to do.  I doubt this approach would be
> a winner on average, but to claim that it never wins is wrong.

Oh, I may have over stated my position, but on average, it would absolutely be
a loser.

> 
> Now if we have available a sched_yield() type of primitive, that lets
> us give up the processor without committing to any minimum delay before
> we can be rescheduled, then I think it's usually true that on a
> uniprocessor we might as well do sched_yield after just a single failure
> of TAS().  (However, on some architectures such as Alpha even that much
> is wrong; TAS() can "fail" for reasons that do not mean the spinlock is
> locked.  So the most portable thing is to try TAS at least a few times,
> even on a uniprocessor.)

Two things: 
(1) Why doesn't Postgresql use sched_yield() instead of select() in s_lock()?
There must be a way wrap that stuff in a macro in a port/os.h file.

(2) TAS fails when it should work on the alpha? Is it known why? 

> 
> > On an SMP box, this is a bit more complicated. If you have two CPUs, then
> > maybe, one process can spin, but obviously, more than one spinner is wasting
> > CPU, one of the spinners must release its time slice in order for another
> > process release the resource.
> 
> I don't believe that either, not completely.  A process that is spinning
> but hasn't got a CPU is not wasting cycles; it's essentially done an
> implicit sched_yield, and as we just saw that is the most efficient way
> of doing things.  What you are really trying to say is that it's
> inefficient for *all* the currently executing processes to be spinning
> on a lock that none of them hold.  This is true but there's no direct
> way for us to determine (in userland code) that that condition holds.

SMP has a lot of issues that are not completely obvious. I really don't like
spinlocks in userland. In an active server, wasting CPU cycles is just wrong.

> If you try to maintain a userland counter of the number of actively
> spinning processes, you will fail because a process might lose the CPU
> while it has the counter incremented.  (There are also a whole bunch
> of portability problems associated with trying to build an atomically
> incrementable/decrementable counter, bearing in mind that the counter
> can't itself be protected by a spinlock.)  Nor can you directly
> determine whether the current holder of the spinlock is actively
> running or not.  The most effective indirect test of that condition
> is really to spin for awhile and see if the lock gets released.

Any userland strategy to mitigate wasted CPU cycles has problems. The very
nature of the problem is OS related. Should PostgreSQL have a more OS specific
implementation in a port section of code?

> 
> A final comment --- given that in 7.2 we only use spinlocks to protect
> very short segments of code, I believe it's fairly improbable for more
> than two processes to be contending for a spinlock anyway.  So it's
> probably sufficient to distinguish whether we have one or more than
> one CPU, and statically select one of two spinning strategies on that
> basis.  Trying to dynamically adapt for more CPUs/contending processes
> will reap only minimal returns.

That is really one of those funny things about SMP stuff, you are probably
right, but sometimes little things really surprise.


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: pgbench -i spends all its time doing CHECKPOINT
Next
From: Tom Lane
Date:
Subject: Effects of pgbench "scale factor"