Re: Spinning verses sleeping in s_lock - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Spinning verses sleeping in s_lock
Date
Msg-id 26565.1010346039@sss.pgh.pa.us
Whole thread Raw
In response to Spinning verses sleeping in s_lock  (mlw <markw@mohawksoft.com>)
Responses Re: Spinning verses sleeping in s_lock  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
mlw <markw@mohawksoft.com> writes:
> What is the advantage of spinning?

> On a uniprocessor box, there is "never" any advantage because you must
> release the CPU in order to allow the process which owns the lock to
> run long enough to release it.

Actually, that analysis is faulty.

Suppose that we didn't use a select(), yield(), or anything, but just
spun until we got the lock.  On a uniprocessor this would imply spinning
until the end of our current timeslice, whereupon the CPU would be given
to someone else.  When control finally comes back to our process, more
than likely the holder of the spinlock has been allowed to run and has
finished his use of the spinlock, and we will be able to get it.  Thus,
in this scenario we waste the rest of our timeslice, but we are able
to proceed as soon as the scheduler next wishes to give us control.

If "the rest of our timeslice" is not long, this might actually be
better than an immediate delay via select() or usleep(), because those
won't allow us to run for X number of milliseconds.  It's entirely
possible that the CPU cycles we "save" by not spinning will end up being
spent in the kernel's idle loop, unless the overall load is so high that
the CPU never runs out of things to do.  I doubt this approach would be
a winner on average, but to claim that it never wins is wrong.

Now if we have available a sched_yield() type of primitive, that lets
us give up the processor without committing to any minimum delay before
we can be rescheduled, then I think it's usually true that on a
uniprocessor we might as well do sched_yield after just a single failure
of TAS().  (However, on some architectures such as Alpha even that much
is wrong; TAS() can "fail" for reasons that do not mean the spinlock is
locked.  So the most portable thing is to try TAS at least a few times,
even on a uniprocessor.)

> On an SMP box, this is a bit more complicated. If you have two CPUs, then
> maybe, one process can spin, but obviously, more than one spinner is wasting
> CPU, one of the spinners must release its time slice in order for another
> process release the resource.

I don't believe that either, not completely.  A process that is spinning
but hasn't got a CPU is not wasting cycles; it's essentially done an
implicit sched_yield, and as we just saw that is the most efficient way
of doing things.  What you are really trying to say is that it's
inefficient for *all* the currently executing processes to be spinning
on a lock that none of them hold.  This is true but there's no direct
way for us to determine (in userland code) that that condition holds.
If you try to maintain a userland counter of the number of actively
spinning processes, you will fail because a process might lose the CPU
while it has the counter incremented.  (There are also a whole bunch
of portability problems associated with trying to build an atomically
incrementable/decrementable counter, bearing in mind that the counter
can't itself be protected by a spinlock.)  Nor can you directly
determine whether the current holder of the spinlock is actively
running or not.  The most effective indirect test of that condition
is really to spin for awhile and see if the lock gets released.

A final comment --- given that in 7.2 we only use spinlocks to protect
very short segments of code, I believe it's fairly improbable for more
than two processes to be contending for a spinlock anyway.  So it's
probably sufficient to distinguish whether we have one or more than
one CPU, and statically select one of two spinning strategies on that
basis.  Trying to dynamically adapt for more CPUs/contending processes
will reap only minimal returns.
        regards, tom lane


pgsql-hackers by date:

Previous
From: "Marc G. Fournier"
Date:
Subject: Re: anoncvs
Next
From: Bruce Momjian
Date:
Subject: Re: pgbench -i spends all its time doing CHECKPOINT