Re: better atomics - v0.5 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: better atomics - v0.5
Date
Msg-id 20140630174447.GL21422@awork2.anarazel.de
Whole thread Raw
In response to Re: better atomics - v0.5  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: better atomics - v0.5  (Andres Freund <andres@2ndquadrant.com>)
Re: better atomics - v0.5  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
On 2014-06-30 13:15:23 -0400, Robert Haas wrote:
> On Mon, Jun 30, 2014 at 12:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > The existing coding rules also discourage spinlocking within a spinlock,
> > and the reason for that is that there's no very clear upper bound to the
> > time required to obtain a spinlock, so that there would also be no clear
> > upper bound to the time you're holding the original one (thus possibly
> > leading to cascading performance problems).
> 
> Well, as Andres points out, commit
> daa7527afc2274432094ebe7ceb03aa41f916607 made that more of an ironclad
> requirement than a suggestion.  If it's sometimes OK to acquire a
> spinlock from within a spinlock, then that commit was badly misguided;
> but the design of that patch was pretty much driven by your insistence
> that that was never, ever OK.  If that was wrong, then we should
> reconsider that whole commit more broadly; but if it was right, then
> the atomics patch shouldn't drill a hole through it to install an
> exception just for emulating atomics.

Well, since nobody has a problem with the rule of not having nested
spinlocks and since the problems aren't the same I'm failing to see why
we should reconsider this.

My recollection was that we primarily discussed whether it'd be a good
idea to add code to Assert() when spinlocks are nested to which Tom
responded that it's not necessary because critical sections should be
short enough to immmediately see that that's not a problem. Then we
argued a bit back and forth around that.

> I'm personally not convinced that we're approaching this topic in the
> right way.  I'm not convinced that it's at all reasonable to try to
> emulate atomics on platforms that don't have them.  I would punt the
> problem into the next layer and force things like lwlock.c to have
> fallback implementations at that level that don't require atomics, and
> remove those fallback implementations if and when we move the
> goalposts so that all supported platforms must have working atomics
> implementations.

If we're requiring a atomics-less implementation for lwlock.c,
bufmgr. et al. I'll stop working on those features (and by extension on
atomics). It's an utter waste of resources and maintainability trying to
maintain parallel high level locking algorithms in several pieces of the
codebase. The nonatomic versions will stagnate and won't actually work
under load. I'd rather spend my time on something useful. The spinlock
based atomics emulation is *small*. It's easy to reason about
correctness there.

If we're going that way, we've made a couple of absolute fringe
platforms hold postgres, as actually used in reality, hostage. I think
that's insane.

> People who write code that uses atomics are not
> likely to think about how those algorithms will actually perform when
> those atomics are merely emulated, and I suspect that means that in
> practice platforms that have only emulated atomics are going to
> regress significantly vs. the status quo today.

I think you're overstating the likely performance penalty for
nonparallel platforms/workloads here quite a bit. The platforms without
changes of gettings atomics implemented aren't ones where that's a
likely thing. Yes, you'll see a regression if you run a readonly pgbench
over a 4 node NUMA system - but it's not large. You won't see much of
improved performance in comparison to 9.4, but I think that's primarily
it.

Also it's not like this is something new - the semaphore based fallback
*sucks* but we still have it.

*Many* features in new major versions regress performance for fringe
platforms. That's normal.

> If this patch goes in
> the way it's written right now, then I think it basically amounts to
> throwing platforms that don't have working atomics emulation under the
> bus

Why? Nobody is going to run 9.5 under high performance workloads on such
platforms. They'll be so IO and RAM starved that the spinlocks won't be
the bottleneck.

>, and my vote is that if we're going to do that, we should do it
> explicitly: sorry, we now only support platforms with working atomics.
> You don't have any, so you can't run PostgreSQL.  Have a nice day.

I'd be fine with that. I don't think we're gaining much by those
platforms.
*But* I don't really see why it's required at this point. The fallback
works and unless you're using semaphore based spinlocks the performance
isn't bad.
The lwlock code doesn't really get slower/faster in comparison to before
when the atomics implementation is backed by semaphores. It's pretty
much the same number of syscalls using the atomics/current implementation.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Does changing attribute order breaks something?
Next
From: Alvaro Herrera
Date:
Subject: Re: Set new system identifier using pg_resetxlog