Re: Atomics hardware support table & supported architectures - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Atomics hardware support table & supported architectures
Date
Msg-id 20140617175509.GB3115@awork2.anarazel.de
In response to Re: Atomics hardware support table & supported architectures  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Atomics hardware support table & supported architectures  (Kevin Grittner <kgrittn@ymail.com>)
Re: Atomics hardware support table & supported architectures  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2014-06-17 13:14:26 -0400, Robert Haas wrote:
> On Sat, Jun 14, 2014 at 9:12 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > At this year's developer meeting we'd discussed the atomics abstraction
> > which is necessary for some future improvements. We'd concluded that an
> > overview of the hardware capabilities of the supported platforms would
> > be helpful. I've started with that at:
> > https://wiki.postgresql.org/wiki/Atomics
> 
> That looks like a great start.  It could use a key explaining what the
> columns mean.  Here are some guesses, and some comments.
> 
> single-copy r/w atomicity = For what word sizes does this platform
> have the ability to do atomic reads and writes of values?

Yes. It's a term found in the literature; I'm happy to replace it with
something else. It's essentially defined to mean that a read after one
or more writes will only be influenced by one of the last writes, not a
mixture of them. I.e. whether intermediate states will ever be
visible.
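
To make the definition concrete, here is a minimal sketch that simulates hardware performing an 8-byte store as two separate 4-byte stores. Nothing in it comes from the PostgreSQL tree; the type and function names are made up for illustration, and the "interleaving" is scripted rather than produced by real concurrency. A reader that runs between the two halves sees a value that is neither the old nor the new one, which is what single-copy atomicity rules out.

```c
#include <stdint.h>

/* Hypothetical model of an 8-byte memory cell stored as two 4-byte halves. */
typedef struct
{
    uint32_t lo;
    uint32_t hi;
} split_u64;

/* Reassemble the 64-bit value from whatever the two halves hold right now. */
uint64_t
split_read(const split_u64 *cell)
{
    return ((uint64_t) cell->hi << 32) | cell->lo;
}

/* The two halves of a non-atomic 64-bit store. */
void
split_store_lo(split_u64 *cell, uint64_t val)
{
    cell->lo = (uint32_t) val;
}

void
split_store_hi(split_u64 *cell, uint64_t val)
{
    cell->hi = (uint32_t) (val >> 32);
}

/*
 * Scripted interleaving: the cell holds oldval, a writer starts storing
 * newval, and a reader runs after only the low half has been written.
 * Returns what that reader observes.
 */
uint64_t
torn_read_demo(uint64_t oldval, uint64_t newval)
{
    split_u64 cell = {0, 0};
    uint64_t  seen;

    split_store_lo(&cell, oldval);
    split_store_hi(&cell, oldval);

    split_store_lo(&cell, newval);  /* writer: first half only */
    seen = split_read(&cell);       /* reader sneaks in here */
    split_store_hi(&cell, newval);  /* writer: second half */

    return seen;
}
```

With oldval = 0x00000000FFFFFFFF and newval = 0x0000000100000000 the reader observes 0, a mixture of the new low half and the old high half.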

> I don't
> understand how "1" can ever NOT be supported - does any architecture
> write data to memory a nibble at a time?

Actually there seem to be some where that's possible for some operations
(VAX). But the concern is more whether 1 byte can actually be written
without also writing neighbouring values. I.e. there's hardware out
there that'll implement a 1-byte store as reading 4 bytes, changing one
of the bytes in a register, and then writing the 4 bytes out again. Which
would mean that a concurrent write to a neighbouring byte can be
overwritten with a stale value...
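
A small simulation of that problem, as a sketch only: the word4 type and the rmw_* helpers are hypothetical stand-ins for what such hardware does internally, and the interleaving of the two writers is scripted by hand instead of relying on real threads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-byte memory word whose bytes are used independently. */
typedef struct
{
    uint8_t bytes[4];
} word4;

/*
 * First half of an emulated 1-byte store: load the whole word into a
 * "register" and patch the target byte there.
 */
word4
rmw_prepare(const word4 *mem, unsigned idx, uint8_t val)
{
    word4 reg;

    assert(idx < 4);
    reg = *mem;
    reg.bytes[idx] = val;
    return reg;
}

/* Second half: write all four bytes back to memory. */
void
rmw_commit(word4 *mem, word4 reg)
{
    *mem = reg;
}

/*
 * Writer A starts a store to byte 0, writer B completes a store to
 * byte 1, then A finishes. Returns the final content of byte 1.
 */
uint8_t
lost_neighbour_demo(void)
{
    word4 mem = {{0, 0, 0, 0}};
    word4 reg_a;
    word4 reg_b;

    reg_a = rmw_prepare(&mem, 0, 0xAA);  /* A loads the word, byte 1 is 0 */
    reg_b = rmw_prepare(&mem, 1, 0xBB);
    rmw_commit(&mem, reg_b);             /* B's store is fully visible */
    rmw_commit(&mem, reg_a);             /* A writes its stale byte 1 back */

    return mem.bytes[1];
}
```

The function returns 0 rather than 0xBB: B's store to byte 1 is silently lost even though A never meant to touch that byte.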

> I also don't really
> understand the *** footnote; how can you do kernel emulation of an
> atomic load or store.

What happens is that gcc will do a syscall triggering the kernel to turn
off scheduling; perform the math and store the result; turn scheduling on
again. That way there cannot be another operation influencing the
calculation/store. Imagine you have hardware that, internally, only
does stores in 4-byte units. Even if it's a single-CPU machine, which
most of those are, the kernel could schedule a separate process after
the first 4 bytes have been written. Oops. The kernel has ways to prevent
that, userspace doesn't...

> TAS = Does this platform support test and set?  Items in parentheses
> are the instructions that implement this.

> cmpxchg = Does this platform support compare and swap?  If so, for
> what word sizes?  I assume "n" means "not supported at all" and "y"
> means "supported but we don't know for which word sizes".  Maybe call
> this column "CAS" or "Compare and Swap" rather than using an
> Intel-ism.

Ok.

> gcc support from version = Does this mean the version from which GCC
> supported the architecture, or from which it supported atomics (which
> ones?) on that architecture?

It means from which version on gcc has support for the __sync intrinsics
on that platform. I've only added versions where support for all atomics
of the supported sizes was available.
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins
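
For readers who haven't used them, a rough sketch of how the table columns map onto those builtins. The wrapper names (tas, cas_u32, fetch_add_u32) are invented for this example and are not the names used in the patch; it assumes gcc or a compatible compiler on a platform where the builtins are available for the given size.

```c
#include <stdint.h>

/*
 * TAS column: atomically store 1 and return the previous content.
 * A return value of 0 means the caller acquired the lock.
 */
int
tas(volatile int *lock)
{
    return __sync_lock_test_and_set(lock, 1);
}

/* Release a lock taken with tas(). */
void
tas_release(volatile int *lock)
{
    __sync_lock_release(lock);
}

/*
 * CAS column: replace *ptr with newval only if it currently equals
 * expected. Returns nonzero on success.
 */
int
cas_u32(volatile uint32_t *ptr, uint32_t expected, uint32_t newval)
{
    return __sync_bool_compare_and_swap(ptr, expected, newval);
}

/* Atomic add; returns the value *ptr held before the addition. */
uint32_t
fetch_add_u32(volatile uint32_t *ptr, uint32_t add)
{
    return __sync_fetch_and_add(ptr, add);
}
```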

Once these answers satisfy you I'll adapt the wiki.

> > Does somebody want other columns in there?
> 
> I think the main question at the developer meeting was how far we want
> to go with supporting primitives like atomic add, atomic and, atomic
> or, etc.  So I think we should add columns for those.

Well, once CAS is available, atomic add etc. are all trivially
implementable - without further hardware support. It might be more
efficient to use the native instruction (e.g. xadd can be much better
than a cmpxchg loop because there are no retries), but that's just an
optimization that won't matter unless you have a fair bit of
concurrency.

There are currently fallbacks like:
#ifndef PG_HAS_ATOMIC_FETCH_ADD_U32
#define PG_HAS_ATOMIC_FETCH_ADD_U32
STATIC_IF_INLINE uint32
pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 add_)
{
    uint32 old;

    while (true)
    {
        /* retry until no concurrent writer changed *ptr in between */
        old = pg_atomic_read_u32_impl(ptr);
        if (pg_atomic_compare_exchange_u32_impl(ptr, &old, old + add_))
            break;
    }
    return old;
}
#endif
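
The same pattern covers the atomic and/or cases you mention. As a self-contained sketch, written against gcc's __sync_val_compare_and_swap instead of the pg_atomic_* wrappers so it compiles on its own (the cas_fetch_* names are made up for this example):

```c
#include <stdint.h>

/* Atomic OR built from CAS only; returns the value before the OR. */
uint32_t
cas_fetch_or_u32(volatile uint32_t *ptr, uint32_t mask)
{
    uint32_t old = *ptr;

    for (;;)
    {
        uint32_t seen = __sync_val_compare_and_swap(ptr, old, old | mask);

        if (seen == old)
            return old;     /* nobody interfered, the swap happened */
        old = seen;         /* lost a race, retry with the current value */
    }
}

/* Atomic AND built from CAS only; returns the value before the AND. */
uint32_t
cas_fetch_and_u32(volatile uint32_t *ptr, uint32_t mask)
{
    uint32_t old = *ptr;

    for (;;)
    {
        uint32_t seen = __sync_val_compare_and_swap(ptr, old, old & mask);

        if (seen == old)
            return old;
        old = seen;
    }
}
```

Each failed CAS hands back the value that is actually in memory, so the loop needs no separate re-read before retrying.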

> > 3) sparcv8: Last released model 1997.
> 
> I seem to recall hearing about this in a customer situation relatively
> recently, so there may be a few of these still kicking around out
> there.

Really? As I'd written in a reply, Solaris 10 (released 2005) dropped
support for it. Dropping support for a platform that was desupported
10 years ago by its manufacturer doesn't sound bad imo...

> > 4) i386: Support dropped from windows 98 (yes, really), linux, openbsd
> >    (yes, really), netbsd (yes, really). No code changes needed.
> 
> Wow, OK.  In that case, yeah, let's dump it.  But let's make sure we
> adequately document that someplace in the code comments, along with
> the reasons, because not everyone may realize how dead it is.

I'm generally wondering how to better document the supported os/platform
combinations. E.g. it's not apparent that we only support certain
platforms on a rather limited set of compilers...

Maybe a table with columns like: platform, platform version,
supported-OSs, supported-compilers?

> > 6) armv-v5
> 
> I think this is also a bit less dead than the other ones; Red Hat's
> Bugzilla shows people filing bugs for platform-specific problems
> as recently as January of 2013:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=892378

Closed as WONTFIX :P.

Joking aside, I think there are still use cases for arm-v5 - but it's
embedded stuff without a real OS and such. Nothing you'd install PG
on. There are distributions that are dropping ARMv6 support already... My
biggest problem is that it's not even documented whether v5 has atomic
4-byte stores - while it's documented for v6.

> > Note that this is *not* a requirement for the atomics abstraction - it
> > now has a fallback to spinlocks if atomics aren't available.
> 
> That seems great.  Hopefully with a configure option to disable
> atomics so that it's easy to test the fallback.

It's a #define right now. Do you think we really need a configure
option?
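
To illustrate the idea (this is not the code from the patch): a sketch where a single, hypothetical DEMO_FORCE_ATOMICS_FALLBACK define switches an atomic add between the native instruction and a lock-protected emulation. The struct layout and all names are assumptions made for the example, and the lock is a bare test-and-set spinlock via gcc's __sync builtins rather than s_lock.

```c
#include <stdint.h>

/*
 * Hypothetical switch: define it (e.g. with -DDEMO_FORCE_ATOMICS_FALLBACK)
 * to exercise the lock-based path on a platform that has native atomics.
 */
/* #define DEMO_FORCE_ATOMICS_FALLBACK */

typedef struct
{
    volatile int      lock;   /* only used by the fallback path */
    volatile uint32_t value;
} demo_atomic_u32;

/* Atomic add; returns the value before the addition in both variants. */
uint32_t
demo_fetch_add_u32(demo_atomic_u32 *a, uint32_t add)
{
#ifdef DEMO_FORCE_ATOMICS_FALLBACK
    uint32_t old;

    /* spin until the test-and-set reports the lock was free */
    while (__sync_lock_test_and_set(&a->lock, 1))
        ;
    old = a->value;
    a->value = old + add;
    __sync_lock_release(&a->lock);
    return old;
#else
    return __sync_fetch_and_add(&a->value, add);
#endif
}
```

Both variants have to give the same results, so the same regression tests can be run against either build.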

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


