Re: pg_memory_barrier() doesn't compile, let alone work, for me - Mailing list pgsql-hackers

From Tom Lane
Subject Re: pg_memory_barrier() doesn't compile, let alone work, for me
Date
Msg-id 8067.1374094649@sss.pgh.pa.us
Whole thread Raw
In response to Re: pg_memory_barrier() doesn't compile, let alone work, for me  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: pg_memory_barrier() doesn't compile, let alone work, for me
Re: pg_memory_barrier() doesn't compile, let alone work, for me
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Jul 14, 2013 at 8:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> However, fixing that doesn't yield much joy; initdb stalls and then
>> crashes with
>> 
>> PANIC:  stuck spinlock (40054a88) detected at xlog.c:2182
>> 
>> The reason for that is that the code does not bother to initialize
>> "dummy_spinlock" anywhere.  It might accidentally fail to fail
>> on machines where the unlocked state of a spinlock is all-zeroes
>> (given a compiler that's not picky about the incorrect macro usage)
>> ... but HPPA is not such a machine.

> This would not be hard to fix, I think.

Really?  Given that the memory barrier primitives are supposed to be,
well, primitive, I don't think this is exactly a trivial problem.
There's no good place to initialize such a variable, and there's even
less of a place to make sure that fork or exec leaves it in an
appropriate state in the child process.

>> Rather than trying to think of a fix for that, I'm of the opinion that
>> we should rip this out.  The fallback implementation of pg_memory_barrier
>> ought to be pg_compiler_barrier(), on the theory that non-mainstream
>> architectures don't have weak memory ordering anyway, or if they do you
>> need to do some work to get PG to work on them.  Or maybe we ought to
>> stop pretending that the code is likely to work on arbitrary machines,
>> and just #error if there's not a supplied machine-specific macro.

> Well, pg_memory_barrier() isn't even equivalent to
> pg_compiler_barrier() on x86, which has among the strongest memory
> orderings out there, so I think your first idea is a non-starter.

Among the strongest memory orderings compared to what?  Since what we're
discussing here is non-mainstream architectures, I think this claim is
unfounded.  Most of the ones I can think of offhand are old enough to
not even have multiprocessor support, so that the issue is vacuous.

> I'm pretty sure we've got latent memory-ordering risks in our existing
> code which we just haven't detected and fixed yet.  Consider, for
> example, this exciting code from GetNewTransactionId:

>                                 myproc->subxids.xids[nxids] = xid;
>                                 mypgxact->nxids = nxids + 1;

> I don't believe that's technically safe even on an architecture like
> x86, because the compiler could decide to reorder those assignments.

Wrong, because both pointers are marked volatile.  If the compiler does
reorder the stores, it's broken.  Admittedly, this doesn't say anything
about hardware reordering :-(

> My preference would be to fix this in a narrow way, by initializing
> dummy_spinlock somewhere.  But if not, then I think #error is the only
> safe way to go.

I'm inclined to agree that #error is the only realistic answer in
general, though we could probably go ahead with equating
pg_memory_barrier to pg_compiler_barrier on specific architectures we
know are single-processor-only.  Unfortunately, that means we just
raised the bar for porting efforts significantly.  And in particular,
it means somebody had better go through s_lock.h and make sure we have a
credible barrier implementation for every single arch+compiler supported
therein.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Markus Wanner
Date:
Subject: Re: Proposal: template-ify (binary) extensions
Next
From: Alvaro Herrera
Date:
Subject: Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])