Re: problems on Solaris - Mailing list pgsql-hackers

From Andres Freund
Subject Re: problems on Solaris
Date
Msg-id 20150624124210.GN4797@alap3.anarazel.de
Whole thread Raw
In response to Re: problems on Solaris  (Andres Freund <andres@anarazel.de>)
Responses Re: problems on Solaris  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2015-05-31 01:09:18 +0200, Andres Freund wrote:
> On 2015-05-27 21:23:34 -0400, Robert Haas wrote:
> > > Oh wow, that's bad, and could explain a couple of the problems we're
> > > seing. One possible way to fix is to replace the sequence with if
> > > (!TAS(spin)) S_UNLOCK();. But that'd mean TAS() has to be a barrier,
> > > even if the lock isn't free - which e.g. isn't the case for PowerPC's
> > > implementation :(
> > 
> > Another possibility is to make the fallback barrier implementation a
> > system call, like maybe kill(PostmasterPid, 0).
> 
> It's not necessarily true that all system calls are effective
> barriers. I'm e.g. doubtful that kill(..., 0) is one as it only performs
> local error checking. It might be that the process existance check
> includes a lock that's sufficient, but I would not like to rely on
> it. Sending an actual signal probably would be, but has the potential of
> disrupting postmaster progress.

I thought about various other syscalls we could use, and your proposal
seems to be least worst. My idea of using waitpid() falls short because
it only works for child processes.  I think the kind of systems that we
don't have barriers on, are unlikely to use complex stuff like RCU to
manage access to process hierarchies.

I reproduced the 'stuck' issue on x86 by #ifdef'ing out barrier support
- about 50% of the time test_shm_mq gets stuck. Replacing it with
kill(PostmasterPid, 0) "works". Unless somebody protests soon that's
what I'm going to commit. It surely is better than easily reproducible
hangs.

I'm wondering wether we should add a #warning to atomic.c if either the
fallback memory or compiler barrier is used? Might be annoying to people
using -Werror, but I doubt that's possible anyway on such old systems.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Uriy Zhuravlev
Date:
Subject: Re: WIP: Enhanced ALTER OPERATOR
Next
From: Kohei KaiGai
Date:
Subject: Re: Foreign join pushdown vs EvalPlanQual