Re: condition variables - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: condition variables
Msg-id: CA+TgmobxxQG3dCL_y2Xv3KDhCoMNpy7Bmw05g2pWhuE9Zu0Pmw@mail.gmail.com
In response to: Re: condition variables (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-hackers

On Thu, Aug 11, 2016 at 8:44 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> In contrast, this proposal leaves it up to client code to get that
> right, similarly to the way you need to do things in a certain order
> when waiting for state changes with latches.  You could say that it's
> more error prone: I think there have been a few cases of incorrectly
> coded latch/state-change wait loops in the past.  On the other hand,
> it places no requirements on the synchronisation mechanism the client
> code uses for the related shared state.  pthread_cond_wait requires
> you to pass in a pointer to the related pthread_mutex_t, whereas with
> this proposal client code is free to use atomic ops, lwlocks,
> spinlocks or any other mutual exclusion mechanism to coordinate state
> changes and deal with cache coherency.

I think you have accurately stated the pros and cons of this approach.
On the whole I think it's a good trade-off.  In particular, not being
locked into a specific synchronization method seems like a good idea
from here.  If you had to pick just one, it would be hard to decide
between spinlocks and LWLocks, and "atomics" isn't a sufficiently
specific thing to code to.
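
For comparison, the pthread coupling described in the quoted text looks roughly like the following. This is only a minimal sketch using standard POSIX calls; the shared_flag type and the functions around it are hypothetical names made up for illustration, not anything from the proposal. The wait is tied to one particular pthread_mutex_t, and the predicate is rechecked in a loop because pthread_cond_wait is allowed to wake up spuriously.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical shared state; all names here are illustrative only. */
typedef struct {
    pthread_mutex_t mutex;   /* the one mutex the condition is bound to */
    pthread_cond_t  cond;
    bool            ready;   /* the predicate being waited for */
} shared_flag;

void shared_flag_init(shared_flag *f)
{
    pthread_mutex_init(&f->mutex, NULL);
    pthread_cond_init(&f->cond, NULL);
    f->ready = false;
}

/* Block until some other thread has set the flag. */
void shared_flag_wait(shared_flag *f)
{
    pthread_mutex_lock(&f->mutex);
    while (!f->ready)                            /* recheck after every wakeup */
        pthread_cond_wait(&f->cond, &f->mutex);  /* mutex is passed in here */
    pthread_mutex_unlock(&f->mutex);
}

/* Change the state under the same mutex, then wake all waiters. */
void shared_flag_set(shared_flag *f)
{
    pthread_mutex_lock(&f->mutex);
    f->ready = true;
    pthread_mutex_unlock(&f->mutex);
    pthread_cond_broadcast(&f->cond);
}

static void *setter_thread(void *arg)
{
    shared_flag_set(arg);
    return NULL;
}

/* Start a setter thread, wait for it, and return 1 if the flag was seen. */
int shared_flag_demo(void)
{
    shared_flag f;
    pthread_t   setter;

    shared_flag_init(&f);
    if (pthread_create(&setter, NULL, setter_thread, &f) != 0)
        abort();
    shared_flag_wait(&f);
    pthread_join(setter, NULL);

    int seen = f.ready ? 1 : 0;
    pthread_cond_destroy(&f.cond);
    pthread_mutex_destroy(&f.mutex);
    return seen;
}
```

With the approach under discussion, the code protecting "ready" would not have to be a mutex at all; a spinlock, an LWLock or an atomic operation could play that role, at the price of the caller getting the check-then-sleep ordering right.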

> Then there is the question of what happens when the backend that is
> supposed to be doing the signalling dies or aborts, which Tom Lane
> referred to in his reply.  In those other libraries there is no such
> concern: it's understood that these are low level thread
> synchronisation primitives and if you're waiting for something that
> never happens, you'll be waiting forever.  I don't know what the
> answer is in general for Postgres condition variables, but...

As I noted in my reply to Tom, for parallel query, we're going to kill
all the workers at the same time, so the problem doesn't arise.  When
we use this mechanism outside that context, we just have to do it
correctly.  I don't think that's especially hard, but could somebody
mess it up?  Sure.

> The thing that I personally am working on currently that is very
> closely related and could use this has a more specific set of
> circumstances:  I want "join points" AKA barriers.  Something like
> pthread_barrier_t.  (I'm saying "join point" rather than "barrier" to
> avoid confusion with compiler and memory barriers, barrier.h etc.)
> Join points let you wait for all workers in a known set to reach a
> given point, possibly with a phase number or at least sense (one bit
> phase counter) to detect synchronisation bugs.  They also select one
> worker arbitrarily to receive a different return value when releasing
> workers from a join point, for cases where a particular phase of
> parallel work needs to be done by exactly one worker while the others
> sit on the bench: for example initialisation, cleanup or merging (cf.
> PTHREAD_BARRIER_SERIAL_THREAD).  Clearly a join point could be not
> much more than a condition variable and some state tracking arrivals
> and departures, but I think that this higher level synchronisation
> primitive might have an advantage over raw condition variables in the
> abort case: it can know the total set of workers that it's waiting for,
> if they are somehow registered with it first, and registration can
> include arranging for cleanup hooks to do the right thing.  It's
> already a requirement for a join point to know which workers exist (or
> at least how many).  The deal would then be that when you call
> joinpoint_join(&some_joinpoint, phase), it will return only when all
> peers have joined or detached, where the latter happens automatically
> if they abort or die.  Not at all sure of the details yet...  but I
> suspect join points are useful for a bunch of things like parallel
> sort, parallel hash join (my project), and anything else involving
> phases or some form of "fork/join" parallelism.

If I'm right that the abort/die case doesn't really need any special
handling here, then I think this gets a lot simpler.
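
To make the quoted join point idea concrete, here is one way it could look when built from a condition variable plus some arrival and departure tracking. This is a sketch under stated assumptions, not Thomas's design and not PostgreSQL code: it uses plain pthreads, the joinpoint struct and every function name are hypothetical, the phase argument from the quoted joinpoint_join call is left out, and "detach" is called by hand rather than from a cleanup hook. joinpoint_join returns 1 to exactly one released worker per phase, in the spirit of PTHREAD_BARRIER_SERIAL_THREAD.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical join point; illustrative only. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int participants;      /* registered workers that have not detached */
    int arrived;           /* workers waiting in the current phase */
    int phase;             /* incremented each time a phase completes */
    int serial_unclaimed;  /* 1 until a released worker claims the serial role */
} joinpoint;

void joinpoint_init(joinpoint *jp, int participants)
{
    pthread_mutex_init(&jp->mutex, NULL);
    pthread_cond_init(&jp->cond, NULL);
    jp->participants = participants;
    jp->arrived = 0;
    jp->phase = 0;
    jp->serial_unclaimed = 0;
}

/* Caller must hold jp->mutex.  Ends the phase if nobody is missing. */
static void joinpoint_release_if_complete(joinpoint *jp)
{
    if (jp->participants > 0 && jp->arrived == jp->participants) {
        jp->arrived = 0;
        jp->phase++;
        jp->serial_unclaimed = 1;
        pthread_cond_broadcast(&jp->cond);
    }
}

/* Returns once all peers have joined or detached; 1 for one worker, else 0. */
int joinpoint_join(joinpoint *jp)
{
    int serial = 0;

    pthread_mutex_lock(&jp->mutex);
    int my_phase = jp->phase;
    jp->arrived++;
    joinpoint_release_if_complete(jp);
    while (jp->phase == my_phase)
        pthread_cond_wait(&jp->cond, &jp->mutex);
    if (jp->serial_unclaimed) {        /* first worker out takes the role */
        jp->serial_unclaimed = 0;
        serial = 1;
    }
    pthread_mutex_unlock(&jp->mutex);
    return serial;
}

/* What a cleanup hook would call when a worker aborts or exits early. */
void joinpoint_detach(joinpoint *jp)
{
    pthread_mutex_lock(&jp->mutex);
    jp->participants--;
    joinpoint_release_if_complete(jp);
    pthread_mutex_unlock(&jp->mutex);
}

typedef struct { joinpoint *jp; int got_serial; } worker_arg;

static void *joining_worker(void *p)
{
    worker_arg *a = p;
    a->got_serial = joinpoint_join(a->jp);
    return NULL;
}

static void *aborting_worker(void *p)
{
    joinpoint_detach(p);               /* never reaches the join point */
    return NULL;
}

/* Three registered workers: two join, one detaches.  Returns how many
 * of the joiners were given the serial role. */
int joinpoint_demo(void)
{
    joinpoint  jp;
    worker_arg a = { &jp, 0 }, b = { &jp, 0 };
    pthread_t  t[3];

    joinpoint_init(&jp, 3);
    if (pthread_create(&t[0], NULL, joining_worker, &a) != 0 ||
        pthread_create(&t[1], NULL, joining_worker, &b) != 0 ||
        pthread_create(&t[2], NULL, aborting_worker, &jp) != 0)
        abort();
    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    return a.got_serial + b.got_serial;
}
```

Without the detach call the two joining workers would wait forever, which is the abort problem in miniature. If all workers are always killed together, as in the parallel query case, the detach path is not needed and the join point reduces to the counter and the wait loop.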

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company