Re: group locking: incomplete patch, just for discussion - Mailing list pgsql-hackers

From Robert Haas
Subject Re: group locking: incomplete patch, just for discussion
Date
Msg-id CA+TgmoYGbeMmqEEVCEQi3fLk_1V089=jeKVEcQkkh5xuT2gGKw@mail.gmail.com
Whole thread Raw
In response to Re: group locking: incomplete patch, just for discussion  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: group locking: incomplete patch, just for discussion  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Fri, Oct 31, 2014 at 9:07 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> Maybe we can, as a first step, make those edges in the lock graph
> visible to the deadlock detector? It's pretty clear that undetected
> deadlocks aren't ok, but detectable deadlocks in a couple corner cases
> might be acceptable.

An interesting point came to mind this morning on this topic, while
discussing this issue with Amit.  Suppose process A1 holds a lock on
some object and process A2, a member of the same locking group, waits
for a lock on that same object.  It follows that there must exist a
process B which either *holds* or *awaits* a lock which conflicts with
the one requested by A2; otherwise, A2 would have been granted the
lock at once.  If B *holds* the conflicting lock, it may eventually
release it; but if B *awaits* the conflicting lock, we have a soft
deadlock, because A2 waits for B waits for A1, which we presume to
wait for A1.[1]

The logical consequence of this is that, if a process seeks to acquire
a lock which is already held (in any mode) by a fellow group-member,
it had better jump over the entire wait queue and acquire the lock
immediately if no conflicting locks have already been granted.  If it
fails to do this, it immediately creates a soft deadlock.  What if
there *are* conflicting locks already granted?  In that case, things
are more complicated.  We could, for example, have this situation: A1
holds AccessShareLock, B holds ExclusiveLock, C1 awaits ShareLock, and
then (following it in the wait queue) A2 awaits RowExclusiveLock.
This is not a deadlock, because B can finish, then C can get the lock,
then C1 can finish, then A2 can get the lock.  But if C2 now waits for
some member of group A, then we have a soft deadlock between group A
and group C, which can be resolved by moving A2's request ahead of C1.
(This example is relatively realistic, too: a lock upgrade from
AccessShareLock to RowExclusiveLock is probably the most common type
of lock upgrade, for reasons that are hopefully obvious.)

In practice, I suspect it's best to take a more aggressive approach
and have A2 jump straight to the head of the line right out of the
chute.  Letting some parallel workers make progress while holding up
others out of a notion of locking fairness is probably stupid, because
they're probably dependent on each other in some way that makes it
useless for one to make progress while the others don't.  For the same
reason, I'd argue that we ought to wait until all pending lock
requests from a given group can be satisfied before granting any of
them.  In fact, I suspect it's best in general for the locking
machinery and the deadlock detector to view several simultaneous,
pending lock requests as if they were a single request conflicting
with everything that would conflict with any of the requested modes.
Consider the situation where there are 10 lock waiters, 5 in each of
two parallel groups.  The deadlock detector could spend a good deal of
time and energy trying various reorderings of the lock queue, but in
practice it seems highly unlikely that it's sensible to do anything
other than put all the locks from one group first and all of the locks
from the other group afterwards, regardless of whether the processes
want the same mode or different modes.  Most of the other orderings
will either be equivalent to one of those orderings (because commuting
two non-conflicting locks in the wait queue has no effect) or soft
deadlocks (because the interleave a conflicting lock request between
two requests form the same group).  And even if you can find some
corner case where neither of those things is the case, it's probably
still not useful to let half the lock group run and leave the other
half blocked.  What will probably happen is that some internal queue
will pretty quickly either fill up or drain completely and the process
filling it, or draining it, will wait.  Now we're no better off than
if we'd waited to grant the lock in the first place, and we're now
holding more locks that can block other things or cause deadlocks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1]  Here I'm assuming, as in previous discussion, that we regard all
members of a parallel group as mutually waiting for each other, on the
theory that the parallel computation won't complete until all workers
complete and, therefore, the members of the group are either already
waiting for each other or eventually will.  While this might not be
true in every case, I believe it's a good (and generally valid)
simplifying assumption.



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [BUGS] ltree::text not immutable?
Next
From: Andrew Dunstan
Date:
Subject: Re: json, jsonb, and casts