Re: group locking: incomplete patch, just for discussion - Mailing list pgsql-hackers
| From | Jeff Davis |
|---|---|
| Subject | Re: group locking: incomplete patch, just for discussion |
| Date | |
| Msg-id | 1415867935.2998.110.camel@jeff-desktop |
| In response to | Re: group locking: incomplete patch, just for discussion (Robert Haas <robertmhaas@gmail.com>) |
| Responses | Re: group locking: incomplete patch, just for discussion |
| List | pgsql-hackers |
Great discussion. Robert, I think you make a lot of very valid points, but
philosophically I am in pretty strong agreement with Andres here.

On Fri, 2014-10-31 at 14:29 -0400, Robert Haas wrote:
> > So just refusing parallelism as soon as anything has taken an access
> > exclusive lock doesn't sound too bad to me.
>
> That restriction seems onerous to me; for example, it would prevent a
> parallel sort for CLUSTER or a parallel index rebuild for VACUUM FULL.
> Those seem like worthy targets for parallelism, and maybe even early
> targets, since I expect a lot of the real complexity here to be in the
> query planner, and you can avoid most of that complexity when planning
> DDL.

If two backends both have an exclusive lock on the relation for a join
operation, that implies that they need to do their own synchronization,
because obviously the lock manager is not doing it for them.

In turn, that implies that we need parallel versions of these operations,
going to each one and parallelizing it. I don't think that's a good
long-term strategy... it changes our simple executor (tuples in, tuples
out) to something quite different (each node needs to know a lot about
whatever is underneath it in order to divide the data up).

> process and both requests would have been granted. Parallelism is
> generally not about making different things happen; it's about
> spreading the stuff that would happen anyway across multiple backends,

This point would be much stronger for declarative queries. When the user
issues LOCK TABLE, they are getting their hands dirty with the
implementation to some degree. If they really want to serialize, they can
serialize around the lock of an object that they don't want to also read
(in parallel).

You're right that this would be surprising to users, but hopefully we can
find a way to disable parallelism in that case, or at least issue a
WARNING that the user is probably doing the wrong thing.

> "cluster pg_collate"... the only solution is to
> move the parallel workers ahead of the AccessExclusiveLock request,
> but the deadlock detector won't do that unless it knows about the
> relationship among the parallel workers.

As discussed later in this thread, I also consider failure a solution ;-)
So we don't need to move anything. If we have to tell our users that
exclusive-locking a catalog can cause deadlocks for parallel query, I
think they will find that reasonable. The same admin could turn
parallelism down to one for non-superusers before they start doing that,
and restore it later.

> patch that does parallel anything has got to worry about what
> undetected deadlocks might result, and then argue about which of them
> are obscure enough that we shouldn't care and which are not. That
> doesn't sound like a fun way to spend my time.

We're in agreement that deadlocks can't go undetected.

> But, really, the first case is the one I'm more worried about: a bunch
> of parallel backends all grab the same locks, but slightly separated
> in time. A strong locker joins the queue midway through and we're
> hosed.

I agree the situation sounds bad in the abstract, but none of the
examples seem very compelling. Who locks random catalogs exclusively? And
if they do, why is it the lock manager's job to try to reconcile the
resulting mess without canceling someone? (OK, don't answer that. It
obviously is the lock manager's job to some degree, but we can only
demand so much.)

> Even if you think that all of these particular issues are somehow
> avoidable or that they don't matter, I think it would be unwise to bet
> on them being the only issues. The lock manager looks like it's old,
> ill-maintained code, but that's in no small part because it solved the
> problem it aims at so thoroughly that few people have felt the need to
> modify it very much in the last decade. I think that's part of why
> there's so much skepticism on this thread - we don't think about the
> problems that the lock manager solves as being important problems not
> because they truly aren't important, but because they are so
> thoroughly solved. Then every once in a while we complain about the
> performance.

Guilty as charged. I won't forever oppose changes to the lock manager,
but I resist making them a precondition of parallelism (aside from making
deadlock detection work, of course). Users taking out exclusive locks
recklessly can result in deadlocks today, and will result in more when we
do parallelism (though perhaps in more surprising situations).

I remember when doing Exclusion Constraints, I went to a lot of effort to
make deadlocks impossible, and was quite proud. When Tom saw it, he told
me not to bother, and to do it the simple way instead, because deadlocks
can happen even with UNIQUE constraints (which I didn't even know). We
should use the same strategy here: when we see deadlocks becoming a
problem for any reasonable workload, we make a series of tweaks (perhaps
some to the lock manager itself) to reduce them.

Regards,
    Jeff Davis
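The undetected-deadlock scenario debated above can be sketched as a toy wait-for graph: a leader waits on its worker, the worker's lock request queues behind a strong locker's AccessExclusiveLock, and the strong locker waits on the leader's granted lock. This is an illustrative Python model only, not PostgreSQL's actual deadlock detector; the node names and graph edges are assumptions chosen to mirror the discussion.

```python
# Toy wait-for-graph cycle detection (illustrative; not PostgreSQL code).

def has_cycle(waits_for):
    """Detect a cycle in a wait-for graph given as {waiter: set_of_blockers}."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in waits_for}

    def visit(n):
        color[n] = GREY                      # n is on the current path
        for m in waits_for.get(n, ()):
            if color.get(m, WHITE) == GREY:  # back edge: a deadlock cycle
                return True
            if color.get(m, WHITE) == WHITE and visit(m):
                return True
        color[n] = BLACK                     # fully explored, no cycle via n
        return False

    return any(color[n] == WHITE and visit(n) for n in list(waits_for))

# Without group awareness: the leader waits for its worker's tuples, the
# worker's fresh lock request queues behind the AccessExclusiveLock, and
# the strong locker conflicts with the leader's already-granted lock.
ungrouped = {
    "leader": {"worker"},
    "worker": {"strong_locker"},
    "strong_locker": {"leader"},
}
assert has_cycle(ungrouped)  # a deadlock the detector must be able to see

# With group locking, leader and worker count as one lock-holding entity,
# so the worker's request is granted immediately and no cycle forms.
grouped = {
    "group(leader+worker)": set(),
    "strong_locker": {"group(leader+worker)"},
}
assert not has_cycle(grouped)
```

Either outcome is consistent with the thread: the cycle can be broken by teaching the lock manager about the group, or simply detected and resolved by canceling one participant, as argued above.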