Re: group locking: incomplete patch, just for discussion - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: group locking: incomplete patch, just for discussion
Msg-id: CA+Tgmoa7NNngUANO4eknJ3Me_gVdTWxEqnimPCcSHoeg4AiFLQ@mail.gmail.com
In response to: Re: group locking: incomplete patch, just for discussion (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: group locking: incomplete patch, just for discussion
List: pgsql-hackers
On Thu, Nov 20, 2014 at 3:15 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> Haven't I responded to this a few times already?
>
> Not in a particularly convincing way at least.

That's not a very helpful response.

>> I see no way, even theoretically, that it can be sane for
>> CheckTableNotInUse() to succeed in a parallel context. Ever.
>
> It's not exactly inconceivable that you'd want a parallel
> CLUSTER/REINDEX/VACUUM or something similar. That'll require cooperating
> backends, true, but it's still a pretty sane case.
>
> You haven't really made your case why you want to allow access to AEL
> locked relations in a parallel context in the first place.

Well, what the heck was I doing here?

http://www.postgresql.org/message-id/CA+Tgmoa_+u-qCcFZ9zAVZFk+TGXbqQB1+0ktPZRGgPJ5PvrJVw@mail.gmail.com

I've *repeatedly* pointed out that if you *don't* allow this, stuff
breaks. And you haven't answered how you'd like to fix that. The only
answer you've given so far, that I can see, is that maybe we can foresee
all the locks that the child might take and not initiate parallelism if
any of them conflict with locks we already hold. But that's not a
reasonable approach; we have no reasonable way of predicting what locks
the child will need, and even if we did, things can change between plan
time and execution time.

>> If the deadlock detector would kill the processes anyway, it doesn't
>> matter, because CheckTableNotInUse() should do it first, so that we get
>> a better error and without waiting for deadlock_timeout. So any
>> scenario that's predicated on the assumption that CheckTableNotInUse()
>> will succeed in a parallel context is 100% unconvincing to me as an
>> argument for anything.
>
> Meh, there's plenty of cases where CheckTableNotInUse() isn't used where
> we still rely on locks actually working. And it's not just postgres
> code; this *will* break user code.

It would help if you'd provide some specific examples of the problems
you foresee. In the absence of specificity, it's hard to come to any
conclusions about whether your concerns are well-founded, or to propose
any way to overcome them.

> Your argument essentially is that we don't actually need to lock
> objects if the other side only reads. Yes, you add a condition or two
> on top (e.g. CheckTableNotInUse() fails), but that's not that much.

I don't think that's my argument at all. What I'm arguing is that
there's more than one reason for taking locks. Broadly, I think we take
some locks to preserve transaction isolation guarantees and others to
preserve data integrity. If you're adding or dropping a column, nobody
else had better be looking at that relation, because they might get
confused and crash the server or something. Backends, even cooperating
parallel backends, won't be able to cope with the tuple descriptor
changing under them. But once that operation is complete, there's no
reason why the *next* statement in that transaction can't see the
relation you just modified, even though it's still exclusively locked.
As long as we make sure that new parallel backends use our snapshot,
they'll see the right metadata for that relation and can safely access
it. We still can't let *other* people see that relation because the
ALTER TABLE is an uncommitted change, but that's no argument against
letting our own parallel workers look at it.

> If we want to do this we really need to forbid *any* modification to
> catalog/user tables while parallel operations are ongoing. In the
> primary and worker backends. I think that's going to end up being much
> more problematic/restrictive than not allowing non-cooperative
> parallelism after >= ShareUpdateExclusive. If we ever want to
> parallelize writing operations we'll regret this quite a bit.

I can't really follow how this follows from anything I said.

> Breaking the locking system - and that's what you're doing by removing
> the usual guarantees - seems just fundamentally wrong. Yes, I can't
> name all the dangers that I think are out there.
>
> The 'lets grant all the current locks at start of parallel query'
> approach seems quite a bit safer than always doing that during parallel
> query, don't get me wrong.

"Let's grant all the current locks at start of parallel query" is the
proposal I'm focused on right now. Based on your arguments and those of
others, I have long since abandoned the original proposal of having
locks within a group never conflict. It seemed like a good idea at the
time, but it would put way too much burden on the parallel workers to
coordinate their activity with each other, so it's dead. What I want to
establish at this time is what types of problems, if any, you or others
can foresee with "let's grant all the current locks at start of parallel
query" - because I sense considerable skepticism, but cannot put my
finger on anything concretely wrong with it.

> Btw, what are your thoughts about SERIALIZABLE and parallel query?
> Afaics that'd be pretty hard to support?

Hmm, I haven't thought about that. Since we never *wait* for SIREAD
locks, and there's no deadlock detection to worry about, we might be
able to finagle things so that the parallel backends just take the same
SIREAD locks the parent would have taken, using the parent's PGPROC.
But that could easily turn out to be unworkable, insufficient, or
otherwise brain-damaged.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
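A minimal sketch of the "grant all the current locks at start of parallel query" idea discussed in this message, reduced to a toy in-memory lock table. Every type and function name here (`Backend`, `inherit_leader_locks`, `can_acquire`, the two-mode conflict rule) is hypothetical, not PostgreSQL's actual lock manager; the point is only the shape of the proposal: workers copy the leader's held locks once at parallel start, while requests from backends outside the group conflict normally.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model: two lock modes, where AccessExclusiveLock conflicts with
 * everything. PostgreSQL's real conflict table has eight modes. */
typedef enum { ACCESS_SHARE, ACCESS_EXCLUSIVE } LockMode;

#define MAX_LOCKS 8

typedef struct { int relid; LockMode mode; } HeldLock;
typedef struct { HeldLock locks[MAX_LOCKS]; int nlocks; } Backend;

static bool modes_conflict(LockMode a, LockMode b)
{
    return a == ACCESS_EXCLUSIVE || b == ACCESS_EXCLUSIVE;
}

/* At parallel start, copy every lock the leader already holds into the
 * worker, so the worker can touch relations the leader locked earlier in
 * the transaction (e.g. a table the previous statement ALTERed). */
void inherit_leader_locks(Backend *worker, const Backend *leader)
{
    memcpy(worker->locks, leader->locks, sizeof leader->locks);
    worker->nlocks = leader->nlocks;
}

/* Lock requests from a backend *outside* the group still conflict
 * normally against whatever the holder has. */
bool can_acquire(int relid, LockMode mode, const Backend *holder)
{
    for (int i = 0; i < holder->nlocks; i++)
        if (holder->locks[i].relid == relid &&
            modes_conflict(mode, holder->locks[i].mode))
            return false;
    return true;
}
```

Under this model the worker never re-requests the leader's AccessExclusiveLock (it already holds a copy), so there is no window for a conflict or deadlock on it, while an unrelated backend asking for even AccessShareLock on the same relation is still blocked.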