Re: parallel mode and parallel contexts - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: parallel mode and parallel contexts |
Date | |
Msg-id | CA+TgmoYn4zoyUN-Yb-9dVFFAu5mNgFXcifCXZ=8-Qs1Zqy2y2Q@mail.gmail.com Whole thread Raw |
In response to | Re: parallel mode and parallel contexts (Andres Freund <andres@2ndquadrant.com>) |
Responses |
Re: parallel mode and parallel contexts
|
List | pgsql-hackers |
On Wed, Mar 18, 2015 at 7:10 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2015-03-18 12:02:07 -0400, Robert Haas wrote: >> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c >> index cb6f8a3..173f0ba 100644 >> --- a/src/backend/access/heap/heapam.c >> +++ b/src/backend/access/heap/heapam.c >> @@ -2234,6 +2234,17 @@ static HeapTuple >> heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid, >> CommandId cid, int options) >> { >> + /* >> + * For now, parallel operations are required to be strictly read-only. >> + * Unlike heap_update() and heap_delete(), an insert should never create >> + * a combo CID, so it might be possible to relax this restrction, but >> + * not without more thought and testing. >> + */ >> + if (IsInParallelMode()) >> + ereport(ERROR, >> + (errcode(ERRCODE_INVALID_TRANSACTION_STATE), >> + errmsg("cannot insert tuples during a parallel operation"))); >> + > > Minor nitpick: Should we perhaps move the ereport to a separate > function? Akin to PreventTransactionChain()? Seems a bit nicer to not > have the same message everywhere. Well, it's not actually the same message. They're all a bit different. Or mostly all of them. And the variable part is not a command name, as in the PreventTransactionChain() case, so it would affect translatability if we did that. I don't actually think this is a big deal. >> +void >> +DestroyParallelContext(ParallelContext *pcxt) >> +{ >> + int i; >> + >> + /* >> + * Be careful about order of operations here! We remove the parallel >> + * context from the list before we do anything else; otherwise, if an >> + * error occurs during a subsequent step, we might try to nuke it again >> + * from AtEOXact_Parallel or AtEOSubXact_Parallel. >> + */ >> + dlist_delete(&pcxt->node); > > Hm. I'm wondering about this. What if we actually fail below? Like in > dsm_detach() or it's callbacks? Then we'll enter abort again at some > point (during proc_exit at the latest) but will not wait for the child > workers. Right? Right. It's actually pretty hard to recover when things that get called in the abort path fail, which is why dsm_detach() and shm_mq_detach_callback() try pretty hard to only do things that are no-fail. For example, failing to detach a dsm gives a WARNING, not an ERROR. Now, I did make some attempt in previous patches to ensure that proc_exit() has some ability to recover even if a callback fails (cf 001a573a2011d605f2a6e10aee9996de8581d099) but I'm not too sure how useful that's going to be in practice. I'm open to ideas on how to improve this. >> +void >> +AtEOXact_Parallel(bool isCommit) >> +{ >> + while (!dlist_is_empty(&pcxt_list)) >> + { >> + ParallelContext *pcxt; >> + >> + pcxt = dlist_head_element(ParallelContext, node, &pcxt_list); >> + if (isCommit) >> + elog(WARNING, "leaked parallel context"); >> + DestroyParallelContext(pcxt); >> + } >> +} > > Is there any reason to treat the isCommit case as a warning? To me that > sounds like a pretty much guaranteed programming error. If your're going > to argue that a couple resource leakage warnings do similarly: I don't > think that counts for that much. For one e.g. a leaked tupdesc has much > less consequences, for another IIRC those warnings were introduced > after the fact. Yeah, I'm going to argue that. A leaked parallel context is pretty harmless if there are no workers associated with it. If there are, it's less harmless, of course, but it doesn't seem to me that making that an ERROR buys us much. I mean, the transaction is going to end either way, and a message is going to get printed either way, and it's a bug either way, so whatever. >> + * When running as a parallel worker, we place only a single >> + * TransactionStateData is placed on the parallel worker's >> + * state stack, > > 'we place .. is placed' Will fix in next update. >> + if (s->parallelModeLevel == 0) >> + CheckForRetainedParallelLocks(); >> +} > > Hm. Is it actually ok for nested parallel mode to retain locks on their > own? Won't that pose a problem? Or did you do it that way just because > we don't have more state? If so that deserves a comment explaining that > htat's the case and why that's acceptable. The only time it's really a problem to retain locks is if you are doing WaitForParallelWorkersToFinish(). This is pretty much just a belt-and-suspenders check to make it easier to notice that you've goofed. But, sure, removing the if part makes sense. I'll do that in the next update. >> @@ -1769,6 +1904,9 @@ CommitTransaction(void) >> { >> TransactionState s = CurrentTransactionState; >> TransactionId latestXid; >> + bool parallel; >> + >> + parallel = (s->blockState == TBLOCK_PARALLEL_INPROGRESS); > >> + /* If we might have parallel workers, clean them up now. */ >> + if (IsInParallelMode()) >> + { >> + CheckForRetainedParallelLocks(); >> + AtEOXact_Parallel(true); >> + s->parallelModeLevel = 0; >> + } > > 'parallel' looks strange when we're also, rightly so, do stuff like > checking IsInParallelMode(). How about naming it is_parallel_worker or > something? Yeah, makes sense. Will do that, too. > Sorry, ran out of concentration here. It's been a long day. Thanks for the review so far. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: