Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers
From | Greg Nancarrow |
---|---|
Subject | Re: Parallel INSERT (INTO ... SELECT ...) |
Date | |
Msg-id | CAJcOf-excOA7qDEDeQJygXy7211dAHkiFBFUqTETqcOvsph9dg@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel INSERT (INTO ... SELECT ...) (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Parallel INSERT (INTO ... SELECT ...)
|
List | pgsql-hackers |
On Fri, Oct 9, 2020 at 8:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > 0001-InsertParallelSelect > 1. > ParallelContext *pcxt; > > + /* > + * We need to avoid an attempt on INSERT to assign a > + * FullTransactionId whilst in parallel mode (which is in > + * effect due to the underlying parallel query) - so the > + * FullTransactionId is assigned here. Parallel mode must > + * be temporarily escaped in order for this to be possible. > + * The FullTransactionId will be included in the transaction > + * state that is serialized in the parallel DSM. > + */ > + if (estate->es_plannedstmt->commandType == CMD_INSERT) > + { > + Assert(IsInParallelMode()); > + ExitParallelMode(); > + GetCurrentFullTransactionId(); > + EnterParallelMode(); > + } > + > > This looks like a hack to me. I think you are doing this to avoid the > parallel mode checks in GetNewTransactionId(), right? Yes, agreed, is a hack to avoid that (mind you, it's not exactly great that ExecutePlan() sets parallel-mode for the entire plan execution). Also, did not expect that to necessarily remain in a final patch. >If so, I have > already mentioned above [1] that we can change it so that we disallow > assigning xids for parallel workers only. The same is true for the > check in ExecGatherMerge. Do you see any problem with that suggestion? > No, should be OK I guess, but will update and test to be sure. > 2. > @@ -337,7 +337,7 @@ standard_planner(Query *parse, const char > *query_string, int cursorOptions, > */ > if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 && > IsUnderPostmaster && > - parse->commandType == CMD_SELECT && > + (parse->commandType == CMD_SELECT || parse->commandType == CMD_INSERT) && > !parse->hasModifyingCTE && > max_parallel_workers_per_gather > 0 && > !IsParallelWorker()) > > I think the comments above this need to be updated especially the part > where we says:"Note that we do allow CREATE TABLE AS, SELECT INTO, and > CREATE MATERIALIZED VIEW to use parallel plans, but as of now, only > the leader backend writes into a completely new table.". Don't we need > to include Insert also? Yes, Insert needs to be mentioned somewhere there. > > 3. > @@ -371,6 +371,7 @@ standard_planner(Query *parse, const char > *query_string, int cursorOptions, > * parallel-unsafe, or else the query planner itself has a bug. > */ > glob->parallelModeNeeded = glob->parallelModeOK && > + (parse->commandType == CMD_SELECT) && > (force_parallel_mode != FORCE_PARALLEL_OFF); > > Why do you need this change? The comments above this code should be > updated to reflect this change. I think for the same reason the below > code seems to be modified but I don't understand the reason for the > below change as well, also it is better to update the comments for > this as well. > OK, I will update the comments for this. Basically, up to now, the "force_parallel_mode" has only ever operated on a SELECT. But since we are now allowing CMD_INSERT to be assessed for parallel mode too, we need to prevent the force_parallel_mode logic from sticking a Gather node over the top of arbitrary INSERTs and causing them to be run in parallel. Not all INSERTs are suitable for parallel operation, and also there are further considerations for parallel-safety for INSERTs compared to SELECT. INSERTs can also trigger UPDATEs. If we need to support force_parallel_mode for INSERT, more work will need to be done. > @@ -425,7 +426,7 @@ standard_planner(Query *parse, const char > *query_string, int cursorOptions, > * Optionally add a Gather node for testing purposes, provided this is > * actually a safe thing to do. > */ > - if (force_parallel_mode != FORCE_PARALLEL_OFF && top_plan->parallel_safe) > + if (force_parallel_mode != FORCE_PARALLEL_OFF && parse->commandType > == CMD_SELECT && top_plan->parallel_safe) > { > Gather *gather = makeNode(Gather); > > [1] - https://www.postgresql.org/message-id/CAA4eK1%2BE-pM0U6qw7EOF0yO0giTxdErxoJV9xTqN%2BLo9zdotFQ%40mail.gmail.com > Regards, Greg Nancarrow Fujitsu Australia
pgsql-hackers by date: