Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers
| From | Amit Langote |
|---|---|
| Subject | Re: Parallel INSERT (INTO ... SELECT ...) |
| Date | |
| Msg-id | CA+HiwqHJ3sp9hAZF1vM3F55=8kuDnMkESrTACN5A4TMw6JznUg@mail.gmail.com |
| In response to | Re: Parallel INSERT (INTO ... SELECT ...) (Greg Nancarrow <gregn4422@gmail.com>) |
| Responses | Re: Parallel INSERT (INTO ... SELECT ...) (two replies) |
| List | pgsql-hackers |
Greg, all,

Thanks a lot for your work on this.

On Mon, Feb 8, 2021 at 3:53 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> Posting an updated set of patches.

I've been looking at these patches, initially with the intention of reviewing mainly any partitioning-related concerns, but I have some general thoughts as well, mostly concerning patches 0001 and 0002.

* I've seen review comments on this thread suggesting that whatever max_parallel_hazard_for_modify() does would better be integrated into max_parallel_hazard(), such that there's no particular need for the former to exist. For example, the following:

+    /*
+     * UPDATE is not currently supported in parallel-mode, so prohibit
+     * INSERT...ON CONFLICT...DO UPDATE...
+     * In order to support update, even if only in the leader, some
+     * further work would need to be done. A mechanism would be needed
+     * for sharing combo-cids between leader and workers during
+     * parallel-mode, since for example, the leader might generate a
+     * combo-cid and it needs to be propagated to the workers.
+     */
+    if (parse->onConflict != NULL && parse->onConflict->action == ONCONFLICT_UPDATE)
+        return PROPARALLEL_UNSAFE;

could be placed in the following block in max_parallel_hazard():

    /*
     * When we're first invoked on a completely unplanned tree, we must
     * recurse into subqueries so as to locate parallel-unsafe constructs
     * anywhere in the tree.
     */
    else if (IsA(node, Query))
    {
        Query      *query = (Query *) node;

        /* SELECT FOR UPDATE/SHARE must be treated as unsafe */
        if (query->rowMarks != NULL)
        {
            context->max_hazard = PROPARALLEL_UNSAFE;
            return true;
        }

Furthermore, the following:

+    rte = rt_fetch(parse->resultRelation, parse->rtable);
+
+    /*
+     * The target table is already locked by the caller (this is done in the
+     * parse/analyze phase).
+     */
+    rel = table_open(rte->relid, NoLock);
+    (void) rel_max_parallel_hazard_for_modify(rel, parse->commandType, &context);
+    table_close(rel, NoLock);

can itself be wrapped in a function that's called from max_parallel_hazard() by adding a new block for RangeTblEntry nodes and passing QTW_EXAMINE_RTES_BEFORE to query_tree_walker().

That brings me to this part of the hunk:

+    /*
+     * If there is no underlying SELECT, a parallel table-modification
+     * operation is not possible (nor desirable).
+     */
+    hasSubQuery = false;
+    foreach(lc, parse->rtable)
+    {
+        rte = lfirst_node(RangeTblEntry, lc);
+        if (rte->rtekind == RTE_SUBQUERY)
+        {
+            hasSubQuery = true;
+            break;
+        }
+    }
+    if (!hasSubQuery)
+        return PROPARALLEL_UNSAFE;

The justification for this, given in:

https://www.postgresql.org/message-id/CAJcOf-dF9ohqub_D805k57Y_AuDLeAQfvtaax9SpwjTSEVdiXg%40mail.gmail.com

seems to be that the failure of a test case in the partition-concurrent-attach isolation suite is prevented if finding no subquery RTEs in the query is flagged as parallel unsafe, which in turn stops max_parallel_hazard_for_modify() from locking partitions for safety checks in such cases. But it feels unprincipled to carry this code just to work around one specific failing test case. I'd rather edit the failing test case to disable parallel execution, as Tsunakawa-san suggested.

* Regarding function names:

+static bool trigger_max_parallel_hazard_for_modify(TriggerDesc *trigdesc,
+                                                   max_parallel_hazard_context *context);
+static bool index_expr_max_parallel_hazard_for_modify(Relation rel,
+                                                      max_parallel_hazard_context *context);
+static bool domain_max_parallel_hazard_for_modify(Oid typid,
+                                                  max_parallel_hazard_context *context);
+static bool rel_max_parallel_hazard_for_modify(Relation rel,
+                                               CmdType command_type,
+                                               max_parallel_hazard_context *context);

IMO, it would be better to name these target_rel_trigger_max_parallel_hazard(), target_rel_index_max_parallel_hazard(), etc.,
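To illustrate why folding these checks into max_parallel_hazard()'s existing traversal is attractive, here is a minimal standalone sketch of the pattern (plain C, not PostgreSQL code; Hazard, WalkerContext, hazard_walker and the toy Node type are all invented for illustration): one recursive walker accumulates the worst hazard seen into a context struct, so each new safety rule becomes a single test inside that walker rather than a second, separate traversal of the tree.

```c
#include <stddef.h>

/* Hazard levels, ordered from safest to least safe. */
typedef enum { HAZARD_SAFE, HAZARD_RESTRICTED, HAZARD_UNSAFE } Hazard;

typedef struct Node
{
    Hazard       hazard;        /* hazard contributed by this node itself */
    struct Node *children[2];   /* sub-nodes, may be NULL */
} Node;

typedef struct
{
    Hazard max_hazard;          /* worst hazard seen so far */
} WalkerContext;

/* Record a hazard; return nonzero (stop walking) once nothing worse exists. */
static int
max_hazard_test(Hazard h, WalkerContext *ctx)
{
    if (h > ctx->max_hazard)
        ctx->max_hazard = h;
    return ctx->max_hazard == HAZARD_UNSAFE;
}

/*
 * One walker sees every node, so a new check (say, the ON CONFLICT DO
 * UPDATE case) is just another max_hazard_test() call placed inside it.
 */
static int
hazard_walker(Node *node, WalkerContext *ctx)
{
    if (node == NULL)
        return 0;
    if (max_hazard_test(node->hazard, ctx))
        return 1;               /* abort traversal early */
    for (int i = 0; i < 2; i++)
        if (hazard_walker(node->children[i], ctx))
            return 1;
    return 0;
}
```

The early-abort return mirrors how query_tree_walker-style walkers stop as soon as the context can no longer get worse, which is also why a per-RTE check slots in naturally via a new RangeTblEntry block.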
rather than have _for_modify at the end of these names, to better connote that they check the parallel safety of applying the modify operation to a given target relation. Also, put these prototypes just below that of max_parallel_hazard() to keep related things close by.

Attached please see v15_delta.diff showing the changes suggested above.

* I suspect that the following is broken in light of concurrent attachment of partitions:

+
+    /* Recursively check each partition ... */
+    pdesc = RelationGetPartitionDesc(rel);

I think we'd need to use CreatePartitionDirectory() and retrieve the PartitionDesc using PartitionDirectoryLookup(), something we already do when opening partitions for SELECT planning.

* I think that the concerns raised by Tsunakawa-san in:

https://www.postgresql.org/message-id/TYAPR01MB2990CCB6E24B10D35D28B949FEA30%40TYAPR01MB2990.jpnprd01.prod.outlook.com

regarding how this interacts with plancache.c deserve a look. Specifically, a plan that uses parallel insert may fail to be invalidated when partitions are altered directly (that is, without altering their root parent). That would be because we are not adding partition OIDs to PlannerGlobal.invalItems, despite making a plan that is based on checking their properties.
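To make the invalidation point concrete, here is a toy sketch (plain C, not PostgreSQL's actual plancache.c; CachedPlan, plan_add_dependency and plan_invalidate_callback are invented for illustration) of the idea behind recording per-relation dependency OIDs on a cached plan: the plan is dropped only when an altered relation's OID was actually recorded, so a plan that records only the root parent's OID wrongly survives a change to an individual partition, which is the behaviour the session below demonstrates.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;

#define MAX_DEPS 8

/*
 * A cached plan remembers the OID of every relation whose properties the
 * planner inspected; a change to any of them should invalidate the plan.
 */
typedef struct
{
    bool parallel;              /* was a parallel plan chosen? */
    Oid  dep_oids[MAX_DEPS];    /* relations this plan depends on */
    int  ndeps;
    bool valid;
} CachedPlan;

static void
plan_add_dependency(CachedPlan *plan, Oid relid)
{
    assert(plan->ndeps < MAX_DEPS);
    plan->dep_oids[plan->ndeps++] = relid;
}

/* Called when relation 'relid' is altered (e.g. a trigger is added). */
static void
plan_invalidate_callback(CachedPlan *plan, Oid relid)
{
    for (int i = 0; i < plan->ndeps; i++)
        if (plan->dep_oids[i] == relid)
            plan->valid = false;
}
```

Under this scheme, a plan that recorded OIDs for every partition it checked would be discarded when a trigger is added to one of them; a plan that recorded only the parent's OID would stay marked valid and could keep running in parallel despite the new parallel-unsafe trigger.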
See this (tested with all patches applied!):

create table rp (a int) partition by range (a);
create table rp1 partition of rp for values from (minvalue) to (0);
create table rp2 partition of rp for values from (0) to (maxvalue);
create table foo (a) as select generate_series(1, 1000000);
prepare q as insert into rp select * from foo where a%2 = 0;
explain execute q;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Gather  (cost=1000.00..13041.54 rows=5642 width=0)
   Workers Planned: 2
   ->  Insert on rp  (cost=0.00..11477.34 rows=0 width=0)
         ->  Parallel Seq Scan on foo  (cost=0.00..11477.34 rows=2351 width=4)
               Filter: ((a % 2) = 0)
(5 rows)

-- create a parallel unsafe trigger (that's actually marked so) directly on a partition
create or replace function make_table () returns trigger language plpgsql as
$$ begin create table bar(); return null; end; $$ parallel unsafe;
create trigger ai_rp2 after insert on rp2 for each row execute function make_table();
CREATE TRIGGER

-- plan still parallel
explain execute q;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Gather  (cost=1000.00..13041.54 rows=5642 width=0)
   Workers Planned: 2
   ->  Insert on rp  (cost=0.00..11477.34 rows=0 width=0)
         ->  Parallel Seq Scan on foo  (cost=0.00..11477.34 rows=2351 width=4)
               Filter: ((a % 2) = 0)
(5 rows)

-- and because it is
execute q;
ERROR:  cannot start commands during a parallel operation
CONTEXT:  SQL statement "create table bar()"
PL/pgSQL function make_table() line 1 at SQL statement

-- OTOH, altering parent correctly discards the parallel plan
create trigger ai_rp after insert on rp for each row execute function make_table();
explain execute q;
                           QUERY PLAN
----------------------------------------------------------------
 Insert on rp  (cost=0.00..19425.00 rows=0 width=0)
   ->  Seq Scan on foo  (cost=0.00..19425.00 rows=5000 width=4)
         Filter: ((a % 2) = 0)
(3 rows)

It's fair to argue that it would rarely make sense to use
PREPARE for bulk loads, but we need to tighten things up a bit here regardless.

--
Amit Langote
EDB: http://www.enterprisedb.com
Attachment: v15_delta.diff