Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers
| From | Greg Nancarrow |
|---|---|
| Subject | Re: Parallel INSERT (INTO ... SELECT ...) |
| Date | |
| Msg-id | CAJcOf-c2TrQFdhBgHMLs7HmzG1iLAUc3vnDmzYkmxek8Zjhfjg@mail.gmail.com Whole thread |
| In response to | Re: Parallel INSERT (INTO ... SELECT ...) (Amit Langote <amitlangote09@gmail.com>) |
| Responses |
Re: Parallel INSERT (INTO ... SELECT ...)
|
| List | pgsql-hackers |
On Wed, Feb 17, 2021 at 12:19 AM Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Mon, Feb 15, 2021 at 4:39 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > On Sat, Feb 13, 2021 at 12:17 AM Amit Langote <amitlangote09@gmail.com> wrote:
> > > On Thu, Feb 11, 2021 at 4:43 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > > > Actually, I tried adding the following in the loop that checks the
> > > > parallel-safety of each partition and it seemed to work:
> > > >
> > > > glob->relationOids =
> > > > lappend_oid(glob->relationOids, pdesc->oids[i]);
> > > >
> > > > Can you confirm, is that what you were referring to?
> > >
> > > Right. I had mistakenly mentioned PlannerGlobal.invalItems, sorry.
> > >
> > > Although it gets the job done, I'm not sure if manipulating
> > > relationOids from max_parallel_hazard() or its subroutines is okay,
> > > but I will let the committer decide that. As I mentioned above, the
> > > person who designed this decided for some reason that it is
> > > extract_query_dependencies()'s job to populate
> > > PlannerGlobal.relationOids/invalItems.
> >
> > Yes, it doesn't really seem right doing it within max_parallel_hazard().
> > I tried doing it in extract_query_dependencies() instead - see
> > attached patch - and it seems to work, but I'm not sure if there might
> > be any unintended side-effects.
>
> One issue I see with the patch is that it fails to consider
> multi-level partitioning, because it's looking up partitions only in
> the target table's PartitionDesc and no other.
>
> @@ -3060,8 +3066,36 @@ extract_query_dependencies_walker(Node *node,
> PlannerInfo *context)
> RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
>
> if (rte->rtekind == RTE_RELATION)
> - context->glob->relationOids =
> - lappend_oid(context->glob->relationOids, rte->relid);
> + {
> + PlannerGlobal *glob;
> +
> + glob = context->glob;
> + glob->relationOids =
> + lappend_oid(glob->relationOids, rte->relid);
> + if (query->commandType == CMD_INSERT &&
> + rte->relkind == RELKIND_PARTITIONED_TABLE)
>
> The RTE whose relkind is being checked here may not be the INSERT
> target relation's RTE, even though that's perhaps always true today.
> So, I suggest to pull the new block out of the loop over rtable and
> perform its deeds on the result RTE explicitly fetched using
> rt_fetch(), preferably using a separate recursive function. I'm
> thinking something like the attached revised version.
>
>
Thanks. Yes, I'd forgotten about the fact a partition may itself be
partitioned, so it needs to be recursive (like in the parallel-safety
checks).
Your revised version seems OK, though I do have a concern:
Is the use of "table_close(rel, NoLock)'' intentional? That will keep
the lock (lockmode) until end-of-transaction.
Regards,
Greg Nancarrow
Fujitsu Australia
pgsql-hackers by date: