Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Amit Langote
Subject Re: Parallel INSERT (INTO ... SELECT ...)
Msg-id CA+HiwqHo57OWs1s3mYpq9uHx_Hw2MRtp_jzyVnKn0ySQdQoL1Q@mail.gmail.com
In response to Re: Parallel INSERT (INTO ... SELECT ...)  (Greg Nancarrow <gregn4422@gmail.com>)
Responses Re: Parallel INSERT (INTO ... SELECT ...)  (Greg Nancarrow <gregn4422@gmail.com>)
List pgsql-hackers
On Mon, Feb 15, 2021 at 4:39 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> On Sat, Feb 13, 2021 at 12:17 AM Amit Langote <amitlangote09@gmail.com> wrote:
> > On Thu, Feb 11, 2021 at 4:43 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > > Actually, I tried adding the following in the loop that checks the
> > > parallel-safety of each partition and it seemed to work:
> > >
> > >             glob->relationOids =
> > >                     lappend_oid(glob->relationOids, pdesc->oids[i]);
> > >
> > > Can you confirm, is that what you were referring to?
> >
> > Right.  I had mistakenly mentioned PlannerGlobal.invalItems, sorry.
> >
> > Although it gets the job done, I'm not sure if manipulating
> > relationOids from max_parallel_hazard() or its subroutines is okay,
> > but I will let the committer decide that.  As I mentioned above, the
> > person who designed this decided for some reason that it is
> > extract_query_dependencies()'s job to populate
> > PlannerGlobal.relationOids/invalItems.
>
> Yes, it doesn't really seem right doing it within max_parallel_hazard().
> I tried doing it in extract_query_dependencies() instead - see
> attached patch - and it seems to work, but I'm not sure if there might
> be any unintended side-effects.

One issue I see with the patch is that it fails to consider
multi-level partitioning: it looks up partitions only in the target
table's PartitionDesc and never descends into the PartitionDesc of a
partition that is itself partitioned.

@@ -3060,8 +3066,36 @@ extract_query_dependencies_walker(Node *node, PlannerInfo *context)
            RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);

            if (rte->rtekind == RTE_RELATION)
-               context->glob->relationOids =
-                   lappend_oid(context->glob->relationOids, rte->relid);
+           {
+               PlannerGlobal   *glob;
+
+               glob = context->glob;
+               glob->relationOids =
+                   lappend_oid(glob->relationOids, rte->relid);
+               if (query->commandType == CMD_INSERT &&
+                                   rte->relkind == RELKIND_PARTITIONED_TABLE)

The RTE whose relkind is checked here may not be the INSERT target
relation's RTE, even though that is perhaps always the case today.
So I suggest pulling the new block out of the loop over rtable and
applying it to the result relation's RTE, fetched explicitly using
rt_fetch(), preferably in a separate recursive function.  I'm
thinking of something like the attached revised version.
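
To make the multi-level concern concrete, here is a toy, standalone
sketch.  It is not PostgreSQL code and not what the attached patch
does: ToyRel, OidList, and both collect_* functions are hypothetical
stand-ins for a relation with its PartitionDesc and for
glob->relationOids.  It only contrasts a one-level lookup with a
recursive walk over the partition tree.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;

#define MAX_PARTS 8
#define MAX_OIDS  64

/* Hypothetical stand-in for a relation plus its partition descriptor. */
typedef struct ToyRel
{
    Oid         oid;
    int         nparts;             /* 0 for a leaf partition */
    const struct ToyRel *parts[MAX_PARTS];
} ToyRel;

/* Hypothetical stand-in for the list of relation OIDs a plan depends on. */
typedef struct OidList
{
    int         n;
    Oid         oids[MAX_OIDS];
} OidList;

static void
oidlist_append(OidList *list, Oid oid)
{
    assert(list->n < MAX_OIDS);
    list->oids[list->n++] = oid;
}

/* One-level lookup: records only the direct partitions of rel. */
static void
collect_direct_partitions(const ToyRel *rel, OidList *out)
{
    for (int i = 0; i < rel->nparts; i++)
        oidlist_append(out, rel->parts[i]->oid);
}

/* Recursive walk: records every partition at every level below rel. */
static void
collect_all_partitions(const ToyRel *rel, OidList *out)
{
    for (int i = 0; i < rel->nparts; i++)
    {
        const ToyRel *part = rel->parts[i];

        oidlist_append(out, part->oid);
        if (part->nparts > 0)
            collect_all_partitions(part, out);
    }
}
```

With a root that has one leaf partition and one sub-partitioned
partition holding two leaves, the one-level version records two OIDs
and misses the two leaves under the sub-partitioned child, while the
recursive version records all four.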



--
Amit Langote
EDB: http://www.enterprisedb.com
