On Sat, Jan 16, 2021 at 2:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Jan 15, 2021 at 6:45 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > On Fri, Jan 15, 2021 at 9:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > We want to do this for Inserts where only Select can be parallel and
> > > Inserts will always be done by the leader backend. This is actually
> > > the case we first want to implement.
> >
> > Sorry, I haven't looked at the linked threads and the latest patches
> > there closely enough yet, so I may be misreading this, but if the
> > inserts will always be done by the leader backend as you say, then why
> > does the planner need to be checking the parallel safety of the
> > *target* table's expressions?
> >
>
> The reason is that once we enter parallel-mode we can't allow
> parallel-unsafe things (like allocation of new CIDs, XIDs, etc.). We
> enter the parallel-mode at the beginning of the statement execution,
> see ExecutePlan(). So, the Insert will be performed in parallel-mode
> even though it happens in the leader backend. It is not possible that
> we finish getting all the tuples from the gather node first and then
> start inserting. Even if we somehow find a way to make this work,
> the checks being discussed will still be required to make inserts
> parallel (where inserts will be performed by workers) which is
> actually the next patch in the thread I mentioned in the previous
> email.
>
> Does this answer your question?

Yes, thanks for the explanation. I kind of figured that doing the
insert part itself in parallel using workers would be a part of the
end goal of this work, although that didn't come across immediately.
It's a bit unfortunate that the parallel safety check of the
individual partitions cannot be deferred until it's known that a given
partition will be affected by the command at all. Will we need
fundamental changes to how parallel query works to make that possible?
If so, have such options been considered in these projects? If such
changes are not possible in the short term, like for v14, we should at
least try to make sure that the eager checking of all partitions is
only performed if using parallelism is possible at all.
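
To illustrate that last point, here is a toy sketch in Python. It is not
PostgreSQL code, and every name in it (Partition, target_is_parallel_safe,
can_use_parallel_plan, parallelism_possible) is made up for illustration.
The assumption is that some cheap test can tell whether parallelism is
possible at all, for example max_parallel_workers_per_gather being 0, and
that the walk over all partitions only happens after that test passes:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Partition:
    name: str
    # Hypothetical flag standing in for the result of inspecting the
    # partition's expressions for parallel-unsafe functions.
    parallel_safe: bool


def target_is_parallel_safe(partitions: List[Partition], checked: List[str]) -> bool:
    """Eagerly check every partition; record each one visited in `checked`."""
    for part in partitions:
        checked.append(part.name)
        if not part.parallel_safe:
            return False  # one unsafe partition rules out parallel mode
    return True


def can_use_parallel_plan(parallelism_possible: bool,
                          partitions: List[Partition],
                          checked: List[str]) -> bool:
    # Cheap gate first: if parallelism is ruled out anyway, skip the
    # per-partition walk entirely.
    if not parallelism_possible:
        return False
    return target_is_parallel_safe(partitions, checked)
```

With the gate in place, a statement that cannot use parallelism anyway
never pays for opening and checking each partition.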

I will try to take a look at the patches themselves to see if there's
something I know that will help.

--
Amit Langote
EDB: http://www.enterprisedb.com