Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id CAFiTN-sxxr4k2nvxjqMNxL-kjSgijzH3+xZZuPv_O4-1pfaruw@mail.gmail.com
Whole thread Raw
In response to Re: Parallel INSERT (INTO ... SELECT ...)  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Parallel INSERT (INTO ... SELECT ...)
List pgsql-hackers
On Wed, Dec 9, 2020 at 4:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 9, 2020 at 2:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Dec 9, 2020 at 10:11 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > >
> > > On Wed, Dec 9, 2020 at 1:35 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > Most of the code present in
> > > > v9-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT.patch is
> > > > applicable for parallel copy patch also. The patch in this thread
> > > > handles the check for PROPARALLEL_UNSAFE, we could slightly make it
> > > > generic by handling like the comments below, that way this parallel
> > > > safety checks can be used based on the value set in
> > > > max_parallel_hazard_context. There is nothing wrong with the changes,
> > > > I'm providing these comments so that this patch can be generalized for
> > > > parallel checks and the same can also be used by parallel copy.
> > >
> > > Hi Vignesh,
> > >
> > > You are absolutely right in pointing that out, the code was taking
> > > short-cuts knowing that for Parallel Insert,
> > > "max_parallel_hazard_context.max_interesting" had been set to
> > > PROPARALLEL_UNSAFE, which doesn't allow that code to be generically
> > > re-used by other callers.
> > >
> > > I've attached a new set of patches that includes your suggested improvements.
> >
> > I was going through v10-0001 patch where we are parallelizing only the
> > select part.
> >
> > + /*
> > + * UPDATE is not currently supported in parallel-mode, so prohibit
> > + * INSERT...ON CONFLICT...DO UPDATE...
> > + */
> > + if (parse->onConflict != NULL && parse->onConflict->action ==
> > ONCONFLICT_UPDATE)
> > + return PROPARALLEL_UNSAFE;
> >
> > I understand that we can now allow updates from the worker, but what
> > is the problem if we allow the parallel select even if there is an
> > update in the leader?
> >
>
> I think we can't allow update even in leader without having a
> mechanism for a shared combocid table. Right now, we share the
> ComboCids at the beginning of the parallel query and then never change
> it during the parallel query but if we allow updates in the leader
> backend which can generate a combocid then we need a mechanism to
> propagate that change. Does this make sense?
>

Okay, got it.  Basically, ONCONFLICT_UPDATE might run inside some
transaction block and there is a possibility that update may try to
update the same tuple is previously inserted by the same transaction
and in that case, it will generate the combo cid.  Thanks for
clarifying.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Parallel copy
Next
From: Gilles Darold
Date:
Subject: Re: MultiXact\SLRU buffers configuration