Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id CAA4eK1Lefef0BjsZ5ZqhjoQZXVnwYkS0qdaup+Sv983TkT_8qg@mail.gmail.com
Whole thread Raw
In response to Re: Parallel INSERT (INTO ... SELECT ...)  (Greg Nancarrow <gregn4422@gmail.com>)
List pgsql-hackers
On Fri, Oct 9, 2020 at 5:54 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Oct 9, 2020 at 8:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > + /*
> > + * We need to avoid an attempt on INSERT to assign a
> > + * FullTransactionId whilst in parallel mode (which is in
> > + * effect due to the underlying parallel query) - so the
> > + * FullTransactionId is assigned here. Parallel mode must
> > + * be temporarily escaped in order for this to be possible.
> > + * The FullTransactionId will be included in the transaction
> > + * state that is serialized in the parallel DSM.
> > + */
> > + if (estate->es_plannedstmt->commandType == CMD_INSERT)
> > + {
> > + Assert(IsInParallelMode());
> > + ExitParallelMode();
> > + GetCurrentFullTransactionId();
> > + EnterParallelMode();
> > + }
> > +
> >
> > This looks like a hack to me. I think you are doing this to avoid the
> > parallel mode checks in GetNewTransactionId(), right? If so, I have
> > already mentioned above [1] that we can change it so that we disallow
> > assigning xids for parallel workers only. The same is true for the
> > check in ExecGatherMerge. Do you see any problem with that suggestion?
> >
>
> Actually, there is a problem.
> If I remove that "hack", and change the code in GetNewTransactionId()
> to disallow xid assignment for parallel workers only, then there is
> also similar code in AssignTransactionId() which gets called.
>

I don't think workers need to call AssignTransactionId(), before that
the transactionid passed from leader should be set in
CurrentTransactionState. Why
GetCurrentTransactionId()/GetCurrentFullTransactionId(void) needs to
call AssignTransactionId() when called from worker?

> GetCurrentFullTransactionId() must be called in the leader, somewhere
> (and will be included in the transaction state that is serialized in
> the parallel DSM).
>

Yes, it should have done in the leader and then it should have been
set in the workers via StartParallelWorkerTransaction before we do any
actual operation. If that happens then GetCurrentTransactionId() won't
need to call AssignTransactionId().

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Transactions involving multiple postgres foreign servers, take 2
Next
From: Amit Kapila
Date:
Subject: Re: Parallel INSERT (INTO ... SELECT ...)