Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id CAA4eK1+E-pM0U6qw7EOF0yO0giTxdErxoJV9xTqN+Lo9zdotFQ@mail.gmail.com
Whole thread Raw
In response to Re: Parallel INSERT (INTO ... SELECT ...)  (Greg Nancarrow <gregn4422@gmail.com>)
List pgsql-hackers
On Fri, Sep 25, 2020 at 9:23 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Sep 25, 2020 at 10:17 PM Amit Kapila <amit.kapila16@g
>
> As it turns out, I think I have solved the commandId issue (and almost
> the xid issue) by realising that both the xid and cid are ALREADY
> being included as part of the serialized transaction state in the
> Parallel DSM. So actually I don't believe that there is any need for
> separately passing them in the DSM, and having to use those
> AssignXXXXForWorker() functions in the worker code - not even in the
> Parallel Copy case (? - need to check). GetCurrentCommandId(true) and
> GetFullTransactionId() need to be called prior to Parallel DSM
> initialization, so they are included in the serialized transaction
> state.
> I just needed to add a function to set currentCommandIdUsed=true in
> the worker initialization (for INSERT case) and make a small tweak to
> the Assert in GetCurrentCommandId() to ensure that
> currentCommandIdUsed, in a parallel worker, never gets set to true
> when it is false. This is in line with the comment in that function,
> because we know that "currentCommandId was already true at the start
> of the parallel operation". With this in place, I don't need to change
> any of the original calls to GetCurrentCommandId(), so this addresses
> that issue raised by Andres.
>
> I am not sure yet how to get past the issue of the parallel mode being
> set at the top of ExecutePlan(). With that in place, it doesn't allow
> a xid to be acquired for the leader, without removing/changing that
> parallel-mode check in GetNewTransactionId().
>

I think now there is no fundamental problem in allocating xid in the
leader and then sharing it with workers who can use it to perform the
insert. So we can probably tweak that check so that it is true for
only parallel workers.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: a potential size overflow issue
Next
From: Amit Kapila
Date:
Subject: Re: Parallel INSERT (INTO ... SELECT ...)