Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel INSERT (INTO ... SELECT ...)
Msg-id CAA4eK1LgUnk9X5yvYnwwoueijB-uuGPECVEAHVPmxKLoHW+xqQ@mail.gmail.com
In response to Re: Parallel INSERT (INTO ... SELECT ...)  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Sat, Sep 26, 2020 at 11:00 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Fri, Sep 25, 2020 at 9:23 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Fri, Sep 25, 2020 at 10:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > Again, there's a fundamental difference in the Parallel Insert case.
> > Right at the top of ExecutePlan it calls EnterParallelMode().
> > For ParallelCopy(), there is no such problem. EnterParallelMode() is
> > only called just before ParallelCopyMain() is called. So it can easily
> > acquire the xid before this, because then parallel mode is not set.
> >
> > As it turns out, I think I have solved the commandId issue (and almost
> > the xid issue) by realising that both the xid and cid are ALREADY
> > being included as part of the serialized transaction state in the
> > Parallel DSM. So actually I don't believe that there is any need for
> > separately passing them in the DSM, and having to use those
> > AssignXXXXForWorker() functions in the worker code - not even in the
> > Parallel Copy case (? - need to check).
> >
>
> Thanks Greg for the detailed points.
>
> I further checked on the full txn id and command id. Yes, these are
> getting passed to workers via InitializeParallelDSM() ->
> SerializeTransactionState(). I tried to summarize what we need to do
> in the case of parallel inserts in general, i.e. parallel COPY, parallel
> inserts in INSERT INTO, and parallel inserts in CTAS.
>
> In the leader:
>     GetCurrentFullTransactionId()
>     GetCurrentCommandId(true)
>     EnterParallelMode();
>     InitializeParallelDSM() --> calls SerializeTransactionState()
> (both full txn id and command id are serialized into parallel DSM)
>

This sequence won't hold for the Parallel Insert patch, as Greg also
explained, because there we enter parallel mode (at the top of
ExecutePlan) well before the xid is assigned.
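
To make the ordering difference concrete, here is a toy standalone model
in C. It is only a sketch and not PostgreSQL code: every toy_* name and
both structs are hypothetical stand-ins, and it assumes (as the quoted
discussion implies) that a new xid cannot be assigned once parallel mode
is set. It contrasts the parallel COPY ordering with the Parallel Insert
ordering.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the leader's transaction state. */
typedef struct ToyXactState
{
    uint64_t xid;               /* 0 means "no xid assigned yet" */
    uint32_t cid;               /* current command id */
    bool     in_parallel_mode;
} ToyXactState;

/* Hypothetical stand-in for the state serialized into the parallel DSM. */
typedef struct ToyDsm
{
    uint64_t xid;
    uint32_t cid;
} ToyDsm;

static uint64_t toy_next_xid = 100;

/*
 * Assumption: xid assignment is refused while in parallel mode.
 * Returns true if an xid is available after the call.
 */
static bool
toy_assign_xid(ToyXactState *s)
{
    if (s->xid != 0)
        return true;            /* already assigned */
    if (s->in_parallel_mode)
        return false;           /* too late to assign one */
    s->xid = toy_next_xid++;
    return true;
}

static void
toy_enter_parallel_mode(ToyXactState *s)
{
    s->in_parallel_mode = true;
}

/* Models the leader copying xid and cid into the DSM for the workers. */
static ToyDsm
toy_serialize_xact_state(const ToyXactState *s)
{
    ToyDsm d = { s->xid, s->cid };
    return d;
}

/* Parallel COPY ordering: assign the xid, then enter parallel mode. */
static ToyDsm
toy_copy_style_leader(void)
{
    ToyXactState s = { 0, 1, false };

    (void) toy_assign_xid(&s);
    toy_enter_parallel_mode(&s);
    return toy_serialize_xact_state(&s);
}

/* Parallel Insert ordering: parallel mode is entered first. */
static ToyDsm
toy_insert_style_leader(bool *assigned_ok)
{
    ToyXactState s = { 0, 1, false };

    toy_enter_parallel_mode(&s);
    *assigned_ok = toy_assign_xid(&s);
    return toy_serialize_xact_state(&s);
}
```

In this model the COPY-style leader serializes a valid xid, while the
insert-style leader serializes none, which is the case the summarized
leader sequence does not cover.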


-- 
With Regards,
Amit Kapila.


