Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Greg Nancarrow
Subject Re: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id CAJcOf-dox7x=oP4dXLki99v_jE14FM=HBwZwEfts6dy-7EMnxA@mail.gmail.com
Whole thread Raw
In response to Re: Parallel INSERT (INTO ... SELECT ...)  (Andres Freund <andres@anarazel.de>)
Responses Re: Parallel INSERT (INTO ... SELECT ...)
List pgsql-hackers
Hi Andres,

On Thu, Sep 24, 2020 at 12:21 PM Andres Freund <andres@anarazel.de> wrote:
>
>I think it'd be good if you outlined what your approach is to make this
>safe.

Some prior work has already been done to establish the necessary
infrastructure to allow parallel INSERTs, in general, to be safe,
except for cases where new commandIds would be generated in the
parallel-worker code (such as inserts into a table having a foreign
key) - these cases need to be avoided.
See the following commits.

85f6b49 Allow relation extension lock to conflict among parallel group members
3ba59cc Allow page lock to conflict among parallel group members

The planner code updated by the patch avoids creating a Parallel
INSERT plan in the case of inserting into a table that has a foreign
key.


>> For cases where it can't be allowed (e.g. INSERT into a table with
>> foreign keys, or INSERT INTO ... SELECT ... ON CONFLICT ... DO UPDATE
>> ...") it at least allows parallelism of the SELECT part.
>
>I think it'd be good to do this part separately and first, independent
>of whether the insert part can be parallelized.

OK then, I'll try to extract that as a separate patch.


>> Obviously I've had to update the planner and executor and
>> parallel-worker code to make this happen, hopefully not breaking too
>> many things along the way.
>
>Hm, it looks like you've removed a fair bit of checks, it's not clear to
>me why that's safe in each instance.

It should be safe for Parallel INSERT - but you are right, these are
brute force removals (for the purpose of a POC patch) that should be
tightened up wherever possible to disallow unsafe paths into that
code. Problem is, currently there's not a lot of context information
available to easily allow that, so some work needs to be done.


>> - Currently only for "INSERT INTO ... SELECT ...". To support "INSERT
>> INTO ... VALUES ..." would need additional Table AM functions for
>> dividing up the INSERT work amongst the workers (currently only exists
>> for scans).
>
>Hm, not entirely following. What precisely are you thinking of here?

All I was saying is that for SELECTs, the work done by each parallel
worker is effectively divided up by parallel-worker-related functions
in tableam.c and indexam.c, and no such technology currently exists
for dividing up work for the "INSERT ... VALUES" case.


>I doubt it's really worth adding parallelism support for INSERT
>... VALUES, the cost of spawning workers will almost always higher than
>the benefit.

You're probably right in doubting any benefit, but I wasn't entirely sure.


>> @@ -116,7 +117,7 @@ toast_save_datum(Relation rel, Datum value,
>>       TupleDesc       toasttupDesc;
>>       Datum           t_values[3];
>>       bool            t_isnull[3];
>> -     CommandId       mycid = GetCurrentCommandId(true);
>> +     CommandId       mycid = GetCurrentCommandId(!IsParallelWorker());
>>       struct varlena *result;
>>       struct varatt_external toast_pointer;
>>       union
>
>Hm? Why do we need this in the various places you have made this change?

It's because for Parallel INSERT, we're assigning the same command-id
to each worker up-front during worker initialization (the commandId
has been retrieved by the leader and passed through to each worker)
and "currentCommandIdUsed" has been set true. See the
AssignCommandIdForWorker() function in the patch.
If you see the code of GetCurrentCommandId(), you'll see it Assert
that it's not being run by a parallel worker if the parameter is true.
I didn't want to remove yet another check, without being able to know
the context of the caller, because only for Parallel INSERT do I know
that "currentCommandIdUsed was already true at the start of the
parallel operation". See the comment in that function. Anyway, that's
why I'm passing "false" to relevant GetCurrentCommandId() calls if
they're being run by a parallel (INSERT) worker.


>> @@ -822,19 +822,14 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
>>                               isdead = false;
>>                               break;
>>                       case HEAPTUPLE_INSERT_IN_PROGRESS:
>> -
>>                               /*
>>                                * Since we hold exclusive lock on the relation, normally the
>>                                * only way to see this is if it was inserted earlier in our
>>                                * own transaction.  However, it can happen in system
>>                                * catalogs, since we tend to release write lock before >commit
>> -                              * there.  Give a warning if neither case applies; but in any
>> -                              * case we had better copy it.
>> +                              * there. In any case we had better copy it.
>>                                */
>> -                             if (!is_system_catalog &&
>> -                                     !TransactionIdIsCurrentTransactionId>(HeapTupleHeaderGetXmin(tuple->t_data)))
>> -                                     elog(WARNING, "concurrent insert in progress within >table \"%s\"",
>> -                                              RelationGetRelationName(OldHeap));
>> +
>>                               /* treat as live */
>>                               isdead = false;
>>                               break;
>> @@ -1434,16 +1429,11 @@ heapam_index_build_range_scan(Relation heapRelation,
>>                                        * the only way to see this is if it was inserted >earlier
>>                                        * in our own transaction.  However, it can happen in
>>                                        * system catalogs, since we tend to release write >lock
>> -                                      * before commit there.  Give a warning if neither >case
>> -                                      * applies.
>> +                                      * before commit there.
>>                                        */
>>                                       xwait = HeapTupleHeaderGetXmin(heapTuple->t_data);
>>                                       if (!TransactionIdIsCurrentTransactionId(xwait))
>>                                       {
>> -                                             if (!is_system_catalog)
>> -                                                     elog(WARNING, "concurrent insert in >progress within table
\"%s\"",
>> -                                                              RelationGetRelationName>(heapRelation));
>> -
>>                                               /*
>>                                                * If we are performing uniqueness checks, >>indexing
>>                                                * such a tuple could lead to a bogus >uniqueness
>
>Huh, I don't think this should be necessary?

Yes, I think you're right, I perhaps got carried away removing checks
on concurrent inserts. I will revert those changes.


>> diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
>> index a4944fa..9d3f100 100644
>> --- a/src/backend/access/transam/varsup.c
>> +++ b/src/backend/access/transam/varsup.c
>> @@ -53,13 +53,6 @@ GetNewTransactionId(bool isSubXact)
>>       TransactionId xid;
>>
>>       /*
>> -      * Workers synchronize transaction state at the beginning of each parallel
>> -      * operation, so we can't account for new XIDs after that point.
>> -      */
>> -     if (IsInParallelMode())
>> -             elog(ERROR, "cannot assign TransactionIds during a parallel operation");
>> -
>> -     /*
>>        * During bootstrap initialization, we return the special bootstrap
>>        * transaction id.
>>        */
>
>Same thing, this code cannot just be allowed to be reachable. What
>prevents you from assigning two different xids from different workers
>etc?

At least in the case of Parallel INSERT, the leader for the Parallel
INSERT gets a new xid (GetCurrentFullTransactionId) and it is passed
through and assigned to each of the workers during their
initialization (so they are assigned the same xid).


Regards,
Greg Nancarrow
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: scram-sha-256 broken with FIPS and OpenSSL 1.0.2
Next
From: btnakamichin
Date:
Subject: Feature improvement of tab completion for DEALLOCATE