Re: [HACKERS] COPY (query) TO ... doesn't allow parallelism - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] COPY (query) TO ... doesn't allow parallelism
Date
Msg-id CAA4eK1+LjL6vayGHWvBnYGuGVc7NwoFvVsPjqYBb2ULnFbRJoA@mail.gmail.com
In response to Re: [HACKERS] COPY (query) TO ... doesn't allow parallelism  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Sat, Jun 3, 2017 at 9:34 PM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2017-06-03 17:40:08 +0530, Amit Kapila wrote:
>> The standard_planner check is sufficient to avoid generating parallel
>> plans for such statements, but it does not help when such commands
>> (which shouldn't be executed by parallel workers) are present in
>> functions.  Consider a hypothetical case as below:
>>
>> 1.  Create a parallel safe function containing Copy commands.
>> create or replace function parallel_copy(a integer) returns integer
>> as $$
>> begin
>> Copy (select * from t1 where c1 < 2) to 'e:\\f1';
>>         return a;
>> end;
>> $$ language plpgsql Parallel Safe;
>>
>> 2. Now use this in some command which can be executed in parallel.
>> explain analyze select * from t1 where c1 < parallel_copy(10);
>>
>> This can allow the Copy command to be executed by parallel workers
>> if we don't have sufficient safeguards.
>
> Yes.  But I'm unclear what that has to do with the change
> discussed in this thread?
>

It is not related to the change you have proposed.  It just occurred
to me while reading the code in the area you proposed to change, so I
mentioned it here.  It might have been better to report it in a
separate thread.

>  The pg_plan_query in copy.c setting
> CURSOR_OPT_PARALLEL_OK doesn't meaningfully change the risk of this
> happening one way or the other.
>
>> We already try to prohibit it in plpgsql: for example, in
>> _SPI_execute_plan() we call PreventCommandIfParallelMode.  In spite
>> of that, we also have safeguards in lower-level calls, so that if the
>> code flow reaches such commands in parallel mode, we error out.  We
>> have a similar check in the Copy From code flow
>> (PreventCommandIfParallelMode("COPY FROM");), but I think we should
>> have it in the Copy To flow as well.
>
> Why?  What is it effectively preventing?  Multiple workers copying to
> the same file?
>

Yeah.  Also, the query (in the Copy To command) can be a write
statement as well.  I thought raising an error from the COPY TO flow
would be appropriate for it.
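
To make the idea concrete, here is a rough standalone sketch of such a
guard at the start of the COPY TO flow.  It is only a model, not
backend code: in_parallel_mode, prevent_command_if_parallel_mode() and
begin_copy_to() are hypothetical stand-ins, and the real
PreventCommandIfParallelMode raises an error through ereport() rather
than returning a flag.  The message wording is illustrative as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the backend's "are we in parallel mode" state. */
bool in_parallel_mode = false;

/*
 * Standalone model of a PreventCommandIfParallelMode-style guard.
 * The real backend raises an error; this sketch instead returns false
 * and writes a message into errbuf.
 */
bool
prevent_command_if_parallel_mode(const char *cmdname,
                                 char *errbuf, size_t errlen)
{
    assert(cmdname != NULL && errbuf != NULL && errlen > 0);

    if (in_parallel_mode)
    {
        snprintf(errbuf, errlen,
                 "cannot execute %s during a parallel operation",
                 cmdname);
        return false;
    }

    errbuf[0] = '\0';
    return true;
}

/* Hypothetical entry point of the COPY TO flow (assumed name). */
bool
begin_copy_to(char *errbuf, size_t errlen)
{
    /* Reject the command before any planning or file access happens. */
    if (!prevent_command_if_parallel_mode("COPY TO", errbuf, errlen))
        return false;

    /* The query would be planned and executed here. */
    return true;
}
```

With in_parallel_mode set to false, begin_copy_to() succeeds; once it
is set to true, the call is rejected and errbuf carries the reason,
which mirrors what the existing COPY FROM check does.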

>  Any such function would have the same risk for separate
> sessions.
>

You are right, but I am not sure it makes sense to allow any form of
write statement in parallel workers without doing more analysis.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


