Re: [HACKERS] COPY (query) TO ... doesn't allow parallelism - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] COPY (query) TO ... doesn't allow parallelism
Date
Msg-id CAA4eK1+8VA32nNdokuAYv2=8ei_NUhpZ0WyV_N_sAyjAkAexAg@mail.gmail.com
In response to Re: [HACKERS] COPY (query) TO ... doesn't allow parallelism  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] COPY (query) TO ... doesn't allow parallelism  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Thu, Jun 1, 2017 at 10:16 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2017-06-01 21:37:56 +0530, Amit Kapila wrote:
>> On Thu, Jun 1, 2017 at 9:34 PM, Andres Freund <andres@anarazel.de> wrote:
>> > On 2017-06-01 21:23:04 +0530, Amit Kapila wrote:
>> >> On a related note, I think it might be better to have an
>> >> IsInParallelMode() check in this case as we have at other places.
>> >> This is to ensure that if this command is invoked via plpgsql function
>> >> and that function runs in parallel mode, it will act as a
>> >> safeguard.
>> >
>> > Hm? Which other places do it that way?  Isn't standard_planner()
>> > centralizing such a check?
>> >
>>
>> heap_insert->heap_prepare_insert, heap_update, heap_delete, etc.
>
> Those aren't comparable, they're not invoking the planner - and all the
> places that set PARALLEL_OK don't check for it.  The relevant check for
> planning is in standard_planner().
>

The standard_planner check is sufficient to keep us from generating
parallel plans for such statements, but it won't stop such commands
(which shouldn't be executed by parallel workers) from running in a
worker when they are present in functions.  Consider a hypothetical
case as below:

1.  Create a parallel safe function containing Copy commands.
create or replace function parallel_copy(a integer) returns integer
as $$
begin
    Copy (select * from t1 where c1 < 2) to 'e:\\f1';
    return a;
end;
$$ language plpgsql Parallel Safe;

2. Now use this in some command which can be executed in parallel.
explain analyze select * from t1 where c1 < parallel_copy(10);
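
The two steps above need a table t1 and a session in which the planner
is willing to pick a parallel plan. Neither is given in this mail, so
the following is only a sketch of one possible setup: the table
definition, the row count, and the chosen settings are all assumptions,
and the planner may still decide on a non-parallel plan.

```sql
-- Assumed table: only the column c1 is named in the example above.
create table t1 (c1 integer);
insert into t1 select g from generate_series(1, 100000) g;
analyze t1;

-- Assumed session settings that make a parallel plan more likely.
set max_parallel_workers_per_gather = 2;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
-- This setting is named min_parallel_relation_size in 9.6.
set min_parallel_table_scan_size = 0;
```

With something like this in place, the explain analyze output can be
checked for a Gather node to see whether the qual calling
parallel_copy() was pushed down to workers.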

Such a query can allow the Copy command to be executed by parallel
workers if we don't have sufficient safeguards.  We already try to
prohibit this for plpgsql: in _SPI_execute_plan(), we call
PreventCommandIfParallelMode.  In spite of that, we also have
safeguards in lower-level calls, so that if the code flow reaches such
commands in parallel mode, we error out.  We have a similar check in
the Copy From code flow (PreventCommandIfParallelMode("COPY FROM");),
and I think we should have it in the Copy To flow as well.

I agree that the user shouldn't mark such functions as parallel safe
in the first place, but having such safeguards protects us from
problems when users have incorrectly marked some functions as parallel
safe.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


