Re: Parallel sec scan in plpgsql - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel sec scan in plpgsql
Date
Msg-id CAA4eK1La719C5oGH0nD75EqZSRzRs50zKZer4KTfO1O+D94oiQ@mail.gmail.com
Whole thread Raw
In response to Re: Parallel sec scan in plpgsql  (Alex Ignatov <a.ignatov@postgrespro.ru>)
Responses Re: Parallel sec scan in plpgsql  (Robert Haas <robertmhaas@gmail.com>)
Re: Parallel sec scan in plpgsql  (Alex Ignatov <a.ignatov@postgrespro.ru>)
List pgsql-hackers
On Fri, Sep 16, 2016 at 8:48 PM, Alex Ignatov <a.ignatov@postgrespro.ru> wrote:
>
> On 16.09.2016 16:50, Amit Kapila wrote:
>>
>>
>> Can you try by setting force_parallel_mode = off;?  I think it is
>> sending the whole function execution to worker due to
>> force_parallel_mode.
>>
>>
>
> No changes:
>

Okay, it just skipped from my mind that we don't support parallel
queries for SQL statement execution (or statements executed via
exec_stmt_execsql) from plpgsql.  For detailed explanation of why that
is not feasible you can refer one of my earlier e-mails [1] on similar
topic.  I think if we can somehow get the results via Perform
statement, then it could be possible to use parallelism via plpgsql.

However, you can use it via SQL functions, an example is below:

set min_parallel_relation_size =0;
set parallel_tuple_cost=0;
set parallel_setup_cost=0;

Load 'auto_explain';
set auto_explain.log_min_duration = 0;
set auto_explain.log_analyze = true;
set auto_explain.log_nested_statements = true;

create table test_plpgsql(c1 int, c2 char(1000));
insert into test_plpgsql values(generate_series(1,100000),'aaa');

create or replace function parallel_test_set_sql() returns
setof bigint as $$
select count(*) from test_plpgsql;
$$language sql PARALLEL SAFE STRICT STABLE;

Then execute function as: select * from parallel_test_set_sql();  You
can see below plan if auto_explain module is loaded.
       Finalize Aggregate  (cost=14806.85..14806.86 rows=1 width=8) (actual tim
e=1094.966..1094.967 rows=1 loops=1)         ->  Gather  (cost=14806.83..14806.84 rows=2 width=8) (actual time=472.
216..1094.943 rows=3 loops=1)               Workers Planned: 2               Workers Launched: 2               ->
PartialAggregate  (cost=14806.83..14806.84 rows=1 width=8)
 
(actual time=177.867..177.868 rows=1 loops=3)                     ->  Parallel Seq Scan on test_plpgsql
(cost=0.00..14702.6
7 rows=41667 width=0) (actual time=0.384..142.565 rows=33333 loops=3)
CONTEXT:  SQL function "parallel_test_set_sql" statement 1
LOG:  duration: 2965.040 ms  plan:       Query Text: select * from parallel_test_set_sql();       Function Scan on
parallel_test_set_sql (cost=0.25..10.25 rows=1000 widt
 
h=8) (actual time=2538.620..2776.955 rows=1 loops=1)


In general, I think we should support the cases as required (or
written) by you from plpgsql or sql functions.  We need more work to
support such cases. There are probably two ways of supporting such
cases, we can build some intelligence in plpgsql execution such that
it can recognise such queries and allow to use parallelism or we need
to think of enabling parallelism for cases where we don't run the plan
to completion.  Most of the use cases from plpgsql or sql function
fall into later category as they don't generally run the plan to
completion.


[1] - https://www.postgresql.org/message-id/CAA4eK1K8kaO_jRk42-o2rmhSRbKV-3mR%2BiNVcONLdbcSXW5TfQ%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: WIP: About CMake v2
Next
From: Amit Kapila
Date:
Subject: Re: Speed up Clog Access by increasing CLOG buffers