Thread: Explain says 8 workers planned, only 1 executed

Explain says 8 workers planned, only 1 executed

From

Alastair McKinley

Date:

21 March 2020, 17:25:31

Hi all,

I have a long running query that I have tweaked along with config (e.g. min_parallel_table_scan_size) to execute nicely and very fast in parallel which works as expected executed directly from psql client. The query is then embedded in a psql function like "return query select * from function_that_executes_the_query()".

I am checking the explain output (using execute explain $query) just before executing inside my function and it the plan is identical to what I would expect, planning 8 workers. However, this query actually only uses 1 worker and takes many times longer than when ran directly on the psql command line with the same server configuration parameters.

Why would the explain output be different from the executed plan? Is this a limitation of plpgsql functions? Is there any way to debug this further?

If it is meaningful during parallel execution I notice lots of "postgres: parallel worker" proceses in top and when executing from my function just a single "postgres: $user $db $host($pid) SELECT" processes.

Best regards,

Alastair

Re: Explain says 8 workers planned, only 1 executed

From

Adrian Klaver

Date:

21 March 2020, 17:38:43

On 3/21/20 10:25 AM, Alastair McKinley wrote:
> Hi all,
> 
> I have a long running query that I have tweaked along with config (e.g. 
> min_parallel_table_scan_size) to execute nicely and very fast in 
> parallel which works as expected executed directly from psql client.  
> The query is then embedded in a psql function like "return query select 
> * from function_that_executes_the_query()".

Postgres version?

What is happening in function_that_executes_the_query()?

You might want to take a look at below to see any of the conditions apply:

https://www.postgresql.org/docs/12/when-can-parallel-query-be-used.html

> 
> I am checking the explain output (using execute explain $query) just 
> before executing inside my function and it the plan is identical to what 
> I would expect, planning 8 workers.  However, this query actually only 
> uses 1 worker and takes many times longer than when ran directly on the 
> psql command line with the same server configuration parameters.
> 
> Why would the explain output be different from the executed plan? Is 
> this a limitation of plpgsql functions? Is there any way to debug this 
> further?
> 
> If it is meaningful during parallel execution I notice lots of 
> "postgres: parallel worker" proceses in top and when executing from my 
> function just a single "postgres: $user $db $host($pid) SELECT" processes.
> 
> Best regards,
> 
> Alastair
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: Explain says 8 workers planned, only 1 executed

From

Alastair McKinley

Date:

21 March 2020, 17:58:49

Hi Adrian,

Thanks for getting back to me.

Postgres version is:

PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit

I simplified it to the following structure:

create function f() returns setof my_type as

declare

q text;

output text;

begin

select generate_query1() into q; -- the query q executes in parallel with 8 workers if executed standalone

for output in execute('explain ' || q) loop

raise notice '%',output; -- this plan says 8 workers will be launched

end loop;

return query execute q; -- this launches one worker

select generate_query2() into q;

for output in execute('explain ' || q) loop

raise notice '%',output; -- this plan says 8 workers will be launched

end loop;

return query execute q; -- this also launches one worker

end;

language plpgsql;

Should this work in principle or am I missing something subtle about parallel dynamic queries in plpgsql functions? Does the outer function need to be parallel safe?

Might a stored proc work better?

Best regards,

Alastair

From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: 21 March 2020 17:38
To: Alastair McKinley <a.mckinley@analyticsengines.com>; pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: Explain says 8 workers planned, only 1 executed

On 3/21/20 10:25 AM, Alastair McKinley wrote:
> Hi all,
>
> I have a long running query that I have tweaked along with config (e.g.
> min_parallel_table_scan_size) to execute nicely and very fast in
> parallel which works as expected executed directly from psql client.
> The query is then embedded in a psql function like "return query select
> * from function_that_executes_the_query()".

Postgres version?

What is happening in function_that_executes_the_query()?

You might want to take a look at below to see any of the conditions apply:

https://www.postgresql.org/docs/12/when-can-parallel-query-be-used.html

>
> I am checking the explain output (using execute explain $query) just
> before executing inside my function and it the plan is identical to what
> I would expect, planning 8 workers. However, this query actually only
> uses 1 worker and takes many times longer than when ran directly on the
> psql command line with the same server configuration parameters.
>
> Why would the explain output be different from the executed plan? Is
> this a limitation of plpgsql functions? Is there any way to debug this
> further?
>
> If it is meaningful during parallel execution I notice lots of
> "postgres: parallel worker" proceses in top and when executing from my
> function just a single "postgres: $user $db $host($pid) SELECT" processes.
>
> Best regards,
>
> Alastair
>
>

--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Explain says 8 workers planned, only 1 executed

From

Jeremy Smith

Date:

21 March 2020, 20:50:15

Unfortunately, return query will never use parallel workers. See: https://stackoverflow.com/q/58079898/895640 and https://www.postgresql.org/message-id/16040-eaacad11fecfb198@postgresql.org

On Sat, Mar 21, 2020 at 1:59 PM Alastair McKinley <a.mckinley@analyticsengines.com> wrote:

Hi Adrian,

Thanks for getting back to me.

Postgres version is:

PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit

I simplified it to the following structure:

create function f() returns setof my_type as
$$
declare
q text;
output text;
begin
    select generate_query1() into q; -- the query q executes in parallel with 8 workers if executed standalone
    for output in execute('explain ' || q) loop
        raise notice '%',output; -- this plan says 8 workers will be launched
    end loop;
    return query execute q; -- this launches one worker

   select generate_query2() into q;
    for output in execute('explain ' || q) loop
        raise notice '%',output; -- this plan says 8 workers will be launched
    end loop;
    return query execute q; -- this also launches one worker
end;
language plpgsql;

Should this work in principle or am I missing something subtle about parallel dynamic queries in plpgsql functions? Does the outer function need to be parallel safe?
Might a stored proc work better?

Best regards,

Alastair

From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: 21 March 2020 17:38
To: Alastair McKinley <a.mckinley@analyticsengines.com>; pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: Explain says 8 workers planned, only 1 executed

On 3/21/20 10:25 AM, Alastair McKinley wrote:
> Hi all,
>
> I have a long running query that I have tweaked along with config (e.g.
> min_parallel_table_scan_size) to execute nicely and very fast in
> parallel which works as expected executed directly from psql client.
> The query is then embedded in a psql function like "return query select
> * from function_that_executes_the_query()".

Postgres version?

What is happening in function_that_executes_the_query()?

You might want to take a look at below to see any of the conditions apply:

https://www.postgresql.org/docs/12/when-can-parallel-query-be-used.html

>
> I am checking the explain output (using execute explain $query) just
> before executing inside my function and it the plan is identical to what
> I would expect, planning 8 workers. However, this query actually only
> uses 1 worker and takes many times longer than when ran directly on the
> psql command line with the same server configuration parameters.
>
> Why would the explain output be different from the executed plan? Is
> this a limitation of plpgsql functions? Is there any way to debug this
> further?
>
> If it is meaningful during parallel execution I notice lots of
> "postgres: parallel worker" proceses in top and when executing from my
> function just a single "postgres: $user $db $host($pid) SELECT" processes.
>
> Best regards,
>
> Alastair
>
>

--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Explain says 8 workers planned, only 1 executed

From

Alastair McKinley

Date:

21 March 2020, 21:16:24

Hi Jeremy,

Thanks for solving the mystery. I think this might be a missing point in section 15.2 in the docs.

I wonder will this ever be improved or should I just write to temporary tables instead of return query?

Best regards,

Alastair

From: Jeremy Smith <jeremy@musicsmith.net>
Sent: 21 March 2020 20:50
To: Alastair McKinley <a.mckinley@analyticsengines.com>
Cc: Adrian Klaver <adrian.klaver@aklaver.com>; pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: Explain says 8 workers planned, only 1 executed

Unfortunately, return query will never use parallel workers. See: https://stackoverflow.com/q/58079898/895640 and https://www.postgresql.org/message-id/16040-eaacad11fecfb198@postgresql.org

On Sat, Mar 21, 2020 at 1:59 PM Alastair McKinley <a.mckinley@analyticsengines.com> wrote:

Hi Adrian,

Thanks for getting back to me.

Postgres version is:

PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit

I simplified it to the following structure:

create function f() returns setof my_type as
$$
declare
q text;
output text;
begin
    select generate_query1() into q; -- the query q executes in parallel with 8 workers if executed standalone
    for output in execute('explain ' || q) loop
        raise notice '%',output; -- this plan says 8 workers will be launched
    end loop;
    return query execute q; -- this launches one worker

   select generate_query2() into q;
    for output in execute('explain ' || q) loop
        raise notice '%',output; -- this plan says 8 workers will be launched
    end loop;
    return query execute q; -- this also launches one worker
end;
language plpgsql;

Should this work in principle or am I missing something subtle about parallel dynamic queries in plpgsql functions? Does the outer function need to be parallel safe?
Might a stored proc work better?

Best regards,

Alastair

From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: 21 March 2020 17:38
To: Alastair McKinley <a.mckinley@analyticsengines.com>; pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: Explain says 8 workers planned, only 1 executed

On 3/21/20 10:25 AM, Alastair McKinley wrote:
> Hi all,
>
> I have a long running query that I have tweaked along with config (e.g.
> min_parallel_table_scan_size) to execute nicely and very fast in
> parallel which works as expected executed directly from psql client.
> The query is then embedded in a psql function like "return query select
> * from function_that_executes_the_query()".

Postgres version?

What is happening in function_that_executes_the_query()?

You might want to take a look at below to see any of the conditions apply:

https://www.postgresql.org/docs/12/when-can-parallel-query-be-used.html

>
> I am checking the explain output (using execute explain $query) just
> before executing inside my function and it the plan is identical to what
> I would expect, planning 8 workers. However, this query actually only
> uses 1 worker and takes many times longer than when ran directly on the
> psql command line with the same server configuration parameters.
>
> Why would the explain output be different from the executed plan? Is
> this a limitation of plpgsql functions? Is there any way to debug this
> further?
>
> If it is meaningful during parallel execution I notice lots of
> "postgres: parallel worker" proceses in top and when executing from my
> function just a single "postgres: $user $db $host($pid) SELECT" processes.
>
> Best regards,
>
> Alastair
>
>

--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Explain says 8 workers planned, only 1 executed

From

Tom Lane

Date:

22 March 2020, 03:25:53

Alastair McKinley <a.mckinley@analyticsengines.com> writes:
> Thanks for solving the mystery.  I think this might be a missing point in section 15.2 in the docs.
> I wonder will this ever be improved or should I just write to temporary tables instead of return query?

I just posted a patch to improve that [1], but it's not something we'd be
likely to back-patch into existing releases.

            regards, tom lane

[1] https://www.postgresql.org/message-id/1741.1584847383%40sss.pgh.pa.us