Thread: Parallel workers via functions?

Parallel workers via functions?

From
Jim Vanns
Date:
Apologies for the rushed question ...

If I have a function that is marked 'stable parallel safe' and returns
a table, can a calling function or procedure (marked volatile parallel
unsafe) still take advantage of the parallel workers from the first
function - as the data source. I.e.

func_a(); // selects, returns table, parallel safe
func_b() {
   insert into foo
   select * from func_a(); // Will func_a still execute parallel
workers to fetch the data?
}

Or even if func_b() uses 'create temporary table as select * from
func_a()' and then insert?

I ask because when I simply call func_a() from a psql shell, I see the
parallel workers run and everything is nice and swift. But when called
from a data-modifying function like func_b(), no workers are spawned
:( Even from the read-part of the code.

Are there differences in functions vs. stored procedures that might
affect the behaviour of the planner to disregard workers?

Cheers

Jim

-- 
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London



Re: Parallel workers via functions?

From
Laurenz Albe
Date:
On Mon, 2025-01-27 at 18:08 +0000, Jim Vanns wrote:
> If I have a function that is marked 'stable parallel safe' and returns
> a table, can a calling function or procedure (marked volatile parallel
> unsafe) still take advantage of the parallel workers from the first
> function - as the data source. I.e.
>
> func_a(); // selects, returns table, parallel safe
> func_b() {
>    insert into foo
>    select * from func_a(); // Will func_a still execute parallel
> workers to fetch the data?
> }
>
> Or even if func_b() uses 'create temporary table as select * from
> func_a()' and then insert?
>
> I ask because when I simply call func_a() from a psql shell, I see the
> parallel workers run and everything is nice and swift. But when called
> from a data-modifying function like func_b(), no workers are spawned
> :( Even from the read-part of the code.
>
> Are there differences in functions vs. stored procedures that might
> affect the behaviour of the planner to disregard workers?

See https://www.postgresql.org/docs/current/when-can-parallel-query-be-used.html

The problem here is the INSERT.  Data modifying statements won't use
parallel query.

There are exceptions: CREATE TABLE ... AS SELECT ... should be able
to use parallel query.

Yours,
Laurenz Albe



Re: Parallel workers via functions?

From
Jim Vanns
Date:
Thanks for the reply Laurenz. Inline replies follow...

On Tue, 28 Jan 2025 at 04:47, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Mon, 2025-01-27 at 18:08 +0000, Jim Vanns wrote:
> > If I have a function that is marked 'stable parallel safe' and returns
> > a table, can a calling function or procedure (marked volatile parallel
> > unsafe) still take advantage of the parallel workers from the first
> > function - as the data source. I.e.
> >
> > func_a(); // selects, returns table, parallel safe
> > func_b() {
> >    insert into foo
> >    select * from func_a(); // Will func_a still execute parallel
> > workers to fetch the data?
> > }
> >
> > Or even if func_b() uses 'create temporary table as select * from
> > func_a()' and then insert?
> >
> > I ask because when I simply call func_a() from a psql shell, I see the
> > parallel workers run and everything is nice and swift. But when called
> > from a data-modifying function like func_b(), no workers are spawned
> > :( Even from the read-part of the code.
> >
> > Are there differences in functions vs. stored procedures that might
> > affect the behaviour of the planner to disregard workers?
>
> See https://www.postgresql.org/docs/current/when-can-parallel-query-be-used.html

Thanks. Yup, read that. Seems easy enough to understand... however...

> The problem here is the INSERT.  Data modifying statements won't use
> parallel query.

OK, that's clear enough.

> There are exceptions: CREATE TABLE ... AS SELECT ... should be able
> to use parallel query.

I've been experimenting with this. The problem deepens... It seems
that actually, it's the function itself - func_a() in my example
above. Even simply calling that from psql doesn't spawn parallel
workers to run as part of the query defined in the funcion body. But
if I copy the body of the function and paste it into a psql shell, it
does parallelise. This function is marked STABLE PARALLEL SAFE though.
Are there limitations or restrictions I'm missing!? I'll try to find
the time to provide a MRP but I'm hoping somebody will just magically
know what the problem is or at least could be!

So... I am still confused! This is PG 15.5 BTW.

Jim

> Yours,
> Laurenz Albe




--
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London