Parallel workers via functions?

Started by Jim Vannsabout 1 year ago3 messagesgeneral
Jump to latest
#1Jim Vanns
jvanns@ilm.com

Apologies for the rushed question ...

If I have a function that is marked 'stable parallel safe' and returns
a table, can a calling function or procedure (marked volatile parallel
unsafe) still take advantage of the parallel workers from the first
function - as the data source. I.e.

func_a(); // selects, returns table, parallel safe
func_b() {
insert into foo
select * from func_a(); // Will func_a still execute parallel
workers to fetch the data?
}

Or even if func_b() uses 'create temporary table as select * from
func_a()' and then insert?

I ask because when I simply call func_a() from a psql shell, I see the
parallel workers run and everything is nice and swift. But when called
from a data-modifying function like func_b(), no workers are spawned
:( Even from the read-part of the code.

Are there differences in functions vs. stored procedures that might
affect the behaviour of the planner to disregard workers?

Cheers

Jim

--
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Jim Vanns (#1)
Re: Parallel workers via functions?

On Mon, 2025-01-27 at 18:08 +0000, Jim Vanns wrote:

If I have a function that is marked 'stable parallel safe' and returns
a table, can a calling function or procedure (marked volatile parallel
unsafe) still take advantage of the parallel workers from the first
function - as the data source. I.e.

func_a(); // selects, returns table, parallel safe
func_b() {
insert into foo
select * from func_a(); // Will func_a still execute parallel
workers to fetch the data?
}

Or even if func_b() uses 'create temporary table as select * from
func_a()' and then insert?

I ask because when I simply call func_a() from a psql shell, I see the
parallel workers run and everything is nice and swift. But when called
from a data-modifying function like func_b(), no workers are spawned
:( Even from the read-part of the code.

Are there differences in functions vs. stored procedures that might
affect the behaviour of the planner to disregard workers?

See https://www.postgresql.org/docs/current/when-can-parallel-query-be-used.html

The problem here is the INSERT. Data modifying statements won't use
parallel query.

There are exceptions: CREATE TABLE ... AS SELECT ... should be able
to use parallel query.

Yours,
Laurenz Albe

#3Jim Vanns
jvanns@ilm.com
In reply to: Laurenz Albe (#2)
Re: Parallel workers via functions?

Thanks for the reply Laurenz. Inline replies follow...

On Tue, 28 Jan 2025 at 04:47, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Mon, 2025-01-27 at 18:08 +0000, Jim Vanns wrote:

If I have a function that is marked 'stable parallel safe' and returns
a table, can a calling function or procedure (marked volatile parallel
unsafe) still take advantage of the parallel workers from the first
function - as the data source. I.e.

func_a(); // selects, returns table, parallel safe
func_b() {
insert into foo
select * from func_a(); // Will func_a still execute parallel
workers to fetch the data?
}

Or even if func_b() uses 'create temporary table as select * from
func_a()' and then insert?

I ask because when I simply call func_a() from a psql shell, I see the
parallel workers run and everything is nice and swift. But when called
from a data-modifying function like func_b(), no workers are spawned
:( Even from the read-part of the code.

Are there differences in functions vs. stored procedures that might
affect the behaviour of the planner to disregard workers?

See https://www.postgresql.org/docs/current/when-can-parallel-query-be-used.html

Thanks. Yup, read that. Seems easy enough to understand... however...

The problem here is the INSERT. Data modifying statements won't use
parallel query.

OK, that's clear enough.

There are exceptions: CREATE TABLE ... AS SELECT ... should be able
to use parallel query.

I've been experimenting with this. The problem deepens... It seems
that actually, it's the function itself - func_a() in my example
above. Even simply calling that from psql doesn't spawn parallel
workers to run as part of the query defined in the funcion body. But
if I copy the body of the function and paste it into a psql shell, it
does parallelise. This function is marked STABLE PARALLEL SAFE though.
Are there limitations or restrictions I'm missing!? I'll try to find
the time to provide a MRP but I'm hoping somebody will just magically
know what the problem is or at least could be!

So... I am still confused! This is PG 15.5 BTW.

Jim

Yours,
Laurenz Albe

--
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London