[PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7

Started by DIPESH DHAMELIYA, 8 months ago, 9 messages
#1 DIPESH DHAMELIYA
dipeshdhameliya125@gmail.com
1 attachment(s)

Hello everyone,

With commit 556f7b7bc18d34ddec45392965c3b3038206bb62, any plpgsql
function that returns a scalar value can no longer use parallelism to
evaluate its return statement. The return expression is not considered
for parallel execution because we pass maxtuples = 2 to exec_run_select
from exec_eval_expr when evaluating it.

Call stack to ExecutePlan -

#0 ExecutePlan (queryDesc=0x589c390, operation=CMD_SELECT,
sendTuples=true, numberTuples=2, direction=ForwardScanDirection,
dest=0xe15ca0 <spi_printtupDR>) at execMain.c:1654

#1 0x000000000075edb6 in standard_ExecutorRun (queryDesc=0x589c390,
direction=ForwardScanDirection, count=2, execute_once=true) at
execMain.c:366

#2 0x00007f5749c9b8d8 in explain_ExecutorRun (queryDesc=0x589c390,
direction=ForwardScanDirection, count=2, execute_once=true) at
auto_explain.c:334

#3 0x000000000075ec25 in ExecutorRun (queryDesc=0x589c390,
direction=ForwardScanDirection, count=2, execute_once=true) at
execMain.c:310

#4 0x00000000007c4a48 in _SPI_pquery (queryDesc=0x589c390,
fire_triggers=true, tcount=2) at spi.c:2980

#5 0x00000000007c44a9 in _SPI_execute_plan (plan=0x5878780,
options=0x7ffc6ad467e0, snapshot=0x0, crosscheck_snapshot=0x0,
fire_triggers=true) at spi.c:2747

#6 0x00000000007c135f in SPI_execute_plan_with_paramlist (plan=0x5878780,
params=0x0, read_only=false, tcount=2) at spi.c:765

#7 0x00007f5749eb4a8b in exec_run_select (estate=0x7ffc6ad46ba0,
expr=0x5892b80, maxtuples=2, portalP=0x0) at pl_exec.c:5840 <-- maxtuples =
2

#8 0x00007f5749eb46fe in exec_eval_expr (estate=0x7ffc6ad46ba0,
expr=0x5892b80, isNull=0x7ffc6ad46bc0, rettype=0x7ffc6ad46bc4,
rettypmod=0x7ffc6ad468e8) at pl_exec.c:5734

Consider the following simple repro –

postgres=# create table test_tab(a int);

CREATE TABLE

postgres=# insert into test_tab (a) SELECT generate_series(1, 1000000);

INSERT 0 1000000

postgres=# analyse test_tab;

ANALYZE

postgres=# create function test_plpgsql() returns int

language plpgsql

as

$$

begin

return (select count(*) from test_tab where a between 5.0 and 999999.0);

end;

$$;

postgres=# LOAD 'auto_explain';

LOAD

postgres=# SET auto_explain.log_min_duration = 0;

SET

postgres=# SET auto_explain.log_analyze = true;

SET

postgres=# SET auto_explain.log_nested_statements = true;

SET

postgres=# select test_plpgsql();

test_plpgsql

--------------

999995

(1 row)

Plan logged in logfile -

Query Text: (select count(*) from test_tab where a between 5.0 and
999999.0)

Result (cost=13763.77..13763.78 rows=1 width=8) (actual
time=912.108..912.110 rows=1 loops=1)

InitPlan 1

-> Finalize Aggregate (cost=13763.76..13763.77 rows=1 width=8)
(actual time=912.103..912.104 rows=1 loops=1)

-> Gather (cost=13763.54..13763.75 rows=2 width=8) (actual
time=912.096..912.098 rows=1 loops=1)

Workers Planned: 2

*Workers Launched: 0*

-> Partial Aggregate (cost=12763.54..12763.55 rows=1
width=8) (actual time=912.095..912.096 rows=1 loops=1)

-> Parallel Seq Scan on test_tab
(cost=0.00..12758.33 rows=2083 width=0) (actual time=0.022..812.253
rows=999995 loops=1)

Filter: (((a)::numeric >= 5.0) AND
((a)::numeric <= 999999.0))

Rows Removed by Filter: 5

A patch to fix this issue is attached. The proposed fix should not
cause any regression, because the number of returned rows is checked
later inside exec_eval_expr(…) anyway.
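
For example, a return expression that produces more than one row still
fails after the change, because exec_eval_expr(…) rejects it. A minimal
sketch against the same test_tab (the function name is made up, and the
exact error text is not shown here):

postgres=# create function test_multi_row() returns int
language plpgsql
as
$$
begin
-- executed as "SELECT a FROM test_tab", which returns many rows, so
-- exec_eval_expr() still raises its "more than one row" error here
return a from test_tab;
end;
$$;
postgres=# select test_multi_row();  -- still fails, with or without the fix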

Plan logged after fix –

Query Text: (select count(*) from test_tab where a between 5.0 and 999999.0)

Result (cost=13763.77..13763.78 rows=1 width=8) (actual
time=324.397..328.007 rows=1.00 loops=1)

InitPlan 1

-> Finalize Aggregate (cost=13763.76..13763.77 rows=1 width=8)
(actual time=324.391..327.999 rows=1.00 loops=1)

-> Gather (cost=13763.54..13763.75 rows=2 width=8) (actual
time=324.052..327.989 rows=3.00 loops=1)

Workers Planned: 2

*Workers Launched: 2*

-> Partial Aggregate (cost=12763.54..12763.55 rows=1
width=8) (actual time=320.254..320.255 rows=1.00 loops=3)

-> Parallel Seq Scan on test_tab
(cost=0.00..12758.33 rows=2083 width=0) (actual time=0.029..286.410
rows=333331.67 loops=3)

Filter: (((a)::numeric >= 5.0) AND
((a)::numeric <= 999999.0))

Rows Removed by Filter: 2

Thanks & Regards,

Dipesh

Attachments:

0001-Allow-parallelism-for-plpgsql-return-expression-afte.patch (application/octet-stream)
From 75c7ac00130ea4d38444f833a5b9b743592e9591 Mon Sep 17 00:00:00 2001
From: Dipesh Dhameliya <dipeshdhameliya125@gmail.com>
Date: Mon, 5 May 2025 10:27:57 +0530
Subject: [PATCH] Allow parallelism for plpgsql return expression after commit
 556f7b7

With commit 556f7b7, we unintentionally blocked parallelism when
evaluating a plpgsql return expression, because maxtuples = 2 is passed
to exec_run_select(...) from exec_eval_expr(...) to evaluate the return
expression of the plpgsql function.

The idea of the fix is to pass maxtuples = 0 instead. This is safe
because the number of processed rows is checked later in
exec_eval_expr(...) anyway. With this change no caller of
exec_run_select(...) passes maxtuples != 0 any more, so the definition
of exec_run_select(...) is updated to remove the maxtuples argument.

Signed-off-by: Dipesh Dhameliya <dipeshdhameliya125@gmail.com>
---
 src/pl/plpgsql/src/pl_exec.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index bb99781c56e..75a589bfd59 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -385,7 +385,7 @@ static Datum exec_eval_expr(PLpgSQL_execstate *estate,
 							Oid *rettype,
 							int32 *rettypmod);
 static int	exec_run_select(PLpgSQL_execstate *estate,
-							PLpgSQL_expr *expr, long maxtuples, Portal *portalP);
+							PLpgSQL_expr *expr, Portal *portalP);
 static int	exec_for_query(PLpgSQL_execstate *estate, PLpgSQL_stmt_forq *stmt,
 						   Portal portal, bool prefetch_ok);
 static ParamListInfo setup_param_list(PLpgSQL_execstate *estate,
@@ -2181,7 +2181,7 @@ exec_stmt_perform(PLpgSQL_execstate *estate, PLpgSQL_stmt_perform *stmt)
 {
 	PLpgSQL_expr *expr = stmt->expr;
 
-	(void) exec_run_select(estate, expr, 0, NULL);
+	(void) exec_run_select(estate, expr, NULL);
 	exec_set_found(estate, (estate->eval_processed != 0));
 	exec_eval_cleanup(estate);
 
@@ -2844,7 +2844,7 @@ exec_stmt_fors(PLpgSQL_execstate *estate, PLpgSQL_stmt_fors *stmt)
 	/*
 	 * Open the implicit cursor for the statement using exec_run_select
 	 */
-	exec_run_select(estate, stmt->query, 0, &portal);
+	exec_run_select(estate, stmt->query, &portal);
 
 	/*
 	 * Execute the loop
@@ -5703,7 +5703,7 @@ exec_eval_expr(PLpgSQL_execstate *estate,
 	/*
 	 * Else do it the hard way via exec_run_select
 	 */
-	rc = exec_run_select(estate, expr, 2, NULL);
+	rc = exec_run_select(estate, expr, NULL);
 	if (rc != SPI_OK_SELECT)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
@@ -5761,7 +5761,7 @@ exec_eval_expr(PLpgSQL_execstate *estate,
  */
 static int
 exec_run_select(PLpgSQL_execstate *estate,
-				PLpgSQL_expr *expr, long maxtuples, Portal *portalP)
+				PLpgSQL_expr *expr, Portal *portalP)
 {
 	ParamListInfo paramLI;
 	int			rc;
@@ -5810,7 +5810,7 @@ exec_run_select(PLpgSQL_execstate *estate,
 	 * Execute the query
 	 */
 	rc = SPI_execute_plan_with_paramlist(expr->plan, paramLI,
-										 estate->readonly_func, maxtuples);
+										 estate->readonly_func, 0);
 	if (rc != SPI_OK_SELECT)
 	{
 		/*
-- 
2.42.0

#2 Dilip Kumar
dilipbalaut@gmail.com
In reply to: DIPESH DHAMELIYA (#1)
Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7

On Mon, May 5, 2025 at 11:19 AM DIPESH DHAMELIYA
<dipeshdhameliya125@gmail.com> wrote:

Hello everyone,

With commit 556f7b7bc18d34ddec45392965c3b3038206bb62, any plpgsql function that returns a scalar value can no longer use parallelism to evaluate its return statement. The return expression is not considered for parallel execution because we pass maxtuples = 2 to exec_run_select from exec_eval_expr when evaluating it.

I could not find commit '556f7b7bc18d34ddec45392965c3b3038206bb62' in
git log on the master branch, but here is my analysis after looking at
your patch.

I don't think we can remove the 'maxtuples' parameter from
exec_run_select(). In this particular case the query itself returns a
single tuple, so we are fine. Still, in other cases where the query
returns more tuples, it makes sense to stop execution as soon as we
have enough of them; otherwise execution continues until all the tuples
have been produced. Consider the example below, where we only need the
first tuple: with your patch applied, the executor will end up
processing all the tuples, and that will hurt performance. So IMHO,
the benefit you get by enabling parallelism in some cases may hurt
badly in other cases, as you will end up processing more tuples than
required.

CREATE OR REPLACE FUNCTION get_first_user_email()
RETURNS TEXT AS $$
DECLARE
user_email TEXT;
BEGIN
user_email = (SELECT email FROM users);
RETURN user_email;
END;
$$ LANGUAGE plpgsql;

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#3 DIPESH DHAMELIYA
dipeshdhameliya125@gmail.com
In reply to: Dilip Kumar (#2)
Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7

On Tue, May 20, 2025 at 11:57 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, May 5, 2025 at 11:19 AM DIPESH DHAMELIYA
<dipeshdhameliya125@gmail.com> wrote:

Hello everyone,

With commit 556f7b7bc18d34ddec45392965c3b3038206bb62, any plpgsql function that returns a scalar value can no longer use parallelism to evaluate its return statement. The return expression is not considered for parallel execution because we pass maxtuples = 2 to exec_run_select from exec_eval_expr when evaluating it.

I could not find commit '556f7b7bc18d34ddec45392965c3b3038206bb62' in
git log on the master branch, but here is my analysis after looking at
your patch.

Here is the GitHub link to the commit -
https://github.com/postgres/postgres/commit/556f7b7bc18d34ddec45392965c3b3038206bb62
and the discussion -
/messages/by-id/20241206062549.710dc01cf91224809dd6c0e1@sraoss.co.jp

I don't think we can remove the 'maxtuples' parameter from
exec_run_select(). In this particular case the query itself returns a
single tuple, so we are fine. Still, in other cases where the query
returns more tuples, it makes sense to stop execution as soon as we
have enough of them; otherwise execution continues until all the tuples
have been produced. Consider the example below, where we only need the
first tuple: with your patch applied, the executor will end up
processing all the tuples, and that will hurt performance. So IMHO,
the benefit you get by enabling parallelism in some cases may hurt
badly in other cases, as you will end up processing more tuples than
required.

CREATE OR REPLACE FUNCTION get_first_user_email()
RETURNS TEXT AS $$
DECLARE
user_email TEXT;
BEGIN
user_email = (SELECT email FROM users);
RETURN user_email;
END;
$$ LANGUAGE plpgsql;

I understand, but aren't we blocking parallelism for genuine cases with
a very complex query, where parallelism can help to some extent to
improve execution time? Users can always rewrite the query (for example
using a LIMIT clause) if they expect one tuple to be returned.
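
For instance, the earlier get_first_user_email() example could be
written along these lines (only a sketch of the suggested rewrite,
against the same hypothetical users table):

CREATE OR REPLACE FUNCTION get_first_user_email()
RETURNS TEXT AS $$
DECLARE
user_email TEXT;
BEGIN
-- fetch exactly one row explicitly instead of relying on the executor's
-- tuple limit; an ORDER BY would normally make "first" well defined
user_email = (SELECT email FROM users LIMIT 1);
RETURN user_email;
END;
$$ LANGUAGE plpgsql;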

#4 Dilip Kumar
dilipbalaut@gmail.com
In reply to: DIPESH DHAMELIYA (#3)
Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7

On Tue, May 20, 2025 at 1:45 PM DIPESH DHAMELIYA
<dipeshdhameliya125@gmail.com> wrote:

On Tue, May 20, 2025 at 11:57 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I don't think we can remove the 'maxtuples' parameter from
exec_run_select(). In this particular case the query itself returns a
single tuple, so we are fine. Still, in other cases where the query
returns more tuples, it makes sense to stop execution as soon as we
have enough of them; otherwise execution continues until all the tuples
have been produced. Consider the example below, where we only need the
first tuple: with your patch applied, the executor will end up
processing all the tuples, and that will hurt performance. So IMHO,
the benefit you get by enabling parallelism in some cases may hurt
badly in other cases, as you will end up processing more tuples than
required.

CREATE OR REPLACE FUNCTION get_first_user_email()
RETURNS TEXT AS $$
DECLARE
user_email TEXT;
BEGIN
user_email = (SELECT email FROM users);
RETURN user_email;
END;
$$ LANGUAGE plpgsql;

I understand, but aren't we blocking parallelism for genuine cases with
a very complex query, where parallelism can help to some extent to
improve execution time? Users can always rewrite the query (for example
using a LIMIT clause) if they expect one tuple to be returned.

IMHO, you are targeting the fix at the wrong place. Basically, if we
accept this fix, existing user functions will start performing badly in
some cases in exchange for enabling parallelism in other cases where it
helps, and it might not be acceptable to many users to change their
applications and rewrite all their procedures just to get the same
performance they were getting earlier.

I would not say that your concern is wrong: for the internal aggregate
initplan we are processing all the tuples, so logically it should be
able to use the parallel plan. So IMHO we need to target the fix at
enabling parallelism for the initplan in cases where the outer query
was given a maximum tuple count, because that limit applies to the
outer plan, not to the initplan.

--
Regards,
Dilip Kumar
Google

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dilip Kumar (#4)
Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7

Dilip Kumar <dilipbalaut@gmail.com> writes:

On Tue, May 20, 2025 at 1:45 PM DIPESH DHAMELIYA
<dipeshdhameliya125@gmail.com> wrote:

I understand, but aren't we blocking parallelism for genuine cases with
a very complex query, where parallelism can help to some extent to
improve execution time? Users can always rewrite the query (for example
using a LIMIT clause) if they expect one tuple to be returned.

IMHO, you are targeting the fix at the wrong place. Basically, if we
accept this fix, existing user functions will start performing badly in
some cases in exchange for enabling parallelism in other cases where it
helps, and it might not be acceptable to many users to change their
applications and rewrite all their procedures just to get the same
performance they were getting earlier.

I noticed this patch in the commitfest. I agree it's a bit
unfortunate that that bug fix disabled parallelism for queries issued
via exec_eval_expr. There isn't a lot of choice if we retain
the maxtuples = 2 setting, since (as noted in the thread leading up
to that commit) the executor can't safely use parallelism alongside
a return-tuple-limit setting. So it was outright unsafe before.

However ... I'm not sure I buy the argument that removing the
maxtuples = 2 bit will be problematic for performance. If the query
returns more than one tuple then exec_eval_expr is going to throw
an error, so do we really need to optimize getting to that error?
Especially at the cost of pessimizing other, actually-useful cases?

As for the patch itself, I'm not sure about removing the maxtuples
argument altogether, since it seems to me that we might still have
some use for maxtuples limits in plpgsql in future. A one-liner
change to make exec_eval_expr pass 0 seems sufficient.

I would not say that your concern is wrong: for the internal aggregate
initplan we are processing all the tuples, so logically it should be
able to use the parallel plan. So IMHO we need to target the fix at
enabling parallelism for the initplan in cases where the outer query
was given a maximum tuple count, because that limit applies to the
outer plan, not to the initplan.

There's no initplan in the given test case, so I don't see how
that idea is going to fix it. Also, allowing initplans to begin
parallelism when the outer query isn't using parallelism seems
like it'd be fraught with problems. It certainly couldn't be
something we'd back-patch.

regards, tom lane

#6 Dilip Kumar
dilipbalaut@gmail.com
In reply to: Tom Lane (#5)
Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7

On Fri, Jul 4, 2025 at 1:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dilip Kumar <dilipbalaut@gmail.com> writes:

On Tue, May 20, 2025 at 1:45 PM DIPESH DHAMELIYA
<dipeshdhameliya125@gmail.com> wrote:

I understand, but aren't we blocking parallelism for genuine cases with
a very complex query, where parallelism can help to some extent to
improve execution time? Users can always rewrite the query (for example
using a LIMIT clause) if they expect one tuple to be returned.

IMHO, you are targeting the fix at the wrong place. Basically, if we
accept this fix, existing user functions will start performing badly in
some cases in exchange for enabling parallelism in other cases where it
helps, and it might not be acceptable to many users to change their
applications and rewrite all their procedures just to get the same
performance they were getting earlier.

I noticed this patch in the commitfest. I agree it's a bit
unfortunate that that bug fix disabled parallelism for queries issued
via exec_eval_expr. There isn't a lot of choice if we retain
the maxtuples = 2 setting, since (as noted in the thread leading up
to that commit) the executor can't safely use parallelism alongside
a return-tuple-limit setting. So it was outright unsafe before.

However ... I'm not sure I buy the argument that removing the
maxtuples = 2 bit will be problematic for performance. If the query
returns more than one tuple then exec_eval_expr is going to throw
an error, so do we really need to optimize getting to that error?
Especially at the cost of pessimizing other, actually-useful cases?

I was thinking about the cases where the query returns just 1 row but
has to scan the entire table, but after thinking about it more, it will
behave the same either way: since we are passing maxtuples = 2, not 1,
it has to scan the entire table anyway in the non-error cases.

As for the patch itself, I'm not sure about removing the maxtuples
argument altogether, since it seems to me that we might still have
some use for maxtuples limits in plpgsql in future. A one-liner
change to make exec_eval_expr pass 0 seems sufficient.

I would not say that your concern is wrong: for the internal aggregate
initplan we are processing all the tuples, so logically it should be
able to use the parallel plan. So IMHO we need to target the fix at
enabling parallelism for the initplan in cases where the outer query
was given a maximum tuple count, because that limit applies to the
outer plan, not to the initplan.

There's no initplan in the given test case, so I don't see how
that idea is going to fix it. Also, allowing initplans to begin
parallelism when the outer query isn't using parallelism seems
like it'd be fraught with problems. It certainly couldn't be
something we'd back-patch.

If you look at the original problematic case[1] shown by Dipesh, the
parallel plan is selected for InitPlan 1, but it could not launch the
workers because of the check[2] below in ExecutePlan. So my concern was
that the "max tuple" limit was passed for the top-level plan, yet
parallelism is restricted for the InitPlan as well.

If I understand the code comments correctly, they state that "Parallel
mode only supports complete execution of a plan." Given that
"InitPlan1" does run to completion, it seems the issue here is that
when the top-level plan isn't fully executed, it restricts parallelism
for its sub-plans or init-plans, even if those sub-plans are running
to completion.

It seems the straightforward solution might be what Dipesh suggested.
I initially opposed it, fearing it would negatively impact the
performance of non-error cases, but I now realize that's not the case.
Furthermore, I believe that if we avoid passing maxtuples = 2, it could
also enable parallelism for other scenarios, such as when the main
plan returns only a single tuple but requires a full table scan. I
believe that last scenario is precisely what you had in mind.

I see two distinct issues at play here. First, parallelism is blocked
because maxtuples = 2 is being passed from exec_eval_expr(). Second, if
a maximum tuple limit is applied to the parent plan, it inadvertently
restricts parallelism for the init plan as well, even when that init
plan needs to execute fully.

If we address the first issue, where maxtuples = 2 is passed from
exec_eval_expr(), it will resolve both problems in this specific
scenario. However, the second problem, where limiting the tuple count
on the top-level plan impacts the init plan's parallelism, will persist
in other cases.

[1]:
Query Text: (select count(*) from test_tab where a between 5.0 and 999999.0)
Result (cost=13763.77..13763.78 rows=1 width=8) (actual
time=912.108..912.110 rows=1 loops=1)
InitPlan 1
-> Finalize Aggregate (cost=13763.76..13763.77 rows=1
width=8) (actual time=912.103..912.104 rows=1 loops=1)
-> Gather (cost=13763.54..13763.75 rows=2 width=8)
(actual time=912.096..912.098 rows=1 loops=1)
Workers Planned: 2
Workers Launched: 0
-> Partial Aggregate (cost=12763.54..12763.55
rows=1 width=8) (actual time=912.095..912.096 rows=1 loops=1)
-> Parallel Seq Scan on test_tab
(cost=0.00..12758.33 rows=2083 width=0) (actual time=0.022..812.253
rows=999995 loops=1)
Filter: (((a)::numeric >= 5.0) AND
((a)::numeric <= 999999.0))
Rows Removed by Filter: 5

[2]:
ExecutePlan()
{
....

/*
* Set up parallel mode if appropriate.
*
* Parallel mode only supports complete execution of a plan. If we've
* already partially executed it, or if the caller asks us to exit early,
* we must force the plan to run without parallelism.
*/
if (queryDesc->already_executed || numberTuples != 0)
use_parallel_mode = false;
else
use_parallel_mode = queryDesc->plannedstmt->parallelModeNeeded;
}

--
Regards,
Dilip Kumar
Google

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dilip Kumar (#6)
Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7

Dilip Kumar <dilipbalaut@gmail.com> writes:

On Fri, Jul 4, 2025 at 1:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

There's no initplan in the given test case, so I don't see how
that idea is going to fix it. Also, allowing initplans to begin
parallelism when the outer query isn't using parallelism seems
like it'd be fraught with problems. It certainly couldn't be
something we'd back-patch.

If you look at the original problematic case[1] shown by Dipesh, the
parallel plan is selected for InitPlan 1, but it could not launch the
workers because of the check[2] below in ExecutePlan. So my concern was
that the "max tuple" limit was passed for the top-level plan, yet
parallelism is restricted for the InitPlan as well.

Oh, looking back, I see that indeed the original example happened to
produce a plan that involves an initplan. But there are plenty of
variants that won't, such as the arguably-more-idiomatic

return count(*) from test_tab where a between 5.0 and 999999.0;

So I think your emphasis on improving that case is misplaced to start
with.
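
In function form that variant would look something like the sketch
below (the function name is made up; test_tab is the table from the
original repro):

create or replace function test_plpgsql_direct() returns int
language plpgsql
as
$$
begin
-- no parenthesized scalar subquery, so the resulting plan has no InitPlan
return count(*) from test_tab where a between 5.0 and 999999.0;
end;
$$;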

If I understand the code comments correctly, they state that "Parallel
mode only supports complete execution of a plan." Given that
"InitPlan1" does run to completion, it seems the issue here is that
when the top-level plan isn't fully executed, it restricts parallelism
for its sub-plans or init-plans, even if those sub-plans are running
to completion.

I repeat that I don't think "allow an initplan to run in parallel even
if the outer query can't" is a particularly sane way to tackle this
problem. For starters, ExecutorRun calls EnterParallelMode() and
ExitParallelMode() globally for the entire query. We'd have to
rethink how that works and when it would be safe to call those.
Using the query-global estate->es_use_parallel_mode flag wouldn't work
either. Thought would also be needed about which can't-do-parallelism
restrictions would be safe to override. I agree that an outer
numberTuples restriction needn't be considered, but I'm not sure that
any others could be ignored.

So on the whole it seems like a research project requiring nontrivial
effort and probably yielding only marginal gains. It's certainly not
going to yield a back-patchable fix for this performance regression.

regards, tom lane

#8 Dilip Kumar
dilipbalaut@gmail.com
In reply to: Tom Lane (#7)
Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7

On Fri, Jul 4, 2025 at 9:56 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dilip Kumar <dilipbalaut@gmail.com> writes:

On Fri, Jul 4, 2025 at 1:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

There's no initplan in the given test case, so I don't see how
that idea is going to fix it. Also, allowing initplans to begin
parallelism when the outer query isn't using parallelism seems
like it'd be fraught with problems. It certainly couldn't be
something we'd back-patch.

If you look at the original problematic case[1] shown by Dipesh, the
parallel plan is selected for InitPlan 1, but it could not launch the
workers because of the check[2] below in ExecutePlan. So my concern was
that the "max tuple" limit was passed for the top-level plan, yet
parallelism is restricted for the InitPlan as well.

Oh, looking back, I see that indeed the original example happened to
produce a plan that involves an initplan. But there are plenty of
variants that won't, such as the arguably-more-idiomatic

return count(*) from test_tab where a between 5.0 and 999999.0;

So I think your emphasis on improving that case is misplaced to start
with.

If I understand the code comments correctly, they state that "Parallel
mode only supports complete execution of a plan." Given that
"InitPlan1" does run to completion, it seems the issue here is that
when the top-level plan isn't fully executed, it restricts parallelism
for its sub-plans or init-plans, even if those sub-plans are running
to completion.

I repeat that I don't think "allow an initplan to run in parallel even
if the outer query can't" is a particularly sane way to tackle this
problem. For starters, ExecutorRun calls EnterParallelMode() and
ExitParallelMode() globally for the entire query. We'd have to
rethink how that works and when it would be safe to call those.
Using the query-global estate->es_use_parallel_mode flag wouldn't work
either. Thought would also be needed about which can't-do-parallelism
restrictions would be safe to override. I agree that an outer
numberTuples restriction needn't be considered, but I'm not sure that
any others could be ignored.

So on the whole it seems like a research project requiring nontrivial
effort and probably yielding only marginal gains. It's certainly not
going to yield a back-patchable fix for this performance regression.

Yeah that's correct.

--
Regards,
Dilip Kumar
Google

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dilip Kumar (#8)
Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7

Dilip Kumar <dilipbalaut@gmail.com> writes:

On Fri, Jul 4, 2025 at 9:56 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

So on the whole it seems like a research project requiring nontrivial
effort and probably yielding only marginal gains. It's certainly not
going to yield a back-patchable fix for this performance regression.

Yeah that's correct.

OK. I pushed the one-liner (plus comment) fix. If anyone does
feel like looking into allowing subplans to run in parallel in
this context, it's not stopping them from doing so.

regards, tom lane