parallel "return query" is no good

Started by Robert Haas, almost 9 years ago, 7 messages

robertmhaas@gmail.com

almost 9 years ago

1 attachment(s)

Commit 7aea8e4f2daa4b39ca9d1309a0c4aadb0f7ed81b allowed a parallel
plan to be generated for a RETURN QUERY or RETURN QUERY EXECUTE
statement in a PL/pgSQL block. As it turns out, the analysis that led
to this decision was totally wrong-headed, because the plan will
always be executed using SPI_cursor_fetch(portal, true, 50), which
will cause ExecutePlan() to get invoked with a count of 50, which will
cause it to run the parallel plan serially, without workers.
Therefore, passing CURSOR_OPT_PARALLEL_OK is a bad idea here; all it
can do is cause us to pick a parallel plan that's slow when executed
serially instead of the best serial plan.
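
To make the chain of reasoning concrete, here is a minimal sketch of the decision described above. It is a simplified model, not PostgreSQL source: the function names and the macro are hypothetical, and the only facts taken from this message are the fetch size of 50 and the rule that a nonzero count makes the plan run without workers.

```c
#include <assert.h>
#include <stdbool.h>

/* Fetch size used when draining the RETURN QUERY portal, per the message above. */
#define RETURN_QUERY_FETCH_COUNT 50

/*
 * Hypothetical stand-in for the executor's choice.  A nonzero count means
 * the caller may stop early, so the plan is run without workers even if
 * the planner produced a parallel plan.
 */
bool
runs_with_workers(bool plan_is_parallel, long fetch_count)
{
	if (fetch_count != 0)
		return false;
	return plan_is_parallel;
}

/* RETURN QUERY always fetches in batches, so the count is never zero. */
bool
return_query_gets_workers(bool plan_is_parallel)
{
	return runs_with_workers(plan_is_parallel, RETURN_QUERY_FETCH_COUNT);
}
```

In this model return_query_gets_workers() is false for every input, which is why asking the planner for a parallel plan on this path can only hurt.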

The attached patch fixes it. I plan to commit this and back-patch it
to 9.6, barring objections or better ideas.

I previously remarked on this in
/messages/by-id/CA+TgmobXEhvHbJtWDuPZM9bVSLiTj-kShxQJ2uM5GPDze9fRYA@mail.gmail.com
but I wasn't quite so clear what the whole picture was in that email
as I am now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

no-parallel-return-query.patch (application/octet-stream)

diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 49a4e62..8e836a8 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -3023,7 +3023,7 @@ exec_stmt_return_query(PLpgSQL_execstate *estate,
 	if (stmt->query != NULL)
 	{
 		/* static query */
-		exec_run_select(estate, stmt->query, 0, &portal, true);
+		exec_run_select(estate, stmt->query, 0, &portal, false);
 	}
 	else
 	{
@@ -3031,7 +3031,7 @@ exec_stmt_return_query(PLpgSQL_execstate *estate,
 		Assert(stmt->dynquery != NULL);
 		portal = exec_dynquery_with_params(estate, stmt->dynquery,
 										   stmt->params, NULL,
-										   CURSOR_OPT_PARALLEL_OK);
+										   0);
 	}
 
 	/* Use eval_mcontext for tuple conversion work */
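
The two hunks above amount to the same change: neither the static nor the dynamic path asks for a parallel plan any more. The sketch below models that under assumptions. It assumes the boolean passed to exec_run_select() is turned into a cursor option bit internally, and every name and the bit value here are placeholders for illustration, not PostgreSQL's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit for illustration only; the real value lives in PostgreSQL's headers. */
#define DEMO_CURSOR_OPT_PARALLEL_OK 0x0100

/* Static-query path (assumed): the boolean argument becomes a cursor option. */
int
demo_static_query_options(bool parallel_ok)
{
	return parallel_ok ? DEMO_CURSOR_OPT_PARALLEL_OK : 0;
}

/* The planner may consider a parallel plan only when the bit is set. */
bool
demo_planner_may_go_parallel(int cursor_options)
{
	return (cursor_options & DEMO_CURSOR_OPT_PARALLEL_OK) != 0;
}
```

Before the patch the static path behaves like demo_static_query_options(true); after it, both paths pass options equivalent to 0, so only serial plans are considered.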

Robert Haas

robertmhaas@gmail.com

almost 9 years ago

In reply to: Robert Haas (#1)

Re: parallel "return query" is no good

On Thu, Mar 23, 2017 at 12:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Commit 7aea8e4f2daa4b39ca9d1309a0c4aadb0f7ed81b allowed a parallel
plan to be generated for a RETURN QUERY or RETURN QUERY EXECUTE
statement in a PL/pgSQL block. As it turns out, the analysis that led
to this decision was totally wrong-headed, because the plan will
always be executed using SPI_cursor_fetch(portal, true, 50), which
will cause ExecutePlan() to get invoked with a count of 50, which will
cause it to run the parallel plan serially, without workers.
Therefore, passing CURSOR_OPT_PARALLEL_OK is a bad idea here; all it
can do is cause us to pick a parallel plan that's slow when executed
serially instead of the best serial plan.

The attached patch fixes it. I plan to commit this and back-patch it
to 9.6, barring objections or better ideas.

I guess the downside of back-patching this is that it could cause a
plan change for somebody which ends up being worse. On the whole,
serial execution of queries intended to be run in parallel isn't
likely to work out well, but it's always possible somebody has a case
where it happens to be winning, and this could break it. So maybe I
should do this only in master? Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Stephen Frost

sfrost@snowman.net

almost 9 years ago

In reply to: Robert Haas (#2)

Re: parallel "return query" is no good

* Robert Haas (robertmhaas@gmail.com) wrote:

I guess the downside of back-patching this is that it could cause a
plan change for somebody which ends up being worse. On the whole,
serial execution of queries intended to be run in parallel isn't
likely to work out well, but it's always possible somebody has a case
where it happens to be winning, and this could break it. So maybe I
should do this only in master? Thoughts?

For my 2c, I'd back-patch it.

Thanks!

Stephen

Andres Freund

andres@anarazel.de

almost 9 years ago

In reply to: Robert Haas (#2)

Re: parallel "return query" is no good

On 2017-03-23 13:03:19 -0400, Robert Haas wrote:

I guess the downside of back-patching this is that it could cause a
plan change for somebody which ends up being worse. On the whole,
serial execution of queries intended to be run in parallel isn't
likely to work out well, but it's always possible somebody has a case
where it happens to be winning, and this could break it. So maybe I
should do this only in master? Thoughts?

I'm +0.5 for backpatching.

- Andres

Joshua D. Drake

jd@commandprompt.com

almost 9 years ago

In reply to: Robert Haas (#2)

Re: parallel "return query" is no good

On 03/23/2017 10:03 AM, Robert Haas wrote:

I guess the downside of back-patching this is that it could cause a
plan change for somebody which ends up being worse. On the whole,
serial execution of queries intended to be run in parallel isn't
likely to work out well, but it's always possible somebody has a case
where it happens to be winning, and this could break it. So maybe I
should do this only in master? Thoughts?

I think the greater good of a fix applies here. +1 to 9.6.

--
Command Prompt, Inc. http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.

Alvaro Herrera

alvherre@2ndquadrant.com

almost 9 years ago

In reply to: Robert Haas (#2)

Re: parallel "return query" is no good

Robert Haas wrote:

I guess the downside of back-patching this is that it could cause a
plan change for somebody which ends up being worse. On the whole,
serial execution of queries intended to be run in parallel isn't
likely to work out well, but it's always possible somebody has a case
where it happens to be winning, and this could break it. So maybe I
should do this only in master? Thoughts?

I think the chances of someone depending on a parallel plan that, run
serially by accident, beats the non-parallel plan are pretty slim.

+1 for back-patching.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Robert Haas

robertmhaas@gmail.com

almost 9 years ago

In reply to: Alvaro Herrera (#6)

Re: parallel "return query" is no good

On Thu, Mar 23, 2017 at 1:53 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Robert Haas wrote:

I guess the downside of back-patching this is that it could cause a
plan change for somebody which ends up being worse. On the whole,
serial execution of queries intended to be run in parallel isn't
likely to work out well, but it's always possible somebody has a case
where it happens to be winning, and this could break it. So maybe I
should do this only in master? Thoughts?

I think the chances of someone depending on a parallel plan that, run
serially by accident, beats the non-parallel plan are pretty slim.

+1 for back-patching.

All right, done.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
