Prepared Statement support for Parallel query

Started by Amit Kapila almost 10 years ago, 10 messages

amit.kapila16@gmail.com

almost 10 years ago

2 attachment(s)

Commit d1b7c1ffe72e86932b5395f29e006c3f503bc53d added
support for passing bind parameters to parallel workers. However,
prepared statements that use bind parameters weren't enabled
for parallel query, because the 'Execute' message in the FE-BE protocol
can pass a row_count, which can make parallel plans unusable
(parallel plans are only possible when the query can run to completion).

Later, commit bfc78d7196eb28cd4e3d6c24f7e607bacecf1129
ensured that if the row_count is non-zero we won't enter
parallel mode, which means that even if a parallel plan is selected
by the optimizer, such a plan will be run locally.

With the above support, it was just a matter of enabling parallel
mode for prepared statements, which is done in the attached patch
(prepared_stmt_parallel_query_v1.patch).
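
For readers following along, the two gates described above can be modelled in a few lines of C. This is only a sketch under stated assumptions, not code from the PostgreSQL tree: every `sketch_`/`SKETCH_` name is hypothetical, and the flag value is illustrative rather than copied from the headers. It shows that the planner considers a parallel plan only when the caller passes the parallel-OK cursor option, and that the executor still falls back to local execution whenever a non-zero row count is supplied.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative value only; the real CURSOR_OPT_PARALLEL_OK lives in the PostgreSQL headers. */
#define SKETCH_CURSOR_OPT_PARALLEL_OK 0x0100

/* Planner side: a parallel plan is considered only if the caller opted in. */
bool
sketch_planner_may_go_parallel(int cursor_options)
{
    return (cursor_options & SKETCH_CURSOR_OPT_PARALLEL_OK) != 0;
}

/* Executor side: a non-zero row count means we might exit early, so run locally. */
bool
sketch_executor_uses_parallel_mode(bool plan_needs_parallel_mode, long row_count)
{
    if (row_count != 0)
        return false;
    return plan_needs_parallel_mode;
}

/* Small walk-through of the cases discussed in the message. */
void
sketch_gate_demo(void)
{
    /* Before the patch: cursor options were 0, so no parallel plan was considered. */
    assert(!sketch_planner_may_go_parallel(0));
    /* After the patch: the option is passed, and row_count == 0 allows parallel mode. */
    assert(sketch_planner_may_go_parallel(SKETCH_CURSOR_OPT_PARALLEL_OK));
    assert(sketch_executor_uses_parallel_mode(true, 0));
    /* An Execute message with a row limit still runs the plan locally. */
    assert(!sketch_executor_uses_parallel_mode(true, 10));
}
```

Keeping the two checks separate mirrors the split in the message: the patch only touches the first gate, while the second one was already in place.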

I have tested that parallel plans are generated both
via PREPARE/EXECUTE statements and via libpq prepared
statement execution. Attached is a libpq program
(prepare_parallel_query.c) which I used for testing prepared
statement support. I verified manually
(using auto_explain) that parallel plans get generated
and executed via this libpq program. The program expects some
data to be generated beforehand; the details are
in its file header.

Thoughts?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

prepared_stmt_parallel_query_v1.patch (application/octet-stream)

diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index cec37ce..b01051d 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -159,7 +159,7 @@ PrepareQuery(PrepareStmt *stmt, const char *queryString)
 					   nargs,
 					   NULL,
 					   NULL,
-					   0,		/* default cursor options */
+					   CURSOR_OPT_PARALLEL_OK,	/* allow parallel mode */
 					   true);	/* fixed result */
 
 	/*
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 390816b..115166b 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1381,7 +1381,7 @@ exec_parse_message(const char *query_string,	/* string to execute */
 					   numParams,
 					   NULL,
 					   NULL,
-					   0,		/* default cursor options */
+					   CURSOR_OPT_PARALLEL_OK,	/* allow parallel mode */
 					   true);	/* fixed result */
 
 	/* If we got a cancel signal during analysis, quit */

prepare_parallel_query.c (text/x-csrc; charset=US-ASCII)

Robert Haas

robertmhaas@gmail.com

almost 10 years ago

In reply to: Amit Kapila (#1)

Re: Prepared Statement support for Parallel query

On Wed, Feb 17, 2016 at 6:41 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

Commit d1b7c1ffe72e86932b5395f29e006c3f503bc53d added
support for passing bind parameters to parallel workers. However,
prepared statements that use bind parameters weren't enabled
for parallel query, because the 'Execute' message in the FE-BE protocol
can pass a row_count, which can make parallel plans unusable
(parallel plans are only possible when the query can run to completion).

Later, commit bfc78d7196eb28cd4e3d6c24f7e607bacecf1129
ensured that if the row_count is non-zero we won't enter
parallel mode, which means that even if a parallel plan is selected
by the optimizer, such a plan will be run locally.

With the above support, it was just a matter of enabling parallel
mode for prepared statements, which is done in the attached patch
(prepared_stmt_parallel_query_v1.patch).

I have tested that parallel plans are generated both
via PREPARE/EXECUTE statements and via libpq prepared
statement execution. Attached is a libpq program
(prepare_parallel_query.c) which I used for testing prepared
statement support. I verified manually
(using auto_explain) that parallel plans get generated
and executed via this libpq program. The program expects some
data to be generated beforehand; the details are
in its file header.

Hmm. I agree we should change exec_parse_message like this, but
changing PrepareQuery seems wrong. I mean, there's a very good chance
that a parse message will be followed by an Execute message with a
zero row count, so we'll get parallel execution. But if the user says
they want to PREPARE the query, they are probably not going to fetch
all rows.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Amit Kapila

amit.kapila16@gmail.com

almost 10 years ago

In reply to: Robert Haas (#2)

Re: Prepared Statement support for Parallel query

On Wed, Feb 24, 2016 at 7:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Feb 17, 2016 at 6:41 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

Commit d1b7c1ffe72e86932b5395f29e006c3f503bc53d added
support for passing bind parameters to parallel workers. However,
prepared statements that use bind parameters weren't enabled
for parallel query, because the 'Execute' message in the FE-BE protocol
can pass a row_count, which can make parallel plans unusable
(parallel plans are only possible when the query can run to completion).

Later, commit bfc78d7196eb28cd4e3d6c24f7e607bacecf1129
ensured that if the row_count is non-zero we won't enter
parallel mode, which means that even if a parallel plan is selected
by the optimizer, such a plan will be run locally.

With the above support, it was just a matter of enabling parallel
mode for prepared statements, which is done in the attached patch
(prepared_stmt_parallel_query_v1.patch).

I have tested that parallel plans are generated both
via PREPARE/EXECUTE statements and via libpq prepared
statement execution. Attached is a libpq program
(prepare_parallel_query.c) which I used for testing prepared
statement support. I verified manually
(using auto_explain) that parallel plans get generated
and executed via this libpq program. The program expects some
data to be generated beforehand; the details are
in its file header.

Hmm. I agree we should change exec_parse_message like this, but
changing PrepareQuery seems wrong. I mean, there's a very good chance
that a parse message will be followed by an Execute message with a
zero row count, so we'll get parallel execution.

Yes, and I think libpq doesn't even provide an interface to specify a row
count for prepared statements.

But if the user says
they want to PREPARE the query, they are probably not going to fetch
all rows.

After PREPARE, the user will execute the statement using EXECUTE, and
I don't see how the user can specify a number of rows to fetch that would
influence the execution. Can you please elaborate on your point
and on what you expect here?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Robert Haas

robertmhaas@gmail.com

almost 10 years ago

In reply to: Amit Kapila (#4)

Re: Prepared Statement support for Parallel query

On Thu, Feb 25, 2016 at 8:53 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

But if the user says
they want to PREPARE the query, they are probably not going to fetch
all rows.

After PREPARE, the user will execute the statement using EXECUTE, and
I don't see how the user can specify a number of rows to fetch that would
influence the execution. Can you please elaborate on your point
and on what you expect here?

Argh. I'm getting confused between prepared statements and cursors.
So if the user does PREPARE followed by EXECUTE, then that is OK. The
problem is only if they use DECLARE .. CURSOR FOR, which your patch
doesn't affect.

So, committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Robert Haas

robertmhaas@gmail.com

almost 10 years ago

In reply to: Robert Haas (#5)

Re: Prepared Statement support for Parallel query

On Thu, Feb 25, 2016 at 1:09 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Feb 25, 2016 at 8:53 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

But if the user says
they want to PREPARE the query, they are probably not going to fetch
all rows.

After PREPARE, the user will execute the statement using EXECUTE, and
I don't see how the user can specify a number of rows to fetch that would
influence the execution. Can you please elaborate on your point
and on what you expect here?

Argh. I'm getting confused between prepared statements and cursors.
So if the user does PREPARE followed by EXECUTE, then that is OK. The
problem is only if they use DECLARE .. CURSOR FOR, which your patch
doesn't affect.

So, committed.

And, I'm going to revert this part. If you'd run the regression tests
under force_parallel_mode=regress, max_parallel_degree>0, you would
have noticed that this part breaks it, because of CREATE TABLE ... AS
EXECUTE.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Amit Kapila

amit.kapila16@gmail.com

almost 10 years ago

In reply to: Robert Haas (#6)

Re: Prepared Statement support for Parallel query

On Fri, Feb 26, 2016 at 4:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Feb 25, 2016 at 1:09 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Feb 25, 2016 at 8:53 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

But if the user says
they want to PREPARE the query, they are probably not going to fetch
all rows.

After PREPARE, the user will execute the statement using EXECUTE, and
I don't see how the user can specify a number of rows to fetch that would
influence the execution. Can you please elaborate on your point
and on what you expect here?

Argh. I'm getting confused between prepared statements and cursors.
So if the user does PREPARE followed by EXECUTE, then that is OK. The
problem is only if they use DECLARE .. CURSOR FOR, which your patch
doesn't affect.

So, committed.

And, I'm going to revert this part. If you'd run the regression tests
under force_parallel_mode=regress, max_parallel_degree>0, you would
have noticed that this part breaks it, because of CREATE TABLE ... AS
EXECUTE.

I will look into it.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Amit Kapila

amit.kapila16@gmail.com

almost 10 years ago

In reply to: Robert Haas (#6)

1 attachment(s)

Re: Prepared Statement support for Parallel query

On Fri, Feb 26, 2016 at 4:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:

And, I'm going to revert this part. If you'd run the regression tests
under force_parallel_mode=regress, max_parallel_degree>0, you would
have noticed that this part breaks it, because of CREATE TABLE ... AS
EXECUTE.

I have looked into this issue and found that the reason for the failure is
that with force_parallel_mode=regress, we enable parallel mode restrictions
even if a parallel plan is not chosen, as part of the code below in
standard_planner():

    if (force_parallel_mode == FORCE_PARALLEL_OFF || !glob->parallelModeOK)
    {
        glob->parallelModeNeeded = false;
        glob->wholePlanParallelSafe = false;    /* either false or don't care */
    }
    else
    {
        glob->parallelModeNeeded = true;
        glob->wholePlanParallelSafe =
            !has_parallel_hazard((Node *) parse, false);
    }

The failure cases fall into that category: wholePlanParallelSafe
will be false, but parallelModeNeeded will be true, which enables
parallel mode restrictions even though the plan won't contain a Gather node.
I think if we want to operate this way for testing purposes, then during
execution we need to ensure that statements for non-read-only operations
do not enter parallel mode, similar to what we do for the
non-zero tuple count case. The attached patch fixes the problem.
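
To make the failing combination concrete, here is a minimal stand-alone model of the branch quoted from standard_planner(). It is a sketch under assumptions rather than the planner's actual code: the `Sketch`-prefixed types are hypothetical stand-ins for the real planner state, and the boolean `query_has_hazard` replaces the call to has_parallel_hazard().

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the force_parallel_mode setting and the planner's global flags. */
typedef enum
{
    SKETCH_FORCE_PARALLEL_OFF,
    SKETCH_FORCE_PARALLEL_ON,
    SKETCH_FORCE_PARALLEL_REGRESS
} SketchForceParallelMode;

typedef struct
{
    bool parallelModeOK;
    bool parallelModeNeeded;
    bool wholePlanParallelSafe;
} SketchPlannerGlobal;

/* Mirrors the quoted branch; query_has_hazard stands in for has_parallel_hazard(). */
void
sketch_set_parallel_flags(SketchPlannerGlobal *glob,
                          SketchForceParallelMode force,
                          bool query_has_hazard)
{
    if (force == SKETCH_FORCE_PARALLEL_OFF || !glob->parallelModeOK)
    {
        glob->parallelModeNeeded = false;
        glob->wholePlanParallelSafe = false;
    }
    else
    {
        glob->parallelModeNeeded = true;
        glob->wholePlanParallelSafe = !query_has_hazard;
    }
}

/* The problematic state: restrictions are enabled although the plan is not parallel safe. */
bool
sketch_restricted_without_parallel_plan(const SketchPlannerGlobal *glob)
{
    return glob->parallelModeNeeded && !glob->wholePlanParallelSafe;
}

/* Walk-through of the case described in the message. */
void
sketch_planner_demo(void)
{
    SketchPlannerGlobal glob = { true, false, false };

    sketch_set_parallel_flags(&glob, SKETCH_FORCE_PARALLEL_REGRESS, true);
    assert(sketch_restricted_without_parallel_plan(&glob));
}
```

In this model, a statement planned with parallelModeOK set and a parallel hazard present ends up with parallel mode restrictions switched on but no parallel-safe plan, which is the state the message describes.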

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

prepared_stmt_parallel_query_v2.patch (application/octet-stream)

diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index cec37ce..69a540e 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -159,7 +159,7 @@ PrepareQuery(PrepareStmt *stmt, const char *queryString)
 					   nargs,
 					   NULL,
 					   NULL,
-					   0,		/* default cursor options */
+					   CURSOR_OPT_PARALLEL_OK,		/* allow parallel mode */
 					   true);	/* fixed result */
 
 	/*
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 6872562..368c952 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1539,10 +1539,16 @@ ExecutePlan(EState *estate,
 	estate->es_direction = direction;
 
 	/*
-	 * If a tuple count was supplied, we must force the plan to run without
-	 * parallelism, because we might exit early.
-	 */
-	if (numberTuples)
+	 * Force the plan to run without parallelism, if the operation was not
+	 * a read-only operation or if a tuple count was supplied.  We can't run
+	 * in parallel mode if tuple count was supplied, because we might exit
+	 * early.
+	 */
+	if (numberTuples ||
+		dest->mydest == DestIntoRel ||
+		dest->mydest == DestCopyOut ||
+		dest->mydest == DestTransientRel ||
+		dest->mydest == DestTuplestore)
 		use_parallel_mode = false;
 
 	/*
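
The condition added to ExecutePlan() by the v2 patch can be read as a single predicate. The sketch below models it outside the executor; it is an assumption-laden illustration, not the patched function itself. The `SketchDest` enum is a hypothetical subset whose members are named after the destinations listed in the patch, and `plan_needs_parallel_mode` stands in for the value the executor is handed before the check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of destination kinds, named after those tested in the patch. */
typedef enum
{
    SKETCH_DEST_REMOTE,         /* rows are sent back to the client */
    SKETCH_DEST_INTO_REL,       /* rows are stored in a new table, as in CREATE TABLE ... AS */
    SKETCH_DEST_COPY_OUT,
    SKETCH_DEST_TRANSIENT_REL,
    SKETCH_DEST_TUPLESTORE
} SketchDest;

/* Models the v2 test: a tuple count or one of the listed destinations forces local execution. */
bool
sketch_v2_use_parallel_mode(bool plan_needs_parallel_mode,
                            long number_tuples,
                            SketchDest dest)
{
    if (number_tuples != 0 ||
        dest == SKETCH_DEST_INTO_REL ||
        dest == SKETCH_DEST_COPY_OUT ||
        dest == SKETCH_DEST_TRANSIENT_REL ||
        dest == SKETCH_DEST_TUPLESTORE)
        return false;
    return plan_needs_parallel_mode;
}

/* Walk-through: the CREATE TABLE ... AS EXECUTE case no longer enters parallel mode. */
void
sketch_v2_demo(void)
{
    assert(!sketch_v2_use_parallel_mode(true, 0, SKETCH_DEST_INTO_REL));
    assert(sketch_v2_use_parallel_mode(true, 0, SKETCH_DEST_REMOTE));
}
```

Note that the predicate enumerates destinations rather than asking whether the statement is read-only, so any destination not listed is treated as safe by default.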

Robert Haas

robertmhaas@gmail.com

almost 10 years ago

In reply to: Amit Kapila (#8)

Re: Prepared Statement support for Parallel query

On Mon, Mar 14, 2016 at 9:18 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Feb 26, 2016 at 4:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:

And, I'm going to revert this part. If you'd run the regression tests
under force_parallel_mode=regress, max_parallel_degree>0, you would
have noticed that this part breaks it, because of CREATE TABLE ... AS
EXECUTE.

I have looked into this issue and found that the reason for the failure is
that with force_parallel_mode=regress, we enable parallel mode restrictions
even if a parallel plan is not chosen, as part of the code below in
standard_planner():

    if (force_parallel_mode == FORCE_PARALLEL_OFF || !glob->parallelModeOK)
    {
        glob->parallelModeNeeded = false;
        glob->wholePlanParallelSafe = false;    /* either false or don't care */
    }
    else
    {
        glob->parallelModeNeeded = true;
        glob->wholePlanParallelSafe =
            !has_parallel_hazard((Node *) parse, false);
    }

The failure cases fall into that category: wholePlanParallelSafe
will be false, but parallelModeNeeded will be true, which enables
parallel mode restrictions even though the plan won't contain a Gather node.
I think if we want to operate this way for testing purposes, then during
execution we need to ensure that statements for non-read-only operations
do not enter parallel mode, similar to what we do for the
non-zero tuple count case. The attached patch fixes the problem.

This seems like a really ugly fix. It might be possible to come up
with a fix along these lines, but I don't have much confidence in the
specific new test you've injected into ExecutePlan(). Broadly, any
change of this type implicitly changes the contract between
ExecutePlan() and the planner infrastructure - the planner can now
legally generate parallel plans in some cases where that would
previously have been unacceptable. But I'm not in a hurry to rethink
where we've drawn the line there for 9.6. Let's punt this issue for
now and come back to it in a future release.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Amit Kapila

amit.kapila16@gmail.com

almost 10 years ago

In reply to: Robert Haas (#9)

Re: Prepared Statement support for Parallel query

On Tue, Mar 15, 2016 at 12:21 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Mar 14, 2016 at 9:18 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Feb 26, 2016 at 4:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:

The failure cases fall into that category: wholePlanParallelSafe
will be false, but parallelModeNeeded will be true, which enables
parallel mode restrictions even though the plan won't contain a Gather node.
I think if we want to operate this way for testing purposes, then during
execution we need to ensure that statements for non-read-only operations
do not enter parallel mode, similar to what we do for the
non-zero tuple count case. The attached patch fixes the problem.

This seems like a really ugly fix. It might be possible to come up
with a fix along these lines, but I don't have much confidence in the
specific new test you've injected into ExecutePlan(). Broadly, any
change of this type implicitly changes the contract between
ExecutePlan() and the planner infrastructure - the planner can now
legally generate parallel plans in some cases where that would
previously have been unacceptable. But I'm not in a hurry to rethink
where we've drawn the line there for 9.6. Let's punt this issue for
now and come back to it in a future release.

No issues. I felt that it might be good to support parallel query via
prepared statements, as there is no fundamental problem with doing so, but as
you say, we can do that in a future release as well.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com