Explain [Analyze] produces parallel scan for select Into table statements.
Hi All,
Explain [Analyze] Select Into table..... produces the plan which uses
parallel scans.
*Test:*
create table table1 (n int);
insert into table1 values (generate_series(1,5000000));
analyze table1;
set parallel_tuple_cost=0;
set max_parallel_degree=3;
postgres=# explain select into table2 from table1;
QUERY PLAN
-------------------------------------------------------------------------------
Gather (cost=1000.00..39253.03 rows=5000000 width=0)
Number of Workers: 3
-> Parallel Seq Scan on table1 (cost=0.00..38253.03 rows=1612903
width=0)
(3 rows)
-----------------------------
*So Explain Analyze Fails.*
postgres=# explain analyze select into table2 from table1;
ERROR: cannot insert tuples during a parallel operation
STATEMENT: explain analyze select into table2 from table1;
*But actual execution is successful.*
postgres=# select into table2 from table1;
SELECT 5000000
Reason is in ExplainOneQuery we unconditionally
pass CURSOR_OPT_PARALLEL_OK to pg_plan_query even if query might be from
CreateTableAs/ SelectInto. Whereas in *ExecCreateTableAs *it is always 0*.*
*Possible Fix:*
I tried to make a patch to fix this. Now in ExplainOneQuery if into clause
is
defined then parallel plans are disabled as similar to their execution.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Attachments:
Analyze_select_into_disable_parallel_scan.patchapplication/octet-stream; name=Analyze_select_into_disable_parallel_scan.patchDownload
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index ee13136..d589b0d 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -346,8 +346,12 @@ ExplainOneQuery(Query *query, IntoClause *into, ExplainState *es,
INSTR_TIME_SET_CURRENT(planstart);
- /* plan the query */
- plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, params);
+ /*
+ * plan the query.
+ * Note: If Explain is for CreateTableAs / SelectInto Avoid parallel
+ * plans.
+ */
+ plan = pg_plan_query(query, into ? 0:CURSOR_OPT_PARALLEL_OK, params);
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
On Wed, Mar 9, 2016 at 8:18 PM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:
Hi All,
Explain [Analyze] Select Into table..... produces the plan which uses
parallel scans.
Possible Fix:
I tried to make a patch to fix this. Now in ExplainOneQuery if into
clause is
defined then parallel plans are disabled as similar to their execution.
- /* plan the query */
- plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, params);
+ /*
+ * plan the query.
+ * Note: If Explain is for CreateTableAs / SelectInto Avoid parallel
+ * plans.
+ */
+ plan = pg_plan_query(query, into ? 0:CURSOR_OPT_PARALLEL_OK, params);
There should be a white space between 0:CURSOR_OPT_PARALLEL_OK. Also I
don't see this comment is required as similar other usage doesn't have any
such comment. Fixed these two points in the attached patch.
In general, the patch looks good to me and solves the problem mentioned. I
have ran the regression tests with force_parallel_mode and doesn't see any
problem.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachments:
Analyze_select_into_disable_parallel_scan_v2.patchapplication/octet-stream; name=Analyze_select_into_disable_parallel_scan_v2.patchDownload
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index ee13136..9cd3127 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -347,7 +347,7 @@ ExplainOneQuery(Query *query, IntoClause *into, ExplainState *es,
INSTR_TIME_SET_CURRENT(planstart);
/* plan the query */
- plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, params);
+ plan = pg_plan_query(query, into ? 0 : CURSOR_OPT_PARALLEL_OK, params);
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
On Thu, Mar 10, 2016 at 4:43 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
There should be a white space between 0:CURSOR_OPT_PARALLEL_OK. Also I
don't see this comment is required as similar other usage doesn't have any
such comment. Fixed these two points in the attached patch.In general, the patch looks good to me and solves the problem mentioned. I
have ran the regression tests with force_parallel_mode and doesn't see any
problem.
I guess there must not be an occurrence of this pattern in the
regression tests, or previous force_parallel_mode testing would have
found this problem. Perhaps this patch should add one?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Mar 10, 2016 at 9:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:
I guess there must not be an occurrence of this pattern in the
regression tests, or previous force_parallel_mode testing would have
found this problem. Perhaps this patch should add one?
I have added the test to select_into.sql. Added Explain select into
statement.
Explain Analyze produces planning time and execution time even with TIMING
OFF
so not adding the same to regress tests.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Attachments:
regress_test.patchtext/x-patch; charset=US-ASCII; name=regress_test.patchDownload
diff --git a/src/test/regress/expected/select_into.out b/src/test/regress/expected/select_into.out
index 9d3f047..bb71260 100644
--- a/src/test/regress/expected/select_into.out
+++ b/src/test/regress/expected/select_into.out
@@ -94,3 +94,16 @@ INSERT INTO b SELECT 1 INTO f;
ERROR: SELECT ... INTO is not allowed here
LINE 1: INSERT INTO b SELECT 1 INTO f;
^
+--
+-- EXPLAIN [ANALYZE] SELECT INTO should not use parallel scan.
+--
+CREATE TABLE mt1 (n INT);
+INSERT INTO mt1 VALUES (GENERATE_SERIES(1,5000000));
+ANALYZE mt1;
+EXPLAIN (COSTS OFF) SELECT INTO mt2 FROM mt1;
+ QUERY PLAN
+-----------------
+ Seq Scan on mt1
+(1 row)
+
+DROP TABLE mt1;
diff --git a/src/test/regress/sql/select_into.sql b/src/test/regress/sql/select_into.sql
index 4d1cc86..6eb5e24 100644
--- a/src/test/regress/sql/select_into.sql
+++ b/src/test/regress/sql/select_into.sql
@@ -76,3 +76,13 @@ COPY (SELECT 1 INTO frak UNION SELECT 2) TO 'blob';
SELECT * FROM (SELECT 1 INTO f) bar;
CREATE VIEW foo AS SELECT 1 INTO b;
INSERT INTO b SELECT 1 INTO f;
+
+--
+-- EXPLAIN [ANALYZE] SELECT INTO should not use parallel scan.
+--
+CREATE TABLE mt1 (n INT);
+INSERT INTO mt1 VALUES (GENERATE_SERIES(1,5000000));
+ANALYZE mt1;
+
+EXPLAIN (COSTS OFF) SELECT INTO mt2 FROM mt1;
+DROP TABLE mt1;
On Fri, Mar 11, 2016 at 3:34 PM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:
On Thu, Mar 10, 2016 at 9:39 PM, Robert Haas <robertmhaas@gmail.com>
wrote:
I guess there must not be an occurrence of this pattern in the
regression tests, or previous force_parallel_mode testing would have
found this problem. Perhaps this patch should add one?I have added the test to select_into.sql. Added Explain select into
statement.
I don't see how this test will fail with force_parallel_mode=regress and
max_parallel_degree > 0 even without the patch proposed to fix the issue in
hand. In short, I don't think this test would have caught the issue, so I
don't see much advantage in adding such a test. Even if we want to add
such a test case, I think as proposed this will substantially increase the
timing for "Select Into" test which might not be an acceptable test case
addition especially for testing one corner case.
Explain Analyze produces planning time and execution time even with
TIMING OFF
so not adding the same to regress tests.
Yeah, that makes the addition of test for this functionality difficult.
Robert, do you have any idea what kind of test would have caught this issue?
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Sat, Mar 12, 2016 at 12:28 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote
I don't see how this test will fail with force_parallel_mode=regress and
max_parallel_degree > 0 even without the patch proposed to fix the issue in
hand. In short, I don't think this test would have caught the issue, so I
don't see much advantage in adding such a test. Even if we want to add
such a >test case, I think as proposed this will substantially increase the
timing for "Select Into" test which might not be an acceptable test case
addition >especially for testing one corner case.
Without above patch the make installcheck fails for select_into.sql with
below diff
when
force_parallel_mode = on
max_parallel_degree = 3
diff results/select_into.out expected/select_into.out
104,110c104,107
< QUERY PLAN
< ------------------------
< Gather
< Number of Workers: 1
< Single Copy: true
< -> Seq Scan on mt1
< (4 rows)
---
QUERY PLAN
-----------------
Seq Scan on mt1
(1 row)
Again with postgresql.conf non default settings.
force_parallel_mode = on
max_parallel_degree = 3
parallel_tuple_cost = 0
[mithun@localhost regress]$ diff results/select_into.out
expected/select_into.out
104,109c104,107
< QUERY PLAN
< --------------------------------
< Gather
< Number of Workers: 3
< -> Parallel Seq Scan on mt1
< (3 rows)
---
QUERY PLAN
-----------------
Seq Scan on mt1
(1 row)
To reduce the time of execution I can set the generate_series parameter to
500, which is fast in my machine and also fails with above diff but this
time only one worker is assigned as per plan.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
On Sat, Mar 12, 2016 at 2:02 PM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:
On Sat, Mar 12, 2016 at 12:28 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote
I don't see how this test will fail with force_parallel_mode=regress and
max_parallel_degree > 0 even without the patch proposed to fix the issue in
hand. In short, I don't think this test would have caught the issue, so I
don't see much advantage in adding such a test. Even if we want to add
such a >test case, I think as proposed this will substantially increase the
timing for "Select Into" test which might not be an acceptable test case
addition >especially for testing one corner case.
Without above patch the make installcheck fails for select_into.sql with
below diff
when
force_parallel_mode = on
max_parallel_degree = 3
With force_parallel_mode=on, I could see many other failures as well. I
think it is better to have test, which tests this functionality with
force_parallel_mode=regress
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Sat, Mar 12, 2016 at 2:32 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
With force_parallel_mode=on, I could see many other failures as well. I
think it is better to have test, which tests this functionality with
force_parallel_mode=regress
as per user manual.
Setting this value to regress has all of the same effects as setting it to
on plus some additional effect that are intended to facilitate automated
regression testing. Normally, messages from a parallel worker are prefixed
with a context line, but a setting of regress suppresses this to guarantee
reproducible results. *Also, the Gather nodes added to plans by this
setting are hidden from the EXPLAIN output so that the output matches what
would be obtained if this setting were turned off. *
And my test is for EXPLAIN statements. I think under regress mode it will
never fail even if parallel scan is used as per above statement.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
On Sat, Mar 12, 2016 at 2:32 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
With force_parallel_mode=on, I could see many other failures as well. I
think it is better to have test, which tests this functionality with
force_parallel_mode=regress
as per user manual.
Setting this value to regress has all of the same effects as setting it to
on plus some additional effect that are intended to facilitate automated
regression testing. Normally, messages from a parallel worker are prefixed
with a context line, but a setting of regress suppresses this to guarantee
reproducible results. *Also, the Gather nodes added to plans by this
setting are hidden from the EXPLAIN output so that the output matches what
would be obtained if this setting were turned off. *
And my test is for EXPLAIN statements. I think under regress mode it will
never fail even if parallel scan is used as per above statement.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
On Sat, Mar 12, 2016 at 2:32 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
With force_parallel_mode=on, I could see many other failures as well. I
think it is better to have test, which tests this functionality with
force_parallel_mode=regress
as per user manual.
Setting this value to regress has all of the same effects as setting it to
on plus some additional effect that are intended to facilitate automated
regression testing. Normally, messages from a parallel worker are prefixed
with a context line, but a setting of regress suppresses this to guarantee
reproducible results. *Also, the Gather nodes added to plans by this
setting are hidden from the EXPLAIN output so that the output matches what
would be obtained if this setting were turned off. *
And my test is for EXPLAIN statements. I think under regress mode it will
never fail even if parallel scan is used as per above statement.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Sorry there was some issue with my mail settings same mail got set more
than once.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
On Sat, Mar 12, 2016 at 7:11 PM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:
On Sat, Mar 12, 2016 at 2:32 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
With force_parallel_mode=on, I could see many other failures as well. I
think it is better to have test, which tests this functionality with
force_parallel_mode=regress
as per user manual.
Setting this value to regress has all of the same effects as setting it
to on plus some additional effect that are intended to facilitate automated
regression testing.
Yes, that is the only reason I mentioned that it better to have a test
which can be checked in automated way and I understand that the way you
have written test using Explain won't work in automated way, so not sure if
it is good idea to add such a test.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Sat, Mar 12, 2016 at 1:58 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
Yeah, that makes the addition of test for this functionality difficult.
Robert, do you have any idea what kind of test would have caught this issue?
Yep. Committed with that test:
DO $$
BEGIN
EXECUTE 'EXPLAIN ANALYZE SELECT * INTO TABLE easi FROM int8_tbl';
END$$;
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Mar 15, 2016 at 5:22 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sat, Mar 12, 2016 at 1:58 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
Yeah, that makes the addition of test for this functionality difficult.
Robert, do you have any idea what kind of test would have caught this
issue?
Yep. Committed with that test:
Nice. Thanks!
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com