parallel explain analyze support not exercised

Started by Andres Freund, almost 9 years ago (9 messages)

andres@anarazel.de

almost 9 years ago

Hi,

As visible in [1], the explain analyze codepaths of parallel query aren't
exercised in the tests. That used to be not entirely trivial if the
output was to be displayed (due to timing), but we should be able to do
that now that we have the SUMMARY option.

E.g.
SET max_parallel_workers = 0;
EXPLAIN (analyze, timing off, summary off, costs off) SELECT * FROM blarg2 WHERE generate_series < 0;
┌───────────────────────────────────────────────────────────┐
│                        QUERY PLAN                         │
├───────────────────────────────────────────────────────────┤
│ Gather (actual rows=0 loops=1)                            │
│   Workers Planned: 10                                     │
│   Workers Launched: 0                                     │
│   ->  Parallel Seq Scan on blarg2 (actual rows=0 loops=1) │
│         Filter: (generate_series < 0)                     │
│         Rows Removed by Filter: 10000000                  │
└───────────────────────────────────────────────────────────┘

should be reproducible. I'd suggest additionally adding one test that
throws the EXPLAIN output away, but actually enables parallelism.

Greetings,

Andres Freund

[1]: https://coverage.postgresql.org/src/backend/executor/execParallel.c.gcov.html
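
A minimal sketch of the second suggestion above (a test that runs EXPLAIN ANALYZE with parallelism enabled but throws the output away), assuming the regression database's tenk1 table. The GUC values and the DO-block approach are illustrative assumptions, not taken from any patch in this thread:

```sql
-- Assumed settings to make a parallel plan attractive for a small table;
-- the specific values are illustrative only.
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET min_parallel_table_scan_size = 0;
SET max_parallel_workers_per_gather = 4;

DO $$
DECLARE
    plan_line text;
BEGIN
    -- Execute the instrumented plan, but never print its text, so the
    -- expected output cannot depend on how many workers were launched.
    FOR plan_line IN
        EXECUTE 'EXPLAIN (ANALYZE, TIMING OFF, SUMMARY OFF, COSTS OFF) SELECT * FROM tenk1'
    LOOP
        NULL;  -- discard each line of plan output
    END LOOP;
END
$$;
```

Since no plan text reaches the expected-output file, the result should be the same whether four workers start or none do.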

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Rafia Sabih

rafia.sabih@enterprisedb.com

almost 9 years ago

In reply to: Andres Freund (#1)

1 attachment(s)

Re: parallel explain analyze support not exercised

On Sat, Apr 1, 2017 at 12:25 AM, Andres Freund <andres@anarazel.de> wrote:

> Hi,
>
> As visible in [1], the explain analyze codepaths of parallel query aren't
> exercised in the tests. That used to be not entirely trivial if the
> output was to be displayed (due to timing), but we should be able to do
> that now that we have the SUMMARY option.
>
> E.g.
> SET max_parallel_workers = 0;
> EXPLAIN (analyze, timing off, summary off, costs off) SELECT * FROM blarg2 WHERE generate_series < 0;
> ┌───────────────────────────────────────────────────────────┐
> │                        QUERY PLAN                         │
> ├───────────────────────────────────────────────────────────┤
> │ Gather (actual rows=0 loops=1)                            │
> │   Workers Planned: 10                                     │
> │   Workers Launched: 0                                     │
> │   ->  Parallel Seq Scan on blarg2 (actual rows=0 loops=1) │
> │         Filter: (generate_series < 0)                     │
> │         Rows Removed by Filter: 10000000                  │
> └───────────────────────────────────────────────────────────┘
>
> should be reproducible. I'd suggest additionally adding one test that
> throws the EXPLAIN output away, but actually enables parallelism.
>
> Greetings,
>
> Andres Freund
>
> [1] https://coverage.postgresql.org/src/backend/executor/execParallel.c.gcov.html

Please find the attached for the same.

--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/

Attachments:

code_coverage.patch (application/octet-stream)

diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 038a62efd7..c131214c10 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -252,6 +252,16 @@ explain (costs off)
          Index Cond: (unique1 = 1)
 (5 rows)
 
+-- to increase the parallel query test coverage
+EXPLAIN (analyze, timing off, summary off, costs off) SELECT * FROM tenk1;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Gather (actual rows=10000 loops=1)
+   Workers Planned: 4
+   Workers Launched: 4
+   ->  Parallel Seq Scan on tenk1 (actual rows=2000 loops=5)
+(4 rows)
+
 -- provoke error in worker
 select stringu1::int2 from tenk1 where unique1 = 1;
 ERROR:  invalid input syntax for integer: "BAAAAA"
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 9311a775af..e8b7d4f386 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -100,6 +100,9 @@ set force_parallel_mode=1;
 explain (costs off)
   select stringu1::int2 from tenk1 where unique1 = 1;
 
+-- to increase the parallel query test coverage
+EXPLAIN (analyze, timing off, summary off, costs off) SELECT * FROM tenk1;
+
 -- provoke error in worker
 select stringu1::int2 from tenk1 where unique1 = 1;

Andres Freund

andres@anarazel.de

almost 9 years ago

In reply to: Rafia Sabih (#2)

Re: parallel explain analyze support not exercised

On 2017-04-03 10:26:27 +0530, Rafia Sabih wrote:

> On Sat, Apr 1, 2017 at 12:25 AM, Andres Freund <andres@anarazel.de> wrote:
> > Hi,
> >
> > As visible in [1], the explain analyze codepaths of parallel query aren't
> > exercised in the tests. That used to be not entirely trivial if the
> > output was to be displayed (due to timing), but we should be able to do
> > that now that we have the SUMMARY option.
> >
> > E.g.
> > SET max_parallel_workers = 0;
> > EXPLAIN (analyze, timing off, summary off, costs off) SELECT * FROM blarg2 WHERE generate_series < 0;
> > ┌───────────────────────────────────────────────────────────┐
> > │                        QUERY PLAN                         │
> > ├───────────────────────────────────────────────────────────┤
> > │ Gather (actual rows=0 loops=1)                            │
> > │   Workers Planned: 10                                     │
> > │   Workers Launched: 0                                     │
> > │   ->  Parallel Seq Scan on blarg2 (actual rows=0 loops=1) │
> > │         Filter: (generate_series < 0)                     │
> > │         Rows Removed by Filter: 10000000                  │
> > └───────────────────────────────────────────────────────────┘
> >
> > should be reproducible. I'd suggest additionally adding one test that
> > throws the EXPLAIN output away, but actually enables parallelism.
> >
> > Greetings,
> >
> > Andres Freund
> >
> > [1] https://coverage.postgresql.org/src/backend/executor/execParallel.c.gcov.html
>
> Please find the attached for the same.
>
> +-- to increase the parallel query test coverage
> +EXPLAIN (analyze, timing off, summary off, costs off) SELECT * FROM tenk1;
> +                         QUERY PLAN
> +-------------------------------------------------------------
> + Gather (actual rows=10000 loops=1)
> +   Workers Planned: 4
> +   Workers Launched: 4
> +   ->  Parallel Seq Scan on tenk1 (actual rows=2000 loops=5)
> +(4 rows)

Is there an issue that we might not actually be able to start all four
workers? Serious question, not rhetorical.
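
For reference, the arithmetic behind that concern, as a sketch. EXPLAIN ANALYZE reports rows as a per-loop average, and the query below assumes that each launched worker plus the leader accounts for one loop of the Parallel Seq Scan. The 10000 figure is the Gather row count from the expected output above; the column names are made up for illustration:

```sql
SELECT launched                        AS workers_launched,
       launched + 1                    AS loops,          -- workers plus the leader (assumed)
       round(10000.0 / (launched + 1)) AS rows_per_loop   -- per-loop average, as EXPLAIN prints it
FROM generate_series(0, 4) AS launched
ORDER BY launched;
```

Only the row for four launched workers (loops=5, rows=2000) matches the expected output; any smaller number would change both the "Workers Launched" line and the scan line.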

Regards,

Andres

Robert Haas

robertmhaas@gmail.com

almost 9 years ago

In reply to: Andres Freund (#3)

Re: parallel explain analyze support not exercised

On Mon, Apr 3, 2017 at 1:53 PM, Andres Freund <andres@anarazel.de> wrote:

> > Please find the attached for the same.
> >
> > +-- to increase the parallel query test coverage
> > +EXPLAIN (analyze, timing off, summary off, costs off) SELECT * FROM tenk1;
> > +                         QUERY PLAN
> > +-------------------------------------------------------------
> > + Gather (actual rows=10000 loops=1)
> > +   Workers Planned: 4
> > +   Workers Launched: 4
> > +   ->  Parallel Seq Scan on tenk1 (actual rows=2000 loops=5)
> > +(4 rows)
>
> Is there an issue that we might not actually be able to start all four
> workers? Serious question, not rhetorical.

If this is 'make check', then we should have 8 parallel workers
allowed, so if we only do one of these at a time, then I think we're
OK. But if somebody changes that configuration setting or if it's
'make installcheck', then the configuration could be anything.
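
One way to see what a given installation allows before running 'make installcheck', as a sketch (which settings are worth listing here is an assumption):

```sql
-- List the limits that bound how many parallel workers a Gather can get.
SELECT name, setting
FROM pg_settings
WHERE name IN ('max_worker_processes',
               'max_parallel_workers',
               'max_parallel_workers_per_gather')
ORDER BY name;
```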

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Andres Freund

andres@anarazel.de

almost 9 years ago

In reply to: Robert Haas (#4)

Re: parallel explain analyze support not exercised

On 2017-04-03 15:13:13 -0400, Robert Haas wrote:

> On Mon, Apr 3, 2017 at 1:53 PM, Andres Freund <andres@anarazel.de> wrote:
> > > Please find the attached for the same.
> > >
> > > +-- to increase the parallel query test coverage
> > > +EXPLAIN (analyze, timing off, summary off, costs off) SELECT * FROM tenk1;
> > > +                         QUERY PLAN
> > > +-------------------------------------------------------------
> > > + Gather (actual rows=10000 loops=1)
> > > +   Workers Planned: 4
> > > +   Workers Launched: 4
> > > +   ->  Parallel Seq Scan on tenk1 (actual rows=2000 loops=5)
> > > +(4 rows)
> >
> > Is there an issue that we might not actually be able to start all four
> > workers? Serious question, not rhetorical.
>
> If this is 'make check', then we should have 8 parallel workers
> allowed, so if we only do one of these at a time, then I think we're
> OK. But if somebody changes that configuration setting or if it's
> 'make installcheck', then the configuration could be anything.

Hm - we already rely on max_parallel_workers_per_gather being set with
some of the explains in the test. So I guess we're ok also relying on
actual workers being present?

- Andres

Robert Haas

robertmhaas@gmail.com

almost 9 years ago

In reply to: Andres Freund (#5)

Re: parallel explain analyze support not exercised

On Mon, Apr 3, 2017 at 3:31 PM, Andres Freund <andres@anarazel.de> wrote:

> > If this is 'make check', then we should have 8 parallel workers
> > allowed, so if we only do one of these at a time, then I think we're
> > OK. But if somebody changes that configuration setting or if it's
> > 'make installcheck', then the configuration could be anything.
>
> Hm - we already rely on max_parallel_workers_per_gather being set with
> some of the explains in the test. So I guess we're ok also relying on
> actual workers being present?

I'm not really sure about that one way or the other. Our policy on
which configurations are supported vis-a-vis 'make installcheck' seems
to be, essentially, that if a sufficiently-prominent community member
cares about it, then it ends up getting made to work, unless an
even-more-prominent community member objects. That's why, for
example, our regression tests pass in Czech. I can't begin to guess
whether breaking installcheck against configurations with low values
of max_parallel_workers or max_worker_processes will bother anybody.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Andres Freund

andres@anarazel.de

almost 9 years ago

In reply to: Robert Haas (#6)

Re: parallel explain analyze support not exercised

Hi,

On 2017-04-03 17:11:33 -0400, Robert Haas wrote:

> On Mon, Apr 3, 2017 at 3:31 PM, Andres Freund <andres@anarazel.de> wrote:
> > > If this is 'make check', then we should have 8 parallel workers
> > > allowed, so if we only do one of these at a time, then I think we're
> > > OK. But if somebody changes that configuration setting or if it's
> > > 'make installcheck', then the configuration could be anything.
> >
> > Hm - we already rely on max_parallel_workers_per_gather being set with
> > some of the explains in the test. So I guess we're ok also relying on
> > actual workers being present?
>
> I'm not really sure about that one way or the other. Our policy on
> which configurations are supported vis-a-vis 'make installcheck' seems
> to be, essentially, that if a sufficiently-prominent community member
> cares about it, then it ends up getting made to work, unless an
> even-more-prominent community member objects. That's why, for
> example, our regression tests pass in Czech. I can't begin to guess
> whether breaking installcheck against configurations with low values
> of max_parallel_workers or max_worker_processes will bother anybody.

I guess we'll have to see. My personal conclusion is that greater
coverage of parallelism is worth some very minor config trouble for
people doing installcheck against clusters with non-default config.

Thanks Rafia!

- Andres

Tom Lane

tgl@sss.pgh.pa.us

almost 9 years ago

In reply to: Andres Freund (#7)

Re: parallel explain analyze support not exercised

Andres Freund <andres@anarazel.de> writes:

> I guess we'll have to see. My personal conclusion is that greater
> coverage of parallelism is worth some very minor config trouble for
> people doing installcheck against clusters with non-default config.

The buildfarm seems entirely unwilling to play along.

regards, tom lane

Andres Freund

andres@anarazel.de

almost 9 years ago

In reply to: Tom Lane (#8)

Re: parallel explain analyze support not exercised

On 2017-04-06 17:33:49 -0400, Tom Lane wrote:

> Andres Freund <andres@anarazel.de> writes:
> > I guess we'll have to see. My personal conclusion is that greater
> > coverage of parallelism is worth some very minor config trouble for
> > people doing installcheck against clusters with non-default config.
>
> The buildfarm seems entirely unwilling to play along.

That was the parallel bitmap scan test, not this (I think). I've
already pushed a fix which should address the issue - it at least does
so locally. Both Dilip and I had apparently forgotten that we disallow
setting effective_io_concurrency on platforms without USE_PREFETCH
support.

- Andres
