Number of parallel workers chosen by the optimizer for parallel append

Started by Laurenz Albe over 5 years ago · 3 messages · general
#1 Laurenz Albe
laurenz.albe@cybertec.at

I have a partitioned table, each partition has "parallel_workers = 10" set.

SET max_parallel_workers_per_gather = 8;

SET enable_partitionwise_aggregate = on;

EXPLAIN (COSTS OFF)
SELECT applicant_name, count(ipc_4)
FROM laurenz.z_flat
GROUP BY applicant_name;

                    QUERY PLAN
--------------------------------------------------
 Gather
   Workers Planned: 4
   ->  Parallel Append
         ->  HashAggregate
               Group Key: z_flat_3.applicant_name
               ->  Seq Scan on xyz_4 z_flat_3
         ->  HashAggregate
               Group Key: z_flat.applicant_name
               ->  Seq Scan on xyz_1 z_flat
         [8 more such partition scans]
(33 rows)

How does the optimizer decide to use 4 parallel workers?

No matter what I try, I cannot influence that number.

Yours,
Laurenz Albe

#2 Michael Lewis
mlewis@entrata.com
In reply to: Laurenz Albe (#1)
Re: Number of parallel workers chosen by the optimizer for parallel append

What have you tried? Changing the relevant cost parameters I assume?
Nothing else going on that may be taking up those workers, right?

https://www.postgresql.org/docs/current/runtime-config-query.html#GUC-PARALLEL-SETUP-COST

#3 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Laurenz Albe (#1)
Re: Number of parallel workers chosen by the optimizer for parallel append

On Wed, 2020-11-25 at 17:36 +0100, Laurenz Albe wrote:

> I have a partitioned table, each partition has "parallel_workers = 10" set.
>
> SET max_parallel_workers_per_gather = 8;
>
> SET enable_partitionwise_aggregate = on;
>
> EXPLAIN (COSTS OFF)
> SELECT applicant_name, count(ipc_4)
> FROM laurenz.z_flat
> GROUP BY applicant_name;
>
>                     QUERY PLAN
> --------------------------------------------------
>  Gather
>    Workers Planned: 4
>    ->  Parallel Append
>          ->  HashAggregate
>                Group Key: z_flat_3.applicant_name
>                ->  Seq Scan on xyz_4 z_flat_3
>          ->  HashAggregate
>                Group Key: z_flat.applicant_name
>                ->  Seq Scan on xyz_1 z_flat
>          [8 more such partition scans]
> (33 rows)
>
> How does the optimizer decide to use 4 parallel workers?
>
> No matter what I try, I cannot influence that number.

I figured it out.

This is automatically calculated from the number of partitions, and the
number of parallel workers is

ld(#partitions) + 1

where "ld" is the base-2 logarithm (computed by the function "fls" in the source).
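
As a quick sanity check of the formula above, here is a minimal Python sketch. It assumes the logarithm is rounded down to an integer; the function name workers_for_partitions is made up for illustration and is not anything from the PostgreSQL source.

```python
def workers_for_partitions(n_partitions: int) -> int:
    """Return floor(log2(n)) + 1, the formula quoted above.

    Rounding the logarithm down is an assumption of this sketch.
    """
    if n_partitions < 1:
        raise ValueError("need at least one partition")
    # For a positive integer, bit_length() - 1 equals floor(log2(n)),
    # computed exactly without floating-point rounding.
    return (n_partitions.bit_length() - 1) + 1


if __name__ == "__main__":
    for n in (1, 2, 4, 10, 16, 100):
        print(n, "partitions ->", workers_for_partitions(n), "workers")
```

With the 10 partitions from the plan this gives 4, which matches "Workers Planned: 4".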

It might be nice to make this configurable, but since we don't have
storage parameters on partitioned tables, I wonder how.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com