how does PostgreSQL determine how many parallel processes to start

Started by Luca Ferrari, about 5 years ago · 4 messages · general

fluca1978@gmail.com

about 5 years ago

Hi all,
I know that parallel processes can be limited by
max_parallel_workers_per_gather and max_parallel_workers, and that
a parallel plan is only considered for relations larger than
min_parallel_table_scan_size (or min_parallel_index_scan_size for
indexes). But I would like to understand: once a table has been
considered for a parallel plan, and there is room for other workers,
how will PostgreSQL decide to start another process?

Thanks,
Luca

Laurenz Albe

laurenz.albe@cybertec.at

about 5 years ago

In reply to: Luca Ferrari (#1)

Re: how does PostgreSQL determine how many parallel processes to start

On Fri, 2021-02-19 at 10:38 +0100, Luca Ferrari wrote:

> I know that parallel processes can be limited by
> max_parallel_workers_per_gather and max_parallel_workers, and that
> a parallel plan is only considered for relations larger than
> min_parallel_table_scan_size (or min_parallel_index_scan_size for
> indexes). But I would like to understand: once a table has been
> considered for a parallel plan, and there is room for other workers,
> how will PostgreSQL decide to start another process?

During planning, it will generate parallel and non-parallel plans
and take the one it estimates to be cheapest.
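
As a rough illustration of the planning side, the sketch below mirrors, in
simplified form, how the planner scales the worker count with table size for
a parallel sequential scan: one worker once the table reaches
min_parallel_table_scan_size, one more each time the table is three times
larger again, capped by max_parallel_workers_per_gather. The function name
planned_workers and its parameters are hypothetical. The defaults assume
8 kB pages (so 8MB is 1024 pages) and the stock
max_parallel_workers_per_gather of 2. The sketch ignores the parallel_workers
storage parameter and the cost comparison against the non-parallel plan.

```python
def planned_workers(heap_pages, min_scan_pages=1024, max_per_gather=2):
    """Estimate the planned worker count for a parallel seq scan (simplified).

    heap_pages      -- table size in pages (hypothetical input)
    min_scan_pages  -- min_parallel_table_scan_size expressed in pages
    max_per_gather  -- max_parallel_workers_per_gather
    """
    if heap_pages < min_scan_pages:
        return 0  # table too small: no parallel scan is considered

    threshold = max(min_scan_pages, 1)
    workers = 1
    # Add one worker each time the table is three times larger again.
    while heap_pages >= threshold * 3:
        workers += 1
        threshold *= 3

    # The per-Gather limit caps whatever the size heuristic suggested.
    return min(workers, max_per_gather)
```

Under these assumptions a 72MB table (9216 pages) would be planned with 3
workers if max_parallel_workers_per_gather allowed it, but with only 2 under
the default limit.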

At execution time, PostgreSQL will use as many of the planned workers
as are currently available (max_parallel_workers).

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

Luca Ferrari

fluca1978@gmail.com

about 5 years ago

In reply to: Laurenz Albe (#2)

Re: how does PostgreSQL determine how many parallel processes to start

On Fri, Feb 19, 2021 at 10:43 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

> At execution time, PostgreSQL will use as many of the planned workers
> as are currently available (max_parallel_workers).

Thanks, but just to make it clear: assuming I execute two identical
queries almost simultaneously, both of which can be parallelized,
does that mean that the first one will be executed with
the maximum available parallel capacity and the second will "starve" on
parallelism, being executed sequentially? Is this correct?
As a consequence, this could also mean that a query over a
small table could take more advantage (in the parallel sense) than a scan
on a larger table that was issued just a moment later (assuming both
tables can be scanned in parallel), right?

Luca

Laurenz Albe

laurenz.albe@cybertec.at

about 5 years ago

In reply to: Luca Ferrari (#3)

Re: how does PostgreSQL determine how many parallel processes to start

On Fri, 2021-02-19 at 11:21 +0100, Luca Ferrari wrote:

>> At execution time, PostgreSQL will use as many of the planned workers
>> as are currently available (max_parallel_workers).
>
> Thanks, but just to make it clear: assuming I execute two identical
> queries almost simultaneously, both of which can be parallelized,
> does that mean that the first one will be executed with
> the maximum available parallel capacity and the second will "starve" on
> parallelism, being executed sequentially? Is this correct?
> As a consequence, this could also mean that a query over a
> small table could take more advantage (in the parallel sense) than a scan
> on a larger table that was issued just a moment later (assuming both
> tables can be scanned in parallel), right?

Precisely. That is why you have "max_parallel_workers_per_gather"
to limit the number of parallel workers that a single Gather node
(and so, in most plans, a single query) can claim.
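
The contention described above can be sketched as a small simulation. It is
only a toy model, not PostgreSQL's actual executor code: the function names
launch and run_concurrently are hypothetical, queries are assumed to start
strictly one after another, and no worker is released while the others
start. The default of 8 matches the stock value of max_parallel_workers.

```python
def launch(planned, in_use, max_parallel_workers=8):
    """Return how many workers a Gather node obtains at execution time.

    planned -- workers chosen at plan time (already capped per Gather)
    in_use  -- parallel workers currently busy across the whole cluster
    """
    free = max(max_parallel_workers - in_use, 0)
    return min(planned, free)


def run_concurrently(planned_per_query, max_parallel_workers=8):
    """Start queries in arrival order; earlier queries claim workers first."""
    in_use = 0
    granted = []
    for planned in planned_per_query:
        got = launch(planned, in_use, max_parallel_workers)
        in_use += got
        granted.append(got)
    return granted
```

In this model, two queries each planned with 8 workers get
run_concurrently([8, 8]) == [8, 0]: the second one still runs, but its leader
process does all the work. With the plans capped at 4 workers each, both
queries get their share: run_concurrently([4, 4]) == [4, 4].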

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com