how does PostgreSQL determine how many parallel processes to start

Started by Luca Ferrari about 5 years ago, 4 messages, general
#1 Luca Ferrari
fluca1978@gmail.com

Hi all,
I know that the number of parallel processes can be limited by
max_parallel_workers_per_gather and max_parallel_workers, and that
a parallel plan is only considered when the relation is at least
min_parallel_table_scan_size (or min_parallel_index_scan_size for an
index). But I would like to understand: once a table has been
considered for a parallel plan, and there is room for more workers,
how does PostgreSQL decide to start another process?

Thanks,
Luca

#2 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Luca Ferrari (#1)
Re: how does PostgreSQL determine how many parallel processes to start

On Fri, 2021-02-19 at 10:38 +0100, Luca Ferrari wrote:

> I know that the number of parallel processes can be limited by
> max_parallel_workers_per_gather and max_parallel_workers, and that
> a parallel plan is only considered when the relation is at least
> min_parallel_table_scan_size (or min_parallel_index_scan_size for an
> index). But I would like to understand: once a table has been
> considered for a parallel plan, and there is room for more workers,
> how does PostgreSQL decide to start another process?

During planning, it will generate parallel and non-parallel plans
and take the one it estimates to be cheapest.

At execution time, PostgreSQL will use as many of the planned workers
as are currently available (max_parallel_workers).

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
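
A rough sketch of the planning side, for illustration only. It assumes the heuristic in the planner's compute_parallel_worker() function (one worker once the table reaches min_parallel_table_scan_size, one more each time the table size triples, capped by max_parallel_workers_per_gather). The Python function name and its parameters are hypothetical, the defaults assume 8 kB pages (8 MB = 1024 pages), and the per-table parallel_workers storage parameter is ignored.

```python
def planned_workers(table_pages, min_scan_pages=1024, max_per_gather=2):
    """Estimate how many workers the planner would request for a table scan.

    Simplified model (an assumption, not PostgreSQL source code):
    - below min_parallel_table_scan_size, no parallel scan is considered;
    - otherwise start with 1 worker and add 1 each time the size triples;
    - never exceed max_parallel_workers_per_gather.
    """
    if table_pages < min_scan_pages:
        return 0
    workers = 1
    threshold = max(min_scan_pages, 1)
    while table_pages >= threshold * 3:
        workers += 1
        threshold *= 3
    return min(workers, max_per_gather)


if __name__ == "__main__":
    # Table sizes in 8 kB pages: 8 MB, 24 MB, 72 MB, 216 MB
    for pages in (1024, 3072, 9216, 27648):
        print(pages, planned_workers(pages, max_per_gather=8))
```

Under these assumptions the requested worker count grows only logarithmically with table size, so even a very large table asks for a handful of workers before the per-gather cap applies.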

#3 Luca Ferrari
fluca1978@gmail.com
In reply to: Laurenz Albe (#2)
Re: how does PostgreSQL determine how many parallel processes to start

On Fri, Feb 19, 2021 at 10:43 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

> At execution time, PostgreSQL will use as many of the planned workers
> as are currently available (max_parallel_workers).

Thanks, but just to make it clear: assuming I execute two identical,
parallelizable queries almost simultaneously, does that mean that the
first one will be executed with the maximum available parallel
capacity and the second will "starve" on parallelism and be executed
sequentially? Is this correct?
As a consequence, this could also mean that a query over a
small table could take more advantage (in the parallel sense) than a
scan on a larger table that was issued just a moment later (assuming
both tables can be scanned in parallel), right?

Luca

#4 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Luca Ferrari (#3)
Re: how does PostgreSQL determine how many parallel processes to start

On Fri, 2021-02-19 at 11:21 +0100, Luca Ferrari wrote:

>> At execution time, PostgreSQL will use as many of the planned workers
>> as are currently available (max_parallel_workers).
>
> Thanks, but just to make it clear: assuming I execute two identical,
> parallelizable queries almost simultaneously, does that mean that the
> first one will be executed with the maximum available parallel
> capacity and the second will "starve" on parallelism and be executed
> sequentially? Is this correct?
> As a consequence, this could also mean that a query over a
> small table could take more advantage (in the parallel sense) than a
> scan on a larger table that was issued just a moment later (assuming
> both tables can be scanned in parallel), right?

Precisely. That is why you have "max_parallel_workers_per_gather"
to limit the number of parallel workers available to a single query.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
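
The scenario discussed in this thread can be sketched as a small model. This is a simplified illustration, not PostgreSQL code: the WorkerPool class and the run_two_queries helper are hypothetical names, each query is assumed to have a single Gather node, and the additional limit imposed by max_worker_processes is ignored.

```python
class WorkerPool:
    """Toy model of the cluster-wide budget set by max_parallel_workers."""

    def __init__(self, max_parallel_workers):
        self.capacity = max_parallel_workers
        self.in_use = 0

    def launch(self, planned):
        # A query gets as many of its planned workers as are still free;
        # it never waits for workers, it simply runs with fewer (or none).
        granted = min(planned, self.capacity - self.in_use)
        self.in_use += granted
        return granted

    def release(self, count):
        self.in_use -= count


def run_two_queries(max_parallel_workers, max_per_gather, wanted):
    """Return the workers granted to two queries started back to back."""
    pool = WorkerPool(max_parallel_workers)
    planned = min(wanted, max_per_gather)  # cap applied at planning time
    first = pool.launch(planned)
    second = pool.launch(planned)  # the first query is still running
    return first, second


if __name__ == "__main__":
    # High per-gather limit: the first query takes the whole budget.
    print(run_two_queries(max_parallel_workers=2, max_per_gather=2, wanted=2))
    # Lower per-gather limit: both queries get one worker each.
    print(run_two_queries(max_parallel_workers=2, max_per_gather=1, wanted=2))
```

In this model the first call yields (2, 0), so the second query runs with the leader process only, while lowering the per-gather cap to 1 yields (1, 1).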